Replace global shading with classloader-realm isolation for plugins#18459

Open
gortiz wants to merge 20 commits into apache:master from gortiz:pinot-no-shading

Conversation


@gortiz gortiz commented May 11, 2026

Summary

Apache Pinot has all the ingredients of a clean plugin architecture — PluginManager with ClassRealm isolation, a pinot-plugin.properties activation file, a separate plugins/ directory in the distribution — but the build and runtime contradicted that design. This PR replaces the current "shade everywhere" strategy with classloader-realm-based isolation, keeping shading only where it has a real purpose (driver libraries and external-process connectors).

Problems addressed

  1. Plugin code was baked into the main service jar. pinot-distribution's shade pulled in plugin modules transitively via pinot-tools, so pinot-all-VERSION.jar contained all plugin classes with relocations applied.

  2. The startup script put every plugin jar onto the JVM classpath. appAssemblerScriptTemplate walked plugins/**/*.jar and appended them to CLASSPATH, so PluginManager's ClassRealm isolation still ran but was a no-op: the parent classloader already had the same classes.

  3. All plugin shading used the same prefix as the distribution (org.apache.pinot.shaded.*). Two plugins shading to the same prefix collided with each other; a plugin shading to the same prefix as the distribution had its own bytecode overridden by the distribution's classes.

  4. Extension-point discovery was fragmented. ServiceLoader.load(Iface), Class.forName(FQCN), and a reflection scan for @MetricsFactory all used the system classloader, so they were invisible to plugin realms.

What changed

Phase 1 — Discovery mechanisms made realm-aware

  • PluginManager.loadServices(Class<T>): new helper that walks all plugin realms and the boot classloader. Replaces 9 ServiceLoader.load(X.class) call sites across pinot-segment-spi, pinot-spi, pinot-query-runtime, pinot-core, pinot-common, pinot-broker, pinot-clients, and pinot-plugins/pinot-metrics.
  • AccessControlFactory: replaced Class.forName(FQCN) with PluginManager.get().loadClass(FQCN).
  • PinotMetricUtils: replaced @MetricsFactory reflection scan with PluginManager.get().loadServices(PinotMetricsFactory.class).
  • pinot-tools: depends on pinot-avro-base (library) instead of pinot-avro (plugin) so the Avro base classes are on the classpath without pulling in the full plugin.
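The loadServices migration in the first bullet can be sketched as follows. This is a minimal, self-contained model of the described semantics (walk a deterministic list of classloaders, deduplicate implementations by FQCN), not Pinot's actual PluginManager code, which also walks Plexus ClassRealms and legacy PluginClassLoaders:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.ServiceLoader;

// Minimal sketch of realm-aware discovery, assuming the semantics described
// above: consult every classloader in order and deduplicate by
// fully-qualified implementation class name (first loader wins).
final class RealmAwareDiscovery {
  static <T> List<T> loadServices(Class<T> iface, List<ClassLoader> loaders) {
    Map<String, T> byFqcn = new LinkedHashMap<>();
    for (ClassLoader cl : loaders) {
      for (T impl : ServiceLoader.load(iface, cl)) {
        byFqcn.putIfAbsent(impl.getClass().getName(), impl);
      }
    }
    return new ArrayList<>(byFqcn.values());
  }
}
```

A call site then passes the realm classloaders instead of relying on the single thread-context classloader that plain `ServiceLoader.load(iface)` consults.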

Phase 2 — Every plugin is a realm

  • Added pinot-plugin.properties to every plugin module under pinot-plugins/ and pinot-connectors/ that didn't already have one.
  • PluginManager: added realm-walk fallback for createInstance(FQCN), jar-resident properties discovery, and a strong-dep stop list.

Phase 3 — Plugin-zip layout replaces shaded fat-jars

  • Removed org.apache.pinot.shaded.* relocations from the global shade config in the root pom.xml.
  • Switched pinot-distribution from shaded plugin fat-jars to plugin-zip layout (classes + dep jars side-by-side, no relocation).
  • Hoisted shared Avro, JSON, Parquet, and Protobuf utilities into -base library modules so plugins don't have to bundle duplicates.
  • Simplified plugin-zip directory layout: dropped the intermediate plugin.type level.

Bug fixes (verified against the new distribution)

  • PluginManager.loadClassFromAnyPlugin: added catch (NoClassDefFoundError) in the realm walk — previously this error escaped the catch block and aborted the entire walk; now it is logged at DEBUG and the walk continues to the next classloader.
  • InputFormatCheck: fixed wrong FQCN for pinot-confluent-protobuf (removed spurious .confluent. sub-package).
  • pinot-hdfs: use ${hadoop.dependencies.scope} (compile by default) so hadoop-common is bundled in the plugin-zip assembly.
  • pinot-distribution: explicit shade <executions> block with hardcoded package phase — the ${shade.phase.prop} property was not overriding the inherited root-pom execution's phase.
  • pinot-thrift: remove explicit provided scope on libthrift so it is bundled in the plugin-zip.
  • Deleted pinot-plugin.properties from pinot-batch-ingestion-spark-3 and pinot-batch-ingestion-hadoop — these run inside external processes (Spark, Hadoop MR) and must not be loaded as Pinot realms.
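The NoClassDefFoundError fix in the first bullet amounts to catching the error per classloader instead of letting it abort the walk. A hedged sketch (class and method names here are illustrative, not Pinot's):

```java
import java.util.List;

// Sketch of the walk-continuation fix: a NoClassDefFoundError from one
// classloader no longer aborts the whole walk.
final class ClassWalk {
  static Class<?> loadFromAny(String fqcn, List<ClassLoader> loaders)
      throws ClassNotFoundException {
    for (ClassLoader cl : loaders) {
      try {
        return Class.forName(fqcn, false, cl);
      } catch (ClassNotFoundException e) {
        // this loader simply doesn't have the class; try the next one
      } catch (NoClassDefFoundError e) {
        // the class file is present here but a transitive dep is missing;
        // log-and-continue so a later realm with all deps bundled can win
      }
    }
    throw new ClassNotFoundException(fqcn);
  }
}
```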

New tooling

  • pinot-plugin-verifier: standalone CLI tool that loads every known plugin through PluginManager and reports pass/fail. Achieves 21/21 passes on the built distribution.

Where shading is kept

| Module | Why |
| --- | --- |
| pinot-clients/pinot-java-client, pinot-clients/pinot-jdbc-client | User-embedded driver libraries — relocation keeps our jackson/guava off the user's classpath |
| pinot-connectors/pinot-spark-3-connector | Runs inside Spark's process — relocation avoids classpath conflict with Spark's own jackson/guava |

Backward compatibility

  • pinot-spi bytecode level (Java 11) and dependency surface are unchanged.
  • Wire protocols and serialization formats are untouched.
  • Third-party plugins using PluginManager.createInstance(FQCN) continue to work — drop the plugin directory into plugins/<name>/ with a pinot-plugin.properties next to it.
  • Third-party plugins using ServiceLoader-based extension points continue to work as long as they ship META-INF/services/<iface-fqcn> (the standard @AutoService output).
  • The @MetricsFactory annotation is preserved but is no longer load-bearing; plugins must add @AutoService(PinotMetricsFactory.class) (or hand-write a service file) to be discovered.
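Concretely, a third-party factory registers itself either with `@AutoService(PinotMetricsFactory.class)` or with a hand-written service file. Assuming the interface lives at `org.apache.pinot.spi.metrics.PinotMetricsFactory` and with a placeholder implementation name, the hand-written file looks like:

```text
# file: src/main/resources/META-INF/services/org.apache.pinot.spi.metrics.PinotMetricsFactory
com.example.metrics.MyPinotMetricsFactory
```

The file contains one implementation FQCN per line; `#` starts a comment, per the standard ServiceLoader format.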

Test plan

  • pinot-plugin-verifier reports 21/21 PASS on the built distribution (./mvnw install -DskipTests && java -jar pinot-plugin-verifier/target/pinot-plugin-verifier-*-jar-with-dependencies.jar --plugins-dir build/plugins)
  • PluginManagerTest#testRealmWalkContinuesPastNoClassDefFoundError — regression test for the NCDFE walk-continuation fix
  • PluginManagerTest#testLoadServicesFindsImplementationsViaServiceLoader — realm-aware service discovery
  • Existing PluginManagerTest suite passes
  • quick-start-batch.sh runs end-to-end from the built binary distribution

🤖 Generated with Claude Code

gortiz and others added 18 commits May 12, 2026 16:29
Plain `ServiceLoader.load(iface)` only consults the thread-context classloader and
silently misses SPI implementations that live in plugin realms. Add a `loadServices`
helper on `PluginManager` that walks the manager's classloader, every Plexus
`ClassRealm`, and every legacy `PluginClassLoader`, deduplicating by FQCN.

Pure addition — existing call sites are unchanged. Refs apache#18386.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Match the per-entry skip-on-error behaviour that ExecutorServiceUtils.forEachExecutorThatLoads
already implements: a malformed META-INF/services entry skips just that entry, not the rest of
the classloader's service file. Refs apache#18386.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…s(X)

Replace direct `ServiceLoader.load(iface)` (thread-context classloader only) with the
realm-aware `PluginManager.get().loadServices(iface)` at every in-tree call site that
discovers SPI implementations. This is a no-op while plugins still ship on the system
classpath; once plugins move into isolated `ClassRealm`s, these call sites continue to
discover them through the realm walk.

Sites migrated:
- IndexService.fromServiceLoader (IndexPlugin)
- ExecutorServiceUtils static init (ExecutorServicePlugin) — also drops the bespoke
  hasNextOrSkip / per-iterator try/catch since loadServices now handles per-entry
  ServiceConfigurationError uniformly.
- ResponseStoreService.fromServiceLoader (ResponseStore)
- RecordEnricherRegistry static init (RecordEnricherFactory)
- OpChainConverterDispatcher.loadConvertersById (OpChainConverter)
- TimeBoundaryStrategyService.fromServiceLoader (TimeBoundaryStrategy)
- LogicalTableConfigSerDeProvider.fromServiceLoader (LogicalTableConfigSerDe)
- CompoundPinotMetricsFactory.Algorithm.SERVICE_LOADER (PinotMetricsFactory)

`SslContextProviderFactory` in pinot-java-client is intentionally NOT migrated: that
client library runs in user processes where there are no plugin realms, so plain
ServiceLoader is the right discovery surface there.

Refs apache#18386.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…Class

`Class.forName(...).getDeclaredConstructor().newInstance()` only consulted the broker's own
classloader and could not see implementations living in plugin realms. Replace the class
lookup with `PluginManager.get().loadClass(...)` which is realm-aware, but keep the explicit
`getDeclaredConstructor().newInstance()` pattern: PluginManager.createInstance internally uses
`getConstructor(...)` which would silently break factories with non-public no-arg
constructors.

Adds AccessControlFactoryLoaderTest covering: default null-config returns AllowAll, explicit
in-tree FQCN resolves, unknown class wraps ClassNotFoundException, package-private
constructor still works (constructor-visibility regression check). Refs apache#18386.
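The constructor-visibility trade-off this commit preserves can be illustrated in isolation (`Factory` below is a stand-in, not a Pinot class):

```java
// getConstructor() only returns public constructors, so a factory with a
// package-private no-arg constructor would silently stop loading if
// createInstance's getConstructor path were used instead.
final class ConstructorVisibilityDemo {
  static class Factory {
    Factory() { }  // package-private no-arg constructor
  }

  static Object createViaDeclared() throws Exception {
    // getDeclaredConstructor sees non-public constructors; truly-private
    // ones would still need setAccessible(true), which is not called here
    return Factory.class.getDeclaredConstructor().newInstance();
  }

  static boolean publicLookupSucceeds() {
    try {
      Factory.class.getConstructor();  // public constructors only
      return true;
    } catch (NoSuchMethodException e) {
      return false;
    }
  }
}
```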

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace the system-classpath reflection scan in `getPinotMetricsFactoryClasses` (and the matching
`Class.forName` lookup for metrics-registry listeners) with `PluginManager.loadServices` /
`PluginManager.loadClass` so plugin metrics factories living in isolated `ClassRealm`s are also
discovered.

The three in-tree metrics factories (Dropwizard, Yammer, Compound) carry both
`@AutoService(PinotMetricsFactory.class)` and `@MetricsFactory`, so the ServiceLoader path finds
the same set of classes the reflection scan used to. The `@MetricsFactory(enabled = ...)` flag
is still honored on the matched implementation.

Listener loading keeps `getDeclaredConstructor()` so non-public no-arg constructors continue to
work (same trade-off as AccessControlFactory).

Refs apache#18386.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
FakeMetricsFactory previously relied solely on the legacy `@MetricsFactory` reflection scan.
After PinotMetricUtils migrated to `PluginManager.loadServices(PinotMetricsFactory.class)`,
the discovery walks `META-INF/services` files; without `@AutoService(PinotMetricsFactory.class)`
generating one, the test fixture is invisible and `PinotMetricUtilsTest` /
`FakeMetricsAbstractMetricsTest` fail with "Failed to initialize PinotMetricsFactory".

Adding `@AutoService` mirrors the migration the proposal recommends for any third-party
`PinotMetricsFactory` impl that previously relied on the annotation scan alone.

Refs apache#18386.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Three follow-ups from the multi-domain code review of the realm-aware discovery work:

- PluginManager.loadServices now snapshots the classloader list under `synchronized (this)`
  before walking it. The previous implementation iterated `_classWorld.getRealms()` and
  `_registry.values()` directly, which races with `init() -> load(...)` that mutates both
  collections. The Javadoc now describes the snapshot semantics explicitly: a plugin
  registered after the snapshot is taken is invisible to that loadServices call, but the
  call cannot throw `ConcurrentModificationException` and cannot return a half-mutated view.

- testLoadServicesDeduplicatesAcrossClassLoaders renamed to
  testLoadServicesIsIdempotentAcrossInvocations — the original name claimed multi-classloader
  dedup coverage that the test body did not actually exercise. The renamed test now also
  asserts the impl-class identity is stable across calls.

- AccessControlFactory.loadFactory comment clarified: getDeclaredConstructor preserves
  package-private constructor support, but truly-private constructors still require
  setAccessible(true), which we do not call — matching the historical Class.forName
  contract exactly.
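The snapshot-under-lock pattern from the first bullet can be sketched like this (field and method names are illustrative, not Pinot's actual ones):

```java
import java.util.ArrayList;
import java.util.List;

// Copy the mutable classloader registry while holding the lock, then walk
// the copy. A loader registered after the snapshot is invisible to that
// walk, but the walk can never throw ConcurrentModificationException or
// observe a half-mutated view.
final class SnapshotRegistry {
  private final List<ClassLoader> registry = new ArrayList<>();

  synchronized void register(ClassLoader cl) {
    registry.add(cl);
  }

  List<ClassLoader> snapshotForWalk() {
    synchronized (this) {
      return new ArrayList<>(registry);
    }
  }
}
```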

Refs apache#18386.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…plugin

`pinot-tools` Java code only imports `AvroUtils` and `AvroRecordReader`, both of which live in
`pinot-avro-base` (the shared library). The actual `pinot-avro` plugin module only contains the
two AvroMessageDecoder classes, which `pinot-tools` references only from a sample JSON config.

Swap the `pinot-avro` dependency for `pinot-avro-base` so the AvroMessageDecoder plugin classes
are no longer pulled into `pinot-distribution`'s shaded uber-jar via `pinot-tools`'s transitive
deps. Their runtime path stays unchanged: PluginManager loads them from
`plugins/pinot-input-format/pinot-avro/` as before.

First concrete step toward Phase 1 of apache#18386 — getting plugin code out of the main
service jar. The remaining input-format / minion-task plugins (`pinot-parquet`, `pinot-json`,
`pinot-minion-builtin-tasks`, etc.) need their own utility-library splits before the same trick
applies; this commit only does the case where the split already exists.

Refs apache#18386.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…-dep stop list

Three changes that together unblock more of Phase 1 of apache#18386:

1. `loadClass(DEFAULT_PLUGIN_NAME, name)` and the no-prefix `createInstance(name)` now walk
   every registered plugin classloader instead of only consulting the DEFAULT
   PluginClassLoader. Order is deterministic (DEFAULT PluginClassLoader → realms in
   ClassWorld registration order → other PluginClassLoader entries); the first match wins.
   When the same FQCN resolves on multiple classloaders, all subsequent matches are logged at
   WARN so operators can disambiguate. Strict `pluginName:className` lookup is unchanged.

   This is the missing piece for callers that pass a bare FQCN (every `createInstance` call
   site in the repo). Once plugin classes are no longer unconditionally on the JVM system
   classpath, those callers can still reach them. Caveat (documented in the WARN message and
   Javadoc): callers that legitimately need strict precedence must use the explicit
   `pluginName:className` form.

2. `PluginManager.load(name, dir)` now also looks for `pinot-plugin.properties` *inside* the
   plugin's jars, falling back to the historical `<dir>/pinot-plugin.properties` placement
   when neither lookup succeeds. Plugin authors can now opt into realm loading by simply
   placing the file under `src/main/resources/` of the plugin module — no per-plugin assembly
   change required. The jar scan is ordered so the jar whose name starts with the plugin
   directory name is inspected first; this keeps the cost ~O(1) for the common case (plugin
   directories with 50–200 dependency jars).

3. `pinot-distribution/pinot-assembly.xml` no longer copies `pinot-json`, `pinot-parquet`, or
   `pinot-minion-builtin-tasks` into the `plugins/` directory: `pinot-tools` imports utility
   classes (`JSONRecordReader`, `ParquetUtils`, `ParquetRecordReader`, `SegmentConversionUtils`)
   from these modules at compile time, so they already live in `pinot-all.jar`. Treating them
   as "strong" library deps avoids duplicating their bytecode at two classloader levels. The
   cleaner refactor (utility code → -base library, plugin classes → plugins/) is left for a
   follow-up; the assembly comments cross-reference the issue.
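The deterministic first-match-wins lookup from point 1 can be sketched as follows (ordering and logging here are illustrative, not Pinot's actual code):

```java
import java.util.List;

// Walk the ordered classloader list, keep the first hit, and warn when the
// same FQCN resolves again on a later loader.
final class BareFqcnLookup {
  static Class<?> resolve(String fqcn, List<ClassLoader> ordered)
      throws ClassNotFoundException {
    Class<?> first = null;
    for (ClassLoader cl : ordered) {
      try {
        Class<?> candidate = Class.forName(fqcn, false, cl);
        if (first == null) {
          first = candidate;  // first match wins
        } else if (candidate != first) {
          System.err.println("WARN: " + fqcn + " also resolves on " + cl
              + "; use pluginName:className for strict precedence");
        }
      } catch (ClassNotFoundException ignored) {
        // this loader doesn't have it; keep walking
      }
    }
    if (first == null) {
      throw new ClassNotFoundException(fqcn);
    }
    return first;
  }
}
```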

Tests: existing 605/605 in pinot-spi, plus the new realm-walk fallback assertion in
`PluginManagerTest.testSimple` covering the bare-FQCN lookup of a class registered via
`PluginManager.load(...)`.

Refs apache#18386.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds `src/main/resources/pinot-plugin.properties` to the 25 plugin modules that get copied to
the distribution's `plugins/` directory but didn't yet declare one (`pinot-dropwizard` already
had one). With the previous commit's PluginManager change that scans inside plugin jars, this
file is now discovered at runtime without any extra distribution-assembly tweaks — placing it
in `src/main/resources/` is enough.

Each file declares `parent.realmId=pinot`, which makes the plugin's ClassRealm fall back to
the parent (Pinot core) classloader for classes outside the plugin's own URLs. This matches
the legacy `PluginClassLoader`'s parent-first behaviour closely enough that plugins that
depend on `pinot-common` / `pinot-core` helper classes today keep working unchanged. Plugins
that genuinely only need `pinot-spi` can switch to the limited-realm form (an empty file)
later for tighter cross-plugin isolation — `pinot-dropwizard` is the existing example.
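The file each module gains is tiny. With `parent.realmId=pinot` taken from this commit (any additional keys are not shown here), a plugin opting into the parent-first form ships:

```properties
# src/main/resources/pinot-plugin.properties
# Fall back to the Pinot core classloader for classes outside the plugin's
# own URLs; an empty file instead gives the tighter limited-realm form.
parent.realmId=pinot
```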

Out of scope (intentionally): plugin modules that are NOT shipped as plugins in the
distribution assembly — `pinot-batch-ingestion-common`, `pinot-batch-ingestion-spark-base`,
`pinot-avro-base`, `pinot-kafka-base` (shared libraries used at compile time only) plus
`pinot-json`, `pinot-parquet`, and `pinot-minion-builtin-tasks` (treated as strong deps in the
previous commit and excluded from `plugins/`). Dropping a properties file in those modules
would be inert today and noise.

Tests: existing 605/605 in pinot-spi continue to pass; plugin packaging for yammer / kafka-3.0
/ csv / s3 succeeds with the assembly profile auto-activating.

Refs apache#18386.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…fication

A standalone main-class tool (with a bash launcher) that exercises every plugin shipped in a
built distribution against the same factory APIs the broker / server / controller / minion
use at runtime. Designed to be launched as a separate JVM with `lib/pinot-all.jar` on its
classpath and `-Dplugins.dir=plugins/` set, so realm isolation is genuinely tested rather
than masked by Maven's all-plugins-on-classpath test layout.

Why a separate process and not a TestNG test:

  Surefire runs inside the build's test classloader, where every plugin module is already a
  transitive compile dep. `-Dplugins.dir=` populates PluginManager realms, but the realm
  copies of classes never actually win against the system-classloader copies the test
  classpath provides. To genuinely exercise realm isolation, the verifier has to run in a
  fresh JVM with the production classpath layout. Surefire forking can do that, but it's
  noisier than a small standalone tool you can also point at any pre-built distribution.

Layout:

  pinot-plugin-verifier/                                 ← new module, deps only on
                                                          pinot-spi + pinot-segment-spi
    src/main/java/org/apache/pinot/verifier/
      PluginVerifier.java                                ← main(), arg parsing
      checks/Check.java                                  ← per-plugin-type contract
      checks/InputFormatCheck.java                       ← RecordReader / MessageDecoder
      checks/PinotFsCheck.java                           ← PinotFS (S3, GCS, ADLS, HDFS)
      checks/StreamConsumerCheck.java                    ← Kafka, Kinesis, Pulsar
      checks/MetricsFactoryCheck.java                    ← PinotMetricsFactory via loadServices
    src/main/resources/bin/verify-plugins.sh             ← bash launcher
    README.md                                            ← usage + design
  pinot-distribution/pinot-assembly.xml                  ← copies the jar into lib/ and the
                                                          script into bin/, so a built
                                                          tarball ships with the verifier

Each check exercises the SAME factory entry point a real Pinot service uses — not a custom
shortcut — so a green run means "this plugin would actually load in production", not "the
class file is on the classpath". The verifier deliberately does NOT connect to external
systems (Kafka, S3); that's what integration tests are for.

Flags:

  --check <types>     Comma-separated subset of: input-format, stream, fs, metrics
  --plugin <fqcn>     Run only the verifier(s) targeting the named FQCN
  --strict-realm      Use pluginName:FQCN form. Default (bare FQCN) goes through the realm
                      walk added by apache#18386; --strict-realm works on both old
                      and new code, useful as a contract check
  --verbose, -v       Print getProtectionDomain().getCodeSource().getLocation() per class
                      so you can see whether each impl came from pinot-all.jar or a plugin
                      realm

Tests: 3 smoke tests in PluginVerifierSmokeTest covering --help, unknown-flag handling, and
"runs without -Dplugins.dir without throwing". Real end-to-end exercise happens by running
`build/bin/verify-plugins.sh` against an assembled tarball — that's task #10 of
apache#18386.

Refs apache#18386.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…e distribution

The pinot-fastdev profile sets shade.phase.prop=none, which suppresses the per-plugin shaded
jar the assembly copies into plugins/<type>/<name>/. Production builds always run with the
profile disabled; local dev environments that auto-activate it via ~/.m2/settings.xml need
to pass -P!pinot-fastdev (or -Dshade.phase.prop=package) explicitly when building the
distribution that verify-plugins.sh will run against.

Refs apache#18386.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…dules

- Create `pinot-json-base` with `JSONRecordReader`/`JSONRecordExtractor` and their
  tests, following the existing `pinot-avro-base` pattern. `pinot-json` now only
  contains `JSONMessageDecoder` and depends on `pinot-json-base`.

- Create `pinot-parquet-base` with `ParquetUtils` and the parquet/hadoop deps it
  needs. `pinot-parquet` (the plugin) now depends on `pinot-parquet-base` instead
  of bundling the utility directly.

- Move `SegmentConversionUtils` from `pinot-minion-builtin-tasks` (plugin scope)
  to `pinot-minion` (core module), package `org.apache.pinot.minion.utils`. Update
  the two executor base classes and `LaunchBackfillIngestionJobCommand` to import
  from the new location.

- Drop `pinot-json`, `pinot-parquet`, `pinot-minion-builtin-tasks`,
  `pinot-confluent-avro`, `pinot-csv`, `pinot-orc`, `pinot-thrift`, `pinot-protobuf`
  from `pinot-tools` compile deps; add `pinot-json-base` and `pinot-parquet-base`
  in their place. Add `kafka-clients` explicitly (was previously leaked via
  `pinot-confluent-avro`'s transitive dep).

- Remove the "strong-dep stop list" workaround from `pinot-assembly.xml`: add
  `pinot-json`, `pinot-parquet`, and `pinot-minion-builtin-tasks` as proper plugin
  entries under `plugins/` in the binary distribution.

- Add `pinot-plugin.properties` (parent.realmId=pinot) to `pinot-json`,
  `pinot-parquet`, and `pinot-minion-builtin-tasks` so they load as ClassRealms.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…de config

Relocations are moved out of the root pom and into pinot-java-client and
pinot-jdbc-client where they belong — those are the only modules that produce
embeddable driver jars where package-renaming is needed to avoid leaking
bundled jackson/guava/scala onto the embedding application's classpath.

Plugin shaded jars continue to be produced (shade.phase.prop=package default
in each plugin pom is untouched), but without relocations. Conflict resolution
for plugin deps now relies on ClassRealm isolation rather than package renaming.

Also removes the unused build-shaded-jar profiles from pinot-spi, pinot-common,
and pinot-core. Nothing in the distribution consumes shaded artifacts from
these modules, so the profiles were dead build weight.

pinot-spark-3-connector already had its own local relocations (shadeBase
prefix) and is unaffected.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… to plugin-zip layout

Each plugin now ships as a plugin-zip (thin jar under classes/ + transitive dep jars at the
plugin directory root) rather than a single shaded uber-jar.  The distribution assembly unpacks
all plugin zips under plugins/<plugin-type>/<plugin-name>/, giving PluginManager proper
classloader-realm isolation without relocation.

Changes:
- pinot-plugin.xml: add ${plugin.type}/ prefix to outputDirectory so zips unpack into the
  correct two-level subdirectory layout (e.g. plugins/pinot-stream-ingestion/pinot-kafka-3.0/)
- assembly-descriptor IT: set plugin.type=test-plugin-type and update expected paths
- 27 plugin poms: add <plugin.type> property; remove shade.phase.prop=package;
  remove build-shaded-jar profiles (kinesis, s3, timeseries-m3ql)
- pinot-avro-base: remove shade activation (library module, not a plugin artifact)
- pinot-distribution pom: add 27 plugin zip deps (scope=provided, type=zip, classifier=plugin)
- pinot-assembly.xml: replace ~25 shaded-jar <file> entries with a single <dependencySet>
  that unpacks all 27 plugin zips into plugins/
- pinot-timeseries-lang parent pom: align plugin.type to pinot-timeseries-lang (was pinot-ts-languages)

NOTE: the distribution directory layout changes — plugin directories move from
  plugins/pinot-kafka-3.0/ to plugins/pinot-stream-ingestion/pinot-kafka-3.0/ etc.
External scripts, Dockerfiles, or Helm charts that hardcode old plugin paths must be updated.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Plugin zips now use a flat single-level layout (pinot-kafka-3.0/classes/...) instead of the
two-level type-prefix layout (pinot-stream-ingestion/pinot-kafka-3.0/classes/...).
PluginManager resolves plugin names from the immediate parent directory of dep jars, so it
doesn't care about the type grouping.

Changes:
- pinot-plugin.xml: replace ${plugin.type}/${project.artifactId} with ${project.artifactId}
- Remove the <plugin.type> property from all plugin parent and leaf poms (37 files)
- Remove the <plugin.type /> default placeholder from pinot-plugins/pom.xml
- Update assembly-descriptor IT test: expected paths no longer carry the type prefix

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- PluginManager: catch NoClassDefFoundError in loadClassFromAnyPlugin realm
  walk, log at DEBUG level, and continue to the next classloader. Previously
  the error propagated uncaught and stopped the entire walk; a class whose
  bytecode lived in the system-classpath copy of a jar (e.g. pinot-all) but
  whose transitive deps were absent would cause the walk to abort instead of
  falling through to the plugin realm that has all deps bundled.
  Add a regression test that verifies ClassNotFoundException (not NCDFE) is
  thrown when every classloader in the walk fails to resolve a class.

- InputFormatCheck: fix wrong FQCN for pinot-confluent-protobuf. The actual
  class is org.apache.pinot.plugin.inputformat.protobuf.KafkaConfluent...
  (no .confluent. sub-package); the old name caused ClassNotFoundException.

- pinot-hdfs/pom.xml: use ${hadoop.dependencies.scope} (compile by default
  via pinot-plugins/pom.xml) for the hadoop-common dependency. The root DM
  declares it as provided, so without an explicit override hadoop-common was
  excluded from the plugin-zip assembly and HadoopPinotFS failed to load.

- pinot-distribution/pom.xml: add an explicit shade <executions> block with
  a hardcoded package phase. The ${shade.phase.prop}=package property in the
  child pom was not overriding the inherited root-pom execution whose phase
  resolves in the root context as none. This left pinot-all-*.jar absent
  when the Gradle build cache expired.

- pinot-thrift/pom.xml: remove the explicit provided scope on libthrift. The
  root DM declares it compile; the child had overridden it to provided for the
  old shading architecture. With the plugin-zip layout libthrift must be
  bundled alongside ThriftRecordReader.

- pinot-batch-ingestion-spark-3/pom.xml: same explicit shade fix as
  pinot-distribution. Delete pinot-plugin.properties from hadoop and spark-3
  batch ingestion modules — these are external-process connectors (Hadoop MR,
  Spark executor) that do not run inside a Pinot service and must not be
  loaded as Pinot realms.

- pinot-tools/pom.xml: add pinot-parquet as a test-scoped dependency so the
  PinotSegmentConverterTest can compile against ParquetRecordReader.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
pinot-kafka-4.0 was inadvertently left out of Phase 2/3: it still had
shade.phase.prop=package (so it was still producing a shaded jar) and was
absent from pinot-distribution's plugin-zip dependency list, meaning Kafka 4.x
deployments would silently fall back to legacy classloader behavior.

- Add pinot-plugin.properties (parent.realmId=pinot) so the module is loaded
  as a ClassRealm by PluginManager, consistent with pinot-kafka-3.0.
- Remove shade.phase.prop=package; the plugin-assembly profile (activated by
  the presence of pinot-plugin.properties) now produces the plugin-zip instead.
- Add pinot-kafka-4.0 as a provided plugin-zip dependency in
  pinot-distribution/pom.xml and include it in pinot-assembly.xml.
- Add KafkaConsumerFactory (kafka40 package) to StreamConsumerCheck so the
  plugin verifier covers it alongside kafka-3.0.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@gortiz gortiz force-pushed the pinot-no-shading branch from d14a087 to 276998f on May 12, 2026 14:29
- pinot-plugin-verifier: convert all /// (JEP-467 Markdown doc) comments to
  standard /** */ Javadoc. The project's checkstyle config does not recognise
  the triple-slash syntax and flags it as 10 violations in CI (the Gradle
  build cache had masked this locally).

- pom.xml: add **/META-INF/services/** to the apache-rat excludes list.
  ServiceLoader registration files list FQCNs only and carry no license header;
  the existing exclusion for **/java.sql.* and **/org.apache.spark.* covers
  the same pattern for specific known files, but the general glob is needed
  for test-resource service files like the one added in pinot-spi.

- PluginManagerTest: replace /// regression-test comment with a plain //
  block comment to stay consistent with the rest of the test file style.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

codecov-commenter commented May 13, 2026

Codecov Report

❌ Patch coverage is 66.90141% with 47 lines in your changes missing coverage. Please review.
✅ Project coverage is 55.86%. Comparing base (7fe517a) to head (ea2edc8).
⚠️ Report is 14 commits behind head on master.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...ava/org/apache/pinot/spi/plugin/PluginManager.java | 69.49% | 29 Missing and 7 partials ⚠️ |
| ...org/apache/pinot/spi/metrics/PinotMetricUtils.java | 64.70% | 5 Missing and 1 partial ⚠️ |
| .../common/utils/LogicalTableConfigSerDeProvider.java | 0.00% | 1 Missing ⚠️ |
| ...g/apache/pinot/segment/spi/index/IndexService.java | 0.00% | 0 Missing and 1 partial ⚠️ |
| ...apache/pinot/spi/cursors/ResponseStoreService.java | 0.00% | 1 Missing ⚠️ |
| ...pache/pinot/spi/executor/ExecutorServiceUtils.java | 0.00% | 1 Missing ⚠️ |
| ...rdtransformer/enricher/RecordEnricherRegistry.java | 0.00% | 1 Missing ⚠️ |

❗ There is a different number of reports uploaded between BASE (7fe517a) and HEAD (ea2edc8).

HEAD has 16 uploads less than BASE.

| Flag | BASE (7fe517a) | HEAD (ea2edc8) |
| --- | --- | --- |
| java-21 | 5 | 1 |
| unittests | 2 | 1 |
| temurin | 5 | 1 |
| unittests2 | 1 | 0 |
| integration | 3 | 0 |
| integration1 | 1 | 0 |
| custom-integration1 | 1 | 0 |
| integration2 | 1 | 0 |
```
@@             Coverage Diff              @@
##             master   #18459      +/-   ##
============================================
- Coverage     63.68%   55.86%   -7.83%     
============================================
  Files          3262     2555     -707     
  Lines        199826   148368   -51458     
  Branches      31031    23971    -7060     
============================================
- Hits         127264    82879   -44385     
+ Misses        62414    58400    -4014     
+ Partials      10148     7089    -3059     
```
| Flag | Coverage Δ |
| --- | --- |
| custom-integration1 | ? |
| integration | ? |
| integration1 | ? |
| integration2 | ? |
| java-21 | 55.86% <66.90%> (-7.83%) ⬇️ |
| temurin | 55.86% <66.90%> (-7.83%) ⬇️ |
| unittests | 55.86% <66.90%> (-7.83%) ⬇️ |
| unittests1 | 55.86% <66.90%> (+0.10%) ⬆️ |
| unittests2 | ? |

Flags with carried forward coverage won't be shown.


apache-rat flags the README as unapproved because it has no license header.
Adding it as an HTML comment, matching the pattern used by other module READMEs
(e.g. pinot-connectors/pinot-spark-3-connector/README.md).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@yashmayya yashmayya added the enhancement (Improvement to existing functionality) and buildAndPackage (Related to build system and packaging) labels on May 14, 2026