[pinot-server/ proactive-query-killing] (2/2) integrate query scan cost based instrumentation and killing in server operators#18475
Codecov Report
@@ Coverage Diff @@
## master #18475 +/- ##
==========================================
Coverage 63.68% 63.68%
- Complexity 1682 1684 +2
==========================================
Files 3262 3266 +4
Lines 199826 200057 +231
Branches 31031 31073 +42
==========================================
+ Hits 127255 127415 +160
- Misses 62421 62468 +47
- Partials 10150 10174 +24
"Query '%s' (requestId=%d) on table '%s' was killed because '%s' (%,d) exceeded the threshold (%,d) "
    + "configured in %s. "
    + "At kill time: entriesScannedInFilter=%,d, docsScanned=%,d, "
    + "entriesScannedPostFilter=%,d, elapsedMs=%d. "
[HIGH] Reporting entriesScannedPostFilter here makes the feature look wired end-to-end, but this PR never actually makes that metric meaningful. None of the new operator instrumentation calls QueryScanCostContext.addEntriesScannedPostFilter(), and the default ScanEntriesThresholdStrategy still ignores the corresponding cluster/table thresholds. Any deployment that sets maxEntriesScannedPostFilter will therefore get a silent no-op guardrail while this message still prints the field. Please either plumb post-filter accounting through the operators and strategy, or drop the advertised threshold from this change.
I have updated the logic to instrument the post-filter scan measurement and to apply the maxEntriesScannedPostFilter threshold in the killing check as well.
Extends QueryConfig with a scanKillingMode field (disabled/logOnly/enforce) that lets individual tables override the cluster-level scan killing mode. Stores the resolved mode as a volatile field on QueryExecutionContext during query init and applies it in QueryKillingManager.checkAndKillWithStrategy() ahead of the cluster config — enabling patterns like cluster=logOnly + table=enforce for targeted enforcement without a cluster-wide mode change. Invalid mode strings are rejected at QueryConfig construction time via Preconditions. Includes 4 new manager tests and 7 new QueryConfig tests. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
QueryKillingManager.onChange() received raw ZK keys with the full "pinot.query.scheduler." prefix but passed them directly to QueryMonitorConfig's update constructor, which checks for keys without that prefix (e.g. "accounting.scan.based.killing.mode"). The mismatch caused all dynamic config changes to be silently ignored, requiring a server restart for scan-killing config to take effect. Strip the prefix before passing to QueryMonitorConfig, matching the key space the init constructor uses. Add early return when no relevant keys changed, and log the applied config values after rebuild. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
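The key-space mismatch described in this commit can be illustrated with a minimal sketch. The class and method names below are hypothetical and are not taken from the PR; only the "pinot.query.scheduler." prefix and the example key come from the commit message. The sketch assumes the change listener receives the set of changed keys plus the full cluster config map.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper for illustration; not the actual Pinot implementation. */
final class SchedulerKeyNormalizer {
  // Prefix quoted in the commit message.
  static final String PREFIX = "pinot.query.scheduler.";

  private SchedulerKeyNormalizer() {
  }

  /**
   * Returns the changed entries that live under PREFIX, keyed WITHOUT the prefix,
   * so they match the key space a prefix-less config constructor would look up.
   * An empty result means no relevant key changed and the caller can return early.
   */
  static Map<String, String> stripPrefix(Set<String> changedKeys, Map<String, String> clusterConfigs) {
    Map<String, String> relevant = new HashMap<>();
    for (String key : changedKeys) {
      if (key.startsWith(PREFIX)) {
        // A deleted key maps to null here; HashMap permits null values.
        relevant.put(key.substring(PREFIX.length()), clusterConfigs.get(key));
      }
    }
    return relevant;
  }
}
```

Filtering and stripping in one pass also gives the early-return condition for free: if the returned map is empty, nothing under the scheduler prefix changed.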
…ck to global when table name unavailable
Follow-up to the SPI changes in #18102
Summary
Scan cost instrumentation across all operators
Instruments 11 query operators and DocIdSetOperator to push numDocsScanned, numEntriesScannedInFilter, and numEntriesScannedPostFilter deltas into the per-query QueryScanCostContext on every block iteration, enabling cooperative scan-based query killing at block boundaries. The getScanCostContext() helper is consolidated into BaseOperator as a protected static method (removing 11 duplicate copies), and checkScanBasedKilling() is called from both checkTermination() and checkTerminationAndSampleUsage() so combine-path operators are also covered.
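As a rough illustration of the delta-push pattern described above, the sketch below uses stand-in classes. ScanCostSketch and DeltaReporter are hypothetical names, and the assumption that operators expose cumulative counters which must be converted to deltas is mine; of the method names, only addEntriesScannedPostFilter() is mentioned in the review thread.

```java
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical stand-in for the per-query QueryScanCostContext. */
final class ScanCostSketch {
  // LongAdder keeps concurrent increments cheap when several operator threads report.
  private final LongAdder _docsScanned = new LongAdder();
  private final LongAdder _entriesScannedPostFilter = new LongAdder();

  void addDocsScanned(long delta) {
    _docsScanned.add(delta);
  }

  void addEntriesScannedPostFilter(long delta) {
    _entriesScannedPostFilter.add(delta);
  }

  long getDocsScanned() {
    return _docsScanned.sum();
  }

  long getEntriesScannedPostFilter() {
    return _entriesScannedPostFilter.sum();
  }
}

/** Hypothetical operator-side helper that turns a cumulative counter into per-block deltas. */
final class DeltaReporter {
  private long _lastReported;

  /** Returns the increase since the previous call and remembers the new cumulative value. */
  long delta(long cumulative) {
    long increase = cumulative - _lastReported;
    _lastReported = cumulative;
    return increase;
  }
}
```

An operator would call something like `ctx.addDocsScanned(reporter.delta(cumulativeDocs))` once per block, so that a check at the next block boundary sees the query-wide total without double counting.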
Per-table scanKillingMode override
Adds a scanKillingMode field to QueryConfig (validated at construction, not silently at query time) allowing individual tables to override the cluster-level kill mode. The resolved mode is stored as a volatile ScanKillingMode on QueryExecutionContext during query initialization and applied in QueryKillingManager.checkAndKillWithStrategy() ahead of the cluster config, enabling patterns like cluster=logOnly + table=enforce for targeted enforcement without a cluster-wide mode change. Invalid mode strings fail fast at table config submission rather than degrading silently at query time.
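The override precedence can be sketched as follows. The enum and resolver names are hypothetical, the exact-match parsing is an assumption, and a plain IllegalArgumentException stands in for the Preconditions check mentioned in the commit message. The three mode strings (disabled, logOnly, enforce) come from the PR text.

```java
/** Hypothetical mirror of the three modes named in the PR: disabled / logOnly / enforce. */
enum ScanKillingModeSketch {
  DISABLED("disabled"), LOG_ONLY("logOnly"), ENFORCE("enforce");

  private final String _configValue;

  ScanKillingModeSketch(String configValue) {
    _configValue = configValue;
  }

  /** Parses a config string; rejects unknown values instead of silently falling back. */
  static ScanKillingModeSketch fromConfigValue(String value) {
    for (ScanKillingModeSketch mode : values()) {
      // Assumption: exact, case-sensitive match.
      if (mode._configValue.equals(value)) {
        return mode;
      }
    }
    throw new IllegalArgumentException("Invalid scanKillingMode: " + value);
  }
}

/** Hypothetical resolver: a table-level value, when present, wins over the cluster mode. */
final class ScanKillingModeResolver {
  private ScanKillingModeResolver() {
  }

  static ScanKillingModeSketch resolve(String tableMode, ScanKillingModeSketch clusterMode) {
    return tableMode == null ? clusterMode : ScanKillingModeSketch.fromConfigValue(tableMode);
  }
}
```

With this shape, `resolve("enforce", ScanKillingModeSketch.LOG_ONLY)` yields ENFORCE for that table only, while tables without an override keep the cluster-wide logOnly behavior.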
QueryKillingManager hardening
Wires QueryKillingManager into BaseServerStarter for singleton initialization and ZK cluster config change listener registration, so scan killing thresholds and mode can be tuned live without server restart. Adds synchronized onChange() for atomic config/strategy rebuilds, a local currentStrategy snapshot in checkAndKillIfNeeded to eliminate a TOCTOU race on config changes, and a warn log when an unexpected type is found in the cached strategy slot.
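The snapshot technique mentioned above can be shown in miniature. This is a simplified sketch with hypothetical names, where a LongPredicate stands in for the real strategy type and a null strategy means killing is disabled; it is not the PR's code.

```java
import java.util.function.LongPredicate;

/** Hypothetical, simplified manager showing the local-snapshot pattern. */
final class KillingManagerSketch {
  // volatile so query threads observe the strategy most recently published by onChange().
  private volatile LongPredicate _strategy;

  KillingManagerSketch(long threshold) {
    _strategy = buildStrategy(threshold);
  }

  // Assumption for this sketch: a non-positive threshold disables killing (null strategy).
  private static LongPredicate buildStrategy(long threshold) {
    return threshold <= 0 ? null : cost -> cost > threshold;
  }

  /** Rebuilds the strategy; synchronized so concurrent config updates do not interleave. */
  synchronized void onChange(long newThreshold) {
    _strategy = buildStrategy(newThreshold);
  }

  boolean shouldKill(long accumulatedCost) {
    // Read the field once. Checking `_strategy != null` and then calling `_strategy.test(...)`
    // would read it twice, and onChange() could null it out in between.
    LongPredicate current = _strategy;
    return current != null && current.test(accumulatedCost);
  }
}
```

The single read means each check evaluates against one consistent strategy, either the old one or the new one, never a mix of the two.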
Design: Two-path kill check
The scan-based killing integration has two cooperating paths:
1. Instrumentation: operators push scan cost deltas into QueryScanCostContext after each block.
2. Kill check (BaseOperator.checkTermination() / checkTerminationAndSampleUsage()): On every nextBlock() call across the operator tree, the killing manager evaluates whether accumulated cost exceeds thresholds and terminates the query if so.
Test plan
QueryKillReportTest (4), QueryKillingManagerTest (20, including 6 new), QueryMonitorConfigScanKillingTest (9), CompositeQueryKillingStrategyTest (8), ScanEntriesThresholdStrategyTest (13)
Functional testing plan on quick start for all scenarios