Add support for window function EXCLUDE clause #18482
Codecov Report
@@ Coverage Diff @@
## master #18482 +/- ##
============================================
+ Coverage 63.68% 63.70% +0.01%
Complexity 1684 1684
============================================
Files 3262 3267 +5
Lines 199835 200191 +356
Branches 31034 31147 +113
============================================
+ Hits 127266 127530 +264
- Misses 62416 62460 +44
- Partials 10153 10201 +48
xiangfu0 left a comment
Found 1 high-signal issue; see inline comment.
int32 lowerBound = 5;
int32 upperBound = 6;
repeated Literal constants = 7;
WindowExclusion exclude = 8;
This extends the broker-to-server plan wire format without any mixed-version guard. A newer broker can now serialize exclude != EXCLUDE_NO_OTHERS to an older server, and that server will just ignore field 8 / the new enum and execute legacy NO_OTHERS semantics. During a rolling upgrade that turns EXCLUDE CURRENT ROW/GROUP/TIES into silently wrong results instead of a rejected query, so this needs a capability/version gate or broker-side fallback before merge.
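One possible shape for the broker-side gate requested above is sketched here. Everything in it is an assumption made for illustration: the class name, the idea that servers advertise an integer plan-format version, and the placeholder threshold are hypothetical and are not Pinot APIs.

```java
// Hypothetical broker-side gate; none of these names are Pinot APIs.
final class ExcludeDispatchGate {
  // Stand-in for the proto enum discussed in the review comment.
  enum WindowExclusion { EXCLUDE_NO_OTHERS, EXCLUDE_CURRENT_ROW, EXCLUDE_GROUP, EXCLUDE_TIES }

  // Assumption: each server advertises an integer plan-format version, and this
  // placeholder is the first version that understands WindowNode field 8.
  static final int MIN_VERSION_WITH_EXCLUDE = 2;

  // Returns true when the plan can be sent to every target server without
  // risking a silent fallback to EXCLUDE_NO_OTHERS semantics.
  static boolean canDispatch(WindowExclusion exclusion, int[] serverVersions) {
    if (exclusion == WindowExclusion.EXCLUDE_NO_OTHERS) {
      return true; // Same wire shape as before, so any server is safe.
    }
    for (int version : serverVersions) {
      if (version < MIN_VERSION_WITH_EXCLUDE) {
        return false; // This server would ignore field 8.
      }
    }
    return true;
  }

  // Rejects the query up front instead of letting an older server return wrong results.
  static void checkDispatch(WindowExclusion exclusion, int[] serverVersions) {
    if (!canDispatch(exclusion, serverVersions)) {
      throw new UnsupportedOperationException(
          "Window EXCLUDE " + exclusion + " requires all servers to support it");
    }
  }
}
```

With a check of this kind, a query using a non-default exclusion fails loudly during a rolling upgrade, while queries without `EXCLUDE` are unaffected.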
Summary
Adds support for the SQL standard `EXCLUDE` clause on window functions, covering all four options:
- `EXCLUDE NO OTHERS` (default; existing behavior preserved)
- `EXCLUDE CURRENT ROW`
- `EXCLUDE GROUP`
- `EXCLUDE TIES`
Supported for the window functions where it is semantically meaningful (`SUM`, `COUNT`, `AVG`, `MIN`, `MAX`, `BOOL_AND`, `BOOL_OR`, `FIRST_VALUE`, `LAST_VALUE`) across both `ROWS` and `RANGE` frames. Ranking functions and `LAG` / `LEAD` continue to be framed implicitly per the SQL standard (Calcite rejects `EXCLUDE` on these at parse time).
Implementation
Plan side:
- `WindowExclusion` proto enum on `WindowNode` (field 8, default 0 = `EXCLUDE_NO_OTHERS` so old serialized plans round-trip safely).
- `RelToPlanNodeConverter` / `PRelToPlanNodeConverter` propagate the exclusion through; the previous `Preconditions.checkState` rejecting non-default exclusions is removed.
- `PlanNodeToRelConverter`, both serde sides, and `PlanNodeMerger` round-trip the new field.
Runtime side:
- `WindowFrame` carries the exclusion; the `WindowFunction` base gains O(n) `computePeerBoundaries` + O(1) `firstNonExcluded` / `lastNonExcluded` helpers. The default `EXCLUDE NO OTHERS` path branches out early so the hot path is unchanged.
- `AggregateWindowFunction` handles ROWS and all four supported RANGE shapes (UU / UC / CU / CC) using a sliding aggregator with per-row apply / unapply correction. Peer bounds are skipped for `EXCLUDE CURRENT ROW` when frame shape allows.
- `FirstValueWindowFunction` / `LastValueWindowFunction` compute the effective first / last index in O(1) per row from peer bounds; `IGNORE NULLS` continues to work.
- `SortedMultisetMinMaxWindowValueAggregator` (TreeMap-backed, O(log K) per op) is selected when EXCLUDE forces per-row corrections. SUM / COUNT / AVG / BOOL_AND / BOOL_OR are commutative under add / remove and reuse the existing aggregators.
Semantics were cross-verified against PostgreSQL.
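To make the four options concrete, here is a minimal sketch of how peer boundaries can drive the exclusion for `SUM`. It is not the PR's code: the class and method names are hypothetical, and it assumes a single partition already sorted by one non-null `long` ORDER BY key, non-null values, and a frame spanning the whole partition (UNBOUNDED PRECEDING to UNBOUNDED FOLLOWING).

```java
// Illustrative sketch only; class and method names are hypothetical, not Pinot's.
final class ExcludeSumSketch {
  enum Exclusion { NO_OTHERS, CURRENT_ROW, GROUP, TIES }

  // Linear scans over rows already sorted by the ORDER BY key.
  // Returns {start, end}: the inclusive first and last index of each row's peer group.
  static int[][] peerBoundaries(long[] orderKeys) {
    int n = orderKeys.length;
    int[] start = new int[n];
    int[] end = new int[n];
    for (int i = 0; i < n; i++) {
      start[i] = (i > 0 && orderKeys[i] == orderKeys[i - 1]) ? start[i - 1] : i;
    }
    for (int i = n - 1; i >= 0; i--) {
      end[i] = (i < n - 1 && orderKeys[i] == orderKeys[i + 1]) ? end[i + 1] : i;
    }
    return new int[][] {start, end};
  }

  // SUM over the whole partition with the exclusion applied per row.
  // A null entry stands for SQL NULL (every row of the frame was excluded).
  static Long[] sumWithExclusion(long[] values, long[] orderKeys, Exclusion exclusion) {
    int n = values.length;
    long[] prefix = new long[n + 1]; // prefix[i] = sum of values[0..i-1]
    for (int i = 0; i < n; i++) {
      prefix[i + 1] = prefix[i] + values[i];
    }
    int[][] peers = peerBoundaries(orderKeys);
    Long[] result = new Long[n];
    for (int i = 0; i < n; i++) {
      int groupSize = peers[1][i] - peers[0][i] + 1;
      long groupSum = prefix[peers[1][i] + 1] - prefix[peers[0][i]];
      long excludedSum;
      int excludedRows;
      switch (exclusion) {
        case CURRENT_ROW: // drop only the current row
          excludedSum = values[i];
          excludedRows = 1;
          break;
        case GROUP: // drop the current row and all of its peers
          excludedSum = groupSum;
          excludedRows = groupSize;
          break;
        case TIES: // drop the peers but keep the current row
          excludedSum = groupSum - values[i];
          excludedRows = groupSize - 1;
          break;
        default: // NO_OTHERS: nothing is excluded
          excludedSum = 0;
          excludedRows = 0;
      }
      result[i] = (excludedRows == n) ? null : Long.valueOf(prefix[n] - excludedSum);
    }
    return result;
  }
}
```

For values `{10, 20, 30, 40}` with order keys `{1, 2, 2, 3}`, this yields `[90, 80, 70, 60]` for `CURRENT_ROW`, `[90, 50, 50, 60]` for `GROUP`, and `[100, 70, 80, 100]` for `TIES`. Subtracting the excluded rows from a running total only works for invertible aggregates such as SUM; MIN and MAX need a structure that supports removal, which is the role the description assigns to the sorted-multiset aggregator.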
Test plan
- `pinot-query-runtime/src/test/resources/queries/WindowFunctions.json` exercising each of the four EXCLUDE options across `SUM` / `COUNT` / `AVG` / `MIN` / `FIRST_VALUE` / `LAST_VALUE`, plus ROWS / all four RANGE shapes / no-ORDER BY. Each expected output was generated from PostgreSQL.
- `SortedMultisetMinMaxWindowValueAggregator` (min / max with duplicates, out-of-order removal, no-op removal of an unknown value, null handling, `BigDecimal`).
- `ResourceBasedQueriesTest`, `WindowAggregateOperatorTest`, and `WindowValueAggregatorTest` suites pass.
- `spotless:apply` / `checkstyle:check` / `license:check` clean.
Backwards / rolling-upgrade notes
The proto field is additive with the standard proto3 zero-default (`EXCLUDE_NO_OTHERS`). New brokers will continue to plan queries without `EXCLUDE` to the same wire shape as today. A new broker that plans a non-default `EXCLUDE` and dispatches to an old server will see the server silently default the field to `EXCLUDE_NO_OTHERS`; servers should be upgraded before brokers if operators expect the new SQL syntax to take effect.