Skip to content

Add support for window function EXCLUDE clause#18482

Open
yashmayya wants to merge 2 commits into
apache:masterfrom
yashmayya:feature/window-function-exclude-clause
Open

Add support for window function EXCLUDE clause#18482
yashmayya wants to merge 2 commits into
apache:masterfrom
yashmayya:feature/window-function-exclude-clause

Conversation

@yashmayya
Copy link
Copy Markdown
Contributor

Summary

Adds support for the SQL standard EXCLUDE clause on window functions, covering all four options:

  • EXCLUDE NO OTHERS (default; existing behavior preserved)
  • EXCLUDE CURRENT ROW
  • EXCLUDE GROUP
  • EXCLUDE TIES

Supported for the window functions where it is semantically meaningful — SUM, COUNT, AVG, MIN, MAX, BOOL_AND, BOOL_OR, FIRST_VALUE, LAST_VALUE — across both ROWS and RANGE frames. Ranking functions and LAG/LEAD continue to be framed implicitly per the SQL standard (Calcite rejects EXCLUDE on these at parse time).

Implementation

Plan side:

  • New WindowExclusion proto enum on WindowNode (field 8, default 0 = EXCLUDE_NO_OTHERS so old serialized plans round-trip safely).
  • RelToPlanNodeConverter / PRelToPlanNodeConverter propagate the exclusion through; the previous Preconditions.checkState rejecting non-default exclusions is removed.
  • PlanNodeToRelConverter, both serde sides, and PlanNodeMerger round-trip the new field.

Runtime side:

  • WindowFrame carries the exclusion; WindowFunction base gains O(n) computePeerBoundaries + O(1) firstNonExcluded / lastNonExcluded helpers. The default EXCLUDE NO OTHERS path branches out early so the hot path is unchanged.
  • AggregateWindowFunction handles ROWS and all four supported RANGE shapes (UU / UC / CU / CC) using a sliding aggregator with per-row apply / unapply correction. Peer bounds are skipped for EXCLUDE CURRENT ROW when frame shape allows.
  • FirstValueWindowFunction / LastValueWindowFunction compute the effective first / last index in O(1) per row from peer bounds; IGNORE NULLS continues to work.
  • Existing monotonic-deque MIN/MAX aggregators don't support arbitrary removal, so a new SortedMultisetMinMaxWindowValueAggregator (TreeMap-backed, O(log K) per op) is selected when EXCLUDE forces per-row corrections. SUM / COUNT / AVG / BOOL_AND / BOOL_OR are commutative under add / remove and reuse the existing aggregators.

Semantics were cross-verified against PostgreSQL.

Test plan

  • 11 new EXCLUDE cases in pinot-query-runtime/src/test/resources/queries/WindowFunctions.json exercising each of the four EXCLUDE options across SUM / COUNT / AVG / MIN / FIRST_VALUE / LAST_VALUE, plus ROWS / all four RANGE shapes / no-ORDER BY. Each expected output was generated from PostgreSQL.
  • Unit tests for SortedMultisetMinMaxWindowValueAggregator (min / max with duplicates, out-of-order removal, no-op removal of an unknown value, null handling, BigDecimal).
  • Full ResourceBasedQueriesTest, WindowAggregateOperatorTest, and WindowValueAggregatorTest suites pass.
  • spotless:apply / checkstyle:check / license:check clean.

Backwards / rolling-upgrade notes

The proto field is additive with the standard proto3 zero-default (EXCLUDE_NO_OTHERS). New brokers will continue to plan queries without EXCLUDE to the same wire shape as today. A new broker that plans a non-default EXCLUDE and dispatches to an old server will see the server silently default the field to EXCLUDE_NO_OTHERS; servers should be upgraded before brokers if operators expect the new SQL syntax to take effect.

@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented May 13, 2026

Codecov Report

❌ Patch coverage is 71.66667% with 85 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.70%. Comparing base (d88887e) to head (bbf5da5).
⚠️ Report is 21 commits behind head on master.

Files with missing lines Patch % Lines
.../query/runtime/operator/window/WindowFunction.java 40.74% 21 Missing and 11 partials ⚠️
...ator/window/aggregate/AggregateWindowFunction.java 86.15% 8 Missing and 10 partials ⚠️
.../query/planner/logical/PlanNodeToRelConverter.java 0.00% 7 Missing ⚠️
...operator/window/value/LastValueWindowFunction.java 68.18% 2 Missing and 5 partials ⚠️
...perator/window/value/FirstValueWindowFunction.java 72.72% 2 Missing and 4 partials ⚠️
...inot/query/planner/serde/PlanNodeDeserializer.java 37.50% 4 Missing and 1 partial ⚠️
.../pinot/query/planner/serde/PlanNodeSerializer.java 28.57% 4 Missing and 1 partial ⚠️
...he/pinot/query/planner/explain/PlanNodeMerger.java 0.00% 2 Missing ⚠️
...pache/pinot/query/planner/plannode/WindowNode.java 80.00% 0 Missing and 1 partial ⚠️
...not/query/runtime/operator/window/WindowFrame.java 80.00% 1 Missing ⚠️
... and 1 more
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #18482      +/-   ##
============================================
+ Coverage     63.68%   63.70%   +0.01%     
  Complexity     1684     1684              
============================================
  Files          3262     3267       +5     
  Lines        199835   200191     +356     
  Branches      31034    31147     +113     
============================================
+ Hits         127266   127530     +264     
- Misses        62416    62460      +44     
- Partials      10153    10201      +48     
Flag Coverage Δ
custom-integration1 100.00% <ø> (ø)
integration 100.00% <ø> (ø)
integration1 100.00% <ø> (ø)
integration2 0.00% <ø> (ø)
java-21 63.70% <71.66%> (+0.01%) ⬆️
temurin 63.70% <71.66%> (+0.01%) ⬆️
unittests 63.70% <71.66%> (+0.01%) ⬆️
unittests1 55.83% <71.66%> (+0.05%) ⬆️
unittests2 34.92% <0.00%> (-0.03%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@yashmayya yashmayya added multi-stage Related to the multi-stage query engine window-functions Related to SQL window functions on the multi-stage query engine labels May 13, 2026
Copy link
Copy Markdown
Contributor

@xiangfu0 xiangfu0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found 1 high-signal issue; see inline comment.

int32 lowerBound = 5;
int32 upperBound = 6;
repeated Literal constants = 7;
WindowExclusion exclude = 8;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This extends the broker-to-server plan wire format without any mixed-version guard. A newer broker can now serialize exclude != EXCLUDE_NO_OTHERS to an older server, and that server will just ignore field 8 / the new enum and execute legacy NO_OTHERS semantics. During a rolling upgrade that turns EXCLUDE CURRENT ROW/GROUP/TIES into silently wrong results instead of a rejected query, so this needs a capability/version gate or broker-side fallback before merge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

multi-stage Related to the multi-stage query engine window-functions Related to SQL window functions on the multi-stage query engine

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants