[multistage] Fix null handling for several aggregation functions #18471
dang-stripe wants to merge 5 commits into
Codecov Report
@@ Coverage Diff @@
## master #18471 +/- ##
============================================
- Coverage 63.68% 63.67% -0.01%
- Complexity 1682 1684 +2
============================================
Files 3262 3262
Lines 199826 199836 +10
Branches 31031 31033 +2
============================================
Hits 127255 127255
+ Misses 62421 62420 -1
- Partials 10150 10161 +11
@@ -205,7 +205,7 @@ public ColumnDataType getFinalResultColumnType() {
  @Override
  public Long extractFinalResult(Long intermediateResult) {
Annotate it @Nullable
      } else {
        _mergeResultHolder = new Object[numFunctions];
        for (int i = 0; i < numFunctions; i++) {
          _mergeResultHolder[i] = aggFunctions[i].extractAggregationResult(
I originally added this in our fork before I saw the fix in #17750. It was intended to be a different way of addressing the same bug.
When all segments are pruned by the broker, the final aggregate stage receives no input blocks so _mergeResultHolder[i] will be null and calling getResult() will trigger an NPE. This felt like a more comprehensive change so we don't need to update each aggregation function for null handling.
But then I realized returning identity for all aggregation types might not be right here (depends on the aggregation) so I've reverted this change.
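To illustrate the concern above, here is a minimal Java sketch. It is not Pinot code: `EmptyInputAggregation`, `Kind`, `emptyResult`, and `finalResult` are hypothetical names. It assumes SQL-style semantics where a count over no rows is 0 while SUM, MIN, and MAX over no rows are null, which is why one blanket fallback value would not fit every function.

```java
import java.util.function.Function;

/** Hypothetical illustration; none of these names exist in Pinot. */
final class EmptyInputAggregation {
  enum Kind { COUNT, SUM, MIN, MAX }

  private EmptyInputAggregation() {}

  /** Result for an aggregation that saw no input rows (assumed SQL-style semantics). */
  static Object emptyResult(Kind kind) {
    // Only the count-based aggregation has a non-null value for empty input.
    return kind == Kind.COUNT ? Long.valueOf(0L) : null;
  }

  /**
   * Extracts the final result, falling back to the per-function empty result
   * when nothing was merged (for example, every segment was pruned).
   */
  static Object finalResult(Kind kind, Object mergedHolder, Function<Object, Object> extractor) {
    if (mergedHolder == null) {
      return emptyResult(kind);
    }
    return extractor.apply(mergedHolder);
  }
}
```

With this shape, the decision about what "no input" means stays next to each function kind rather than in the operator that merges blocks.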
…or pruned-segment safety"
7951dea to d74384d
@@ -102,7 +103,7 @@ public Object extractGroupByResult(GroupByResultHolder groupByResultHolder, int
  }

  @Override
(minor) Also annotate return as @Nullable, same for other places
@yashmayya @Jackie-Jiang I wanted to discuss whether there's a better fix that doesn't require updating every aggregation function for null handling. I checked Postgres, and all aggregate functions return null on empty input except the count-based ones: https://www.postgresql.org/docs/current/functions-aggregate.html
so we could do something like this in |
This is a complementary change to #17750 to address errors with
Cannot invoke "java.util.Set.size()" because "intermediateResult" is null
that we are running into internally. We don't have #17750 pulled down yet, but figured it's worth upstreaming this anyway since it's an extra layer of defense against this shape of error. We also noticed a bug where the count aggregation returned null instead of 0, so we fixed that here too.
This happened to us after rolling out MSE broker pruning, when all segments were pruned for a query, since there's no MSE logic yet to short-circuit the query in the broker. This PR also adds a comment to implement that support.
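The error shape described above can be reproduced in isolation. The following is a minimal sketch, not the code in this PR: `NullSafeDistinctCount` and both method names are hypothetical, and returning 0 for a missing intermediate result is an assumption based on treating distinct count as a count-based function.

```java
import java.util.Set;

/** Hypothetical stand-in for a Set-backed distinct-count function; not a Pinot class. */
final class NullSafeDistinctCount {
  private NullSafeDistinctCount() {}

  /** Unguarded: throws NullPointerException when no intermediate result was produced. */
  static Integer extractUnguarded(Set<?> intermediateResult) {
    return intermediateResult.size();
  }

  /**
   * Guarded: treats a missing intermediate result as zero distinct values.
   * In a real codebase the parameter would carry a @Nullable annotation.
   */
  static Integer extractGuarded(Set<?> intermediateResult) {
    return intermediateResult == null ? 0 : intermediateResult.size();
  }
}
```

A function whose empty-input result should be null rather than 0 would return null from the guard instead, with the return type annotated accordingly.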
cc @yashmayya @Jackie-Jiang @timothy-e