feat: support nullary aggregate UDFs#23038
Open
ametel01 wants to merge 12 commits into
Open
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Which issue does this PR close?
This partially addresses #16453 by supporting the
window_start()/window_end()style zero-argument aggregate UDF workaround discussed in the issue thread. It does not implement projection of non-table virtual columns such asSELECT window_start, window_end, ....Rationale for this change
DataFusion already rewrites SQL
count()/count(*)to a physical count overCOUNT_STAR_EXPANSION, but user-defined aggregate functions with explicit nullary signatures could not reliably flow through planning and physical execution. This blocked using aggregate metadata functions that naturally take no input arguments.What changes are included in this PR?
TypeSignature::Nullary.countconstruction rejected, so callers must use the existingCOUNT_STAR_EXPANSIONrepresentation forcount(*).Are these changes tested?
Yes. Added coverage for:
countconstruction returning a planning error.Local checks run:
cargo fmt --all -- --checkcargo test -p datafusion-functions-aggregate-common groups_accumulatorcargo test -p datafusion --test user_defined_integration test_zero_argument_udafcargo test -p datafusion-physical-plan count_requires_physical_argumentcargo test -p datafusion-sql --test sql_integration plan_zero_argument_aggregate_udfcargo clippy --all-targets --all-features -- -D warningsAre there any user-facing changes?
Yes. DataFusion can now plan and execute aggregate UDFs declared with
Signature::nullary(...), including grouped and filtered aggregate queries.