Skip to content

fix(sql): handle GROUP BY ALL with aliased aggregates#20943

Open
kumarUjjawal wants to merge 1 commit intoapache:mainfrom
kumarUjjawal:fix/sql_group_by
Open

fix(sql): handle GROUP BY ALL with aliased aggregates#20943
kumarUjjawal wants to merge 1 commit intoapache:mainfrom
kumarUjjawal:fix/sql_group_by

Conversation

@kumarUjjawal
Copy link
Contributor

Which issue does this PR close?

Rationale for this change

Issue #20909 reported that:

  • SELECT COUNT(*) FROM t GROUP BY ALL works
  • SELECT COUNT(*) AS c FROM t GROUP BY ALL fails at execution

The failure came from an incorrect plan that grouped by an aggregate alias expression.

What changes are included in this PR?

This PR fixes GROUP BY ALL when the select list has an aliased aggregate like COUNT(*) AS c.

Changes:

  • Update GROUP BY ALL expansion logic to skip any select expression that contains an aggregate, even if it is wrapped in aliases.
  • Add SQL planner tests for:
    • SELECT COUNT(*) FROM ... GROUP BY ALL
    • SELECT COUNT(*) AS c FROM ... GROUP BY ALL
  • Add core execution test for:
    • SELECT COUNT(*) AS c FROM ... GROUP BY ALL

Are these changes tested?

Yes

Are there any user-facing changes?

@github-actions github-actions bot added sql SQL Planner core Core DataFusion crate labels Mar 14, 2026
Copy link
Contributor

@neilconway neilconway left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall lgtm!

}

#[tokio::test]
async fn count_aggregated_group_by_all_alias() -> Result<()> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like this would be better as an SLT test?

}

#[test]
fn select_count_with_group_by_all() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The two tests in this file pass without the fix. Probably best to remove them?

// Filter and collect non-aggregate select expressions
// 'group by all' groups wrt. all select expressions except aggregate expressions.
// Note: aggregate expressions can be nested under one or more aliases
// (e.g. count(1) AS count(*) AS c), so detect aggregates recursively.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is a bit misleading, because that SQL statement doesn't parse. The problem seems to actually occur in practice because of a (nested) alias inserted by the analyzer, which AFAICS you can't do from SQL. I'd probably just remove the change to the comment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core Core DataFusion crate sql SQL Planner

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Incorrectly parsed SQL query containing GROUP BY ALL and an alias in the projection

2 participants