[GLUTEN-12255][VL] Rewrite multi-children Count in window expressions #12256

Open: WangGuangxin wants to merge 2 commits into apache:main from WangGuangxin:window_count

WangGuangxin commented Jun 7, 2026

What changes are proposed in this pull request?

Similar to #4471: Velox supports only count() / count(T) for window functions, so Spark's count(c1, c2, ...) variant must be rewritten into count(if(or(isnull(c1), isnull(c2), ...), null, 1)) so that the WindowExec can still be offloaded. The same rewrite is already handled for AggregateExec.
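
To illustrate why the rewrite preserves results, here is a minimal pure-Python model of the two forms evaluated over a single window partition. It is only a sketch of the semantics, not Gluten or Velox code; the helper names (`count_multi`, `rewritten_arg`, `count_single`) and the sample rows are hypothetical.

```python
from typing import List, Optional, Sequence

Row = Sequence[Optional[int]]


def count_multi(rows: List[Row]) -> int:
    """Model of Spark's count(c1, c2, ...): count rows where every argument is non-null."""
    return sum(1 for row in rows if all(v is not None for v in row))


def rewritten_arg(row: Row) -> Optional[int]:
    """Model of if(or(isnull(c1), isnull(c2), ...), null, 1)."""
    return None if any(v is None for v in row) else 1


def count_single(values: List[Optional[int]]) -> int:
    """Model of the single-argument count(T): count non-null values."""
    return sum(1 for v in values if v is not None)


# One window partition with nulls in either or both columns (made-up data).
partition: List[Row] = [(1, 10), (None, 20), (3, None), (None, None), (5, 50)]

original = count_multi(partition)
rewritten = count_single([rewritten_arg(row) for row in partition])
print(original, rewritten)  # 2 2
```

Only the first and last rows have both columns non-null, so both forms count 2.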

How was this patch tested?

Added more unit tests.

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Copilot

Related issue: #12255

github-actions bot added the CORE (works for Gluten Core) and VELOX labels Jun 7, 2026

github-actions Bot commented Jun 7, 2026

Run Gluten Clickhouse CI on x86

Copilot AI left a comment

Pull request overview

This PR extends Gluten’s physical-plan rewrite rule for multi-argument count to also cover window functions, enabling WindowExec offload to Velox by rewriting Spark’s count(c1, c2, ...) form into an equivalent single-argument count(if(...)) expression.

Changes:

  • Extend RewriteMultiChildrenCount to rewrite multi-children Count inside WindowExec window expressions (in addition to the existing partial-aggregate rewrite).
  • Add Velox backend validation to explicitly reject an un-rewritten multi-argument window count and fall back to vanilla Spark instead of crashing.
  • Add Velox UTs to ensure (a) offload happens after rewrite and (b) results match vanilla Spark semantics for nullable inputs.
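
The rewrite and the validation fallback described above can be sketched on a toy expression tree. This is an illustration under stated assumptions, not the code in RewriteMultiChildrenCount or VeloxBackend: `Expr`, `rewrite_count`, and `validate_window_function` are hypothetical names, and `or` is modeled as n-ary for brevity.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Expr:
    """Toy expression node (hypothetical stand-in for a Catalyst expression)."""
    name: str
    children: Tuple["Expr", ...] = ()

    def __str__(self) -> str:
        if not self.children:
            return self.name
        return f"{self.name}({', '.join(str(c) for c in self.children)})"


def rewrite_count(expr: Expr) -> Expr:
    """Turn count(c1, c2, ...) into count(if(or(isnull(c1), ...), null, 1)).

    Any other expression, including single-child count, is returned unchanged.
    """
    if expr.name != "count" or len(expr.children) < 2:
        return expr
    any_null = Expr("or", tuple(Expr("isnull", (c,)) for c in expr.children))
    guarded = Expr("if", (any_null, Expr("null"), Expr("1")))
    return Expr("count", (guarded,))


def validate_window_function(expr: Expr) -> bool:
    """Toy backend check: False means 'not supported, fall back to vanilla Spark'."""
    return not (expr.name == "count" and len(expr.children) > 1)


multi = Expr("count", (Expr("c1"), Expr("c2")))
rewritten = rewrite_count(multi)
print(rewritten)  # count(if(or(isnull(c1), isnull(c2)), null, 1))
```

In this model the un-rewritten `multi` fails validation while `rewritten` passes, which mirrors the intent of falling back rather than crashing when the rewrite has not been applied.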

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| gluten-substrait/src/main/scala/org/apache/gluten/extension/columnar/rewrite/RewriteMultiChildrenCount.scala | Add WindowExec support and factor out shared “single-child Count” construction logic. |
| backends-velox/src/main/scala/org/apache/gluten/backendsapi/velox/VeloxBackend.scala | Mark multi-arg window Count as unsupported to force fallback when rewrite is not applied. |
| backends-velox/src/test/scala/org/apache/gluten/functions/WindowFunctionsValidateSuite.scala | Add tests covering offload + semantic equivalence for rewritten multi-arg window count. |

WangGuangxin and others added 2 commits June 12, 2026 09:58
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>