[#4884] Fold Reduce_*(axes) op when all reduced axes have length 1 #4885

Open
itikhono wants to merge 1 commit into ROCm:develop from itikhono:simplify-reduce-no-op

@itikhono
Contributor

Motivation

migraphx-driver compile --gpu crashes on graphs where a Reduce* op reduces only over axes of length 1: the op reaches GPU lowering as a fused_reduce<{N, 1}>, for which there is no valid HIP kernel. This hits any YOLO26-pose export from Ultralytics, whose post-processing emits Slice → ReduceMax(axes=[-1]) over [1, 8400, 1].

Repro (real model)

pip install ultralytics onnx
python -c "from ultralytics import YOLO; YOLO('yolo26s-pose.pt').export(format='onnx', imgsz=640, simplify=False, device='cpu')"

migraphx-driver compile --onnx yolo26s-pose.onnx --gpu
# -> crashes in HIP codegen of fused_reduce<{8400, 1}>

See issue #4884 for more details

Fix

New find_reduce_no_op matcher in simplify_algebra for reduce_max / min / sum / prod / mean / any / all. Replaces the op with its input when every entry of axes resolves to a unit dim on a static shape. Skips dynamic shapes and the 2-input (runtime-axes) form.
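The folding condition can be sketched outside MIGraphX as a plain predicate over dimension lengths. This is an illustration only: `reduces_only_unit_axes` is a hypothetical name, not a function from the PR, and it models neither the dynamic-shape skip nor the 2-input skip, which depend on MIGraphX types.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not MIGraphX API): returns true when every entry of
// `axes` names a dimension of length 1. Negative axes count from the back.
bool reduces_only_unit_axes(const std::vector<std::size_t>& lens,
                            const std::vector<std::int64_t>& axes)
{
    // An empty axes list is left alone, mirroring the matcher's early return.
    if(axes.empty())
        return false;

    const auto rank = static_cast<std::int64_t>(lens.size());
    return std::all_of(axes.begin(), axes.end(), [&](std::int64_t a) {
        if(a < 0)
            a += rank; // normalize a negative axis
        // Out-of-range axes never count as foldable.
        return a >= 0 and a < rank and lens[static_cast<std::size_t>(a)] == 1;
    });
}
```

With this sketch, lens `{1, 8400, 1}` and axes `{-1}` satisfy the predicate, while lens `{1, 21, 1}` and axes `{1, 2}` do not, because axis 1 has length 21.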

simplify_reshapes::find_nop_reshapes already lists reduce ops but compares full shapes including strides. After Slice the input has non-canonical strides while the reduce output is canonical, so it skips the reduce. We can't relax that check (other ops in the same matcher rely on strides). find_reduce_no_op does a reduce-only, lens-only check.
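The stride mismatch can be illustrated with a toy shape type. Everything here is an assumption made for the example: `toy_shape`, `packed_strides`, `same_full_shape` and `same_lens` are hypothetical names rather than MIGraphX API, and the parent tensor `{1, 8400, 5}` is an invented width for the tensor being sliced.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a tensor shape: dimension lengths plus element strides.
struct toy_shape
{
    std::vector<std::size_t> lens;
    std::vector<std::size_t> strides;
};

// Row-major (canonical) strides for the given lens.
std::vector<std::size_t> packed_strides(const std::vector<std::size_t>& lens)
{
    std::vector<std::size_t> strides(lens.size(), 1);
    for(std::size_t i = lens.size(); i > 1; --i)
        strides[i - 2] = strides[i - 1] * lens[i - 1];
    return strides;
}

// Full comparison: lens and strides must both match.
bool same_full_shape(const toy_shape& a, const toy_shape& b)
{
    return a.lens == b.lens and a.strides == b.strides;
}

// Lens-only comparison: strides are ignored.
bool same_lens(const toy_shape& a, const toy_shape& b) { return a.lens == b.lens; }
```

Under these assumptions, a slice of the last axis keeps the parent strides `{42000, 5, 1}` with lens `{1, 8400, 1}`, while the reduce output has the same lens with canonical strides `{8400, 1, 1}`. The full comparison therefore reports a difference and the lens-only comparison reports a match, which is the distinction the paragraph above describes.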

Tests

test/simplify_algebra_test.cpp:

  • folds: _singleton_axis, _negative_axis, _multi_axes, _yolo_pose_shape ({1, 8400, 1})
  • kept: _keeps_real_reduce, _keeps_partial_singleton
  • skipped: _skips_dynamic_input, _skips_variable_axes

End-to-end compile --gpu on yolo26s-pose.onnx and the minimal repro now completes.

Changelog Category

    • Resolved Issues: Known issues from a previous version that have been resolved.

Copilot AI review requested due to automatic review settings May 14, 2026 18:16
@itikhono itikhono requested a review from causten as a code owner May 14, 2026 18:16
@causten causten requested a review from CharlieL7 May 14, 2026 18:18
Contributor

Copilot AI left a comment

Pull request overview

This PR addresses a GPU compilation failure where Reduce* ops that reduce only over unit-length axes (i.e., mathematically no-ops) can reach GPU lowering as fused_reduce<{N,1}> and crash due to missing valid HIP kernel configurations (reported in #4884, observed in YOLO26-pose exports). The fix introduces a simplify_algebra matcher that removes these no-op reductions when the reduced axes are statically known to have length 1.

Changes:

  • Add find_reduce_no_op to simplify_algebra to fold reduce_* ops into their input when all reduced axes are singleton and the input shape is static (skipping dynamic shapes and runtime-axes form).
  • Add unit tests covering common folding/keeping/skipping scenarios and a YOLO pose-shape regression case.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File | Description
src/simplify_algebra.cpp | Adds a new matcher (find_reduce_no_op) and wires it into simplify_algebra’s match pipeline to eliminate singleton-axis reductions before GPU lowering.
test/simplify_algebra_test.cpp | Adds targeted tests asserting when singleton-axis reductions should fold vs. be preserved/skipped.

Comment thread src/simplify_algebra.cpp
Comment on lines +2378 to +2386
auto matcher() const
{
    return match::name("reduce_max",
                       "reduce_min",
                       "reduce_sum",
                       "reduce_prod",
                       "reduce_mean",
                       "reduce_any",
                       "reduce_all");
Comment thread src/simplify_algebra.cpp
Comment on lines +2401 to +2413
auto axes = ins->get_operator().to_value()["axes"].to_vector<std::int64_t>();
if(axes.empty())
    return;

const auto& lens = sh.lens();
const auto rank = static_cast<std::int64_t>(lens.size());
const bool all_singleton = std::all_of(axes.begin(), axes.end(), [&](std::int64_t a) {
    if(a < 0)
        a += rank;
    return a >= 0 and a < rank and lens[a] == 1;
});
if(all_singleton)
    m.replace_instruction(ins, in);
Comment on lines +5389 to +5408
TEST_CASE(simplify_reduce_no_op_singleton_axis)
{
    check_reduce_folds("reduce_max", {1, 21, 1}, {2});
}

TEST_CASE(simplify_reduce_no_op_negative_axis)
{
    check_reduce_folds("reduce_sum", {1, 21, 1}, {-1});
}

TEST_CASE(simplify_reduce_no_op_multi_axes)
{
    check_reduce_folds("reduce_mean", {1, 1, 21, 1}, {0, 1, 3});
}

// yolo*-pose regression: without the fold, GPU lowering JIT-fails on this shape.
TEST_CASE(simplify_reduce_no_op_yolo_pose_shape)
{
    check_reduce_folds("reduce_max", {1, 8400, 1}, {-1});
}
@causten causten requested a review from pfultz2 May 14, 2026 18:24
@pfultz2
Collaborator

pfultz2 commented May 14, 2026

#4841 already fixes this issue by extending find_nop_reshapes.

@codecov

codecov Bot commented May 14, 2026

Codecov Report

❌ Patch coverage is 95.83333% with 1 line in your changes missing coverage. Please review.

Files with missing lines | Patch % | Lines
src/simplify_algebra.cpp | 95.83% | 1 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff            @@
##           develop    #4885   +/-   ##
========================================
  Coverage    92.86%   92.86%           
========================================
  Files          585      585           
  Lines        30152    30213   +61     
========================================
+ Hits         27998    28056   +58     
- Misses        2154     2157    +3     
Files with missing lines | Coverage Δ
src/simplify_algebra.cpp | 98.42% <95.83%> (-0.05%) ⬇️

... and 6 files with indirect coverage changes


@itikhono
Contributor Author

> #4841 already fixes this issue by extending find_nop_reshapes.

Ok, I will double check today

@itikhono
Contributor Author

> #4841 already fixes this issue by extending find_nop_reshapes.
>
> Ok, I will double check today

It seems that #4841 resolves the issue.
I will close my PR when #4841 is merged.

@kahmed10
Collaborator

> #4841 already fixes this issue by extending find_nop_reshapes.
>
> Ok, I will double check today
>
> It seems that #4841 resolves the issue. I will close my PR when #4841 is merged.

Please try now that it's merged
