[hipDNN] Add RMSNorm backward CPU reference implementation#7494

Merged
saikubairkota merged 5 commits into develop from streamhpc/saikubairkota/rmsnorm-backward-cpu-ref
May 29, 2026

Contributor

@saikubairkota saikubairkota commented May 15, 2026

Motivation

This PR adds the RMSNorm backward CPU reference implementation to the hipDNN test SDK.

Technical Details

  • Added the CPU reference implementation of the RMSNorm backward operation.
  • Implemented CpuFpReferenceRMSNorm::backward(), which calculates the gradients dx, dscale, and dbias.
  • Added RMSNormBwdPlan, RMSNormBwdPlanBuilder, and RMSNormBwdSignatureKey for executing RMSNorm backward graph operations.
  • Added unit tests for the plan, signature key, and the CPU reference implementation.
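
For readers unfamiliar with the math, the gradients named above can be sketched as follows. This is a minimal illustration, not the code from this PR: it assumes a row-major [rows, cols] tensor normalized across cols, with per-column scale and bias, and a forward pass of the form y = x * invRms * scale + bias where invRms = 1 / sqrt(mean(x^2) + epsilon). The names RmsNormGrads and rmsNormBackwardSketch are hypothetical, and the actual CpuFpReferenceRMSNorm::backward() signature, tensor layout, and data types may differ.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical result holder for this sketch; not a hipDNN type.
struct RmsNormGrads
{
    std::vector<double> dx;     // [rows * cols], gradient w.r.t. the input
    std::vector<double> dscale; // [cols], gradient w.r.t. the scale
    std::vector<double> dbias;  // [cols], gradient w.r.t. the bias
};

// Assumed forward pass (an assumption of this sketch, not taken from the PR):
//   invRms[r] = 1 / sqrt(mean_c(x[r][c]^2) + epsilon)
//   y[r][c]   = x[r][c] * invRms[r] * scale[c] + bias[c]
RmsNormGrads rmsNormBackwardSketch(const std::vector<double>& x,
                                   const std::vector<double>& dy,
                                   const std::vector<double>& scale,
                                   std::size_t rows,
                                   std::size_t cols,
                                   double epsilon)
{
    assert(cols > 0);
    assert(x.size() == rows * cols);
    assert(dy.size() == rows * cols);
    assert(scale.size() == cols);

    RmsNormGrads grads{std::vector<double>(rows * cols, 0.0),
                       std::vector<double>(cols, 0.0),
                       std::vector<double>(cols, 0.0)};

    for(std::size_t r = 0; r < rows; ++r)
    {
        const std::size_t base = r * cols;

        // First pass over the row: sum of squares and sum of dy * scale * x.
        double sumSquares = 0.0;
        double dot = 0.0;
        for(std::size_t c = 0; c < cols; ++c)
        {
            const double xv = x[base + c];
            sumSquares += xv * xv;
            dot += dy[base + c] * scale[c] * xv;
        }

        const double n = static_cast<double>(cols);
        const double invRms = 1.0 / std::sqrt(sumSquares / n + epsilon);
        // mean_c(dy * scale * x) * invRms^2, shared by every element of the row.
        const double correction = dot / n * invRms * invRms;

        // Second pass: write dx and accumulate dscale and dbias across rows.
        for(std::size_t c = 0; c < cols; ++c)
        {
            const double xv = x[base + c];
            const double dyv = dy[base + c];
            grads.dx[base + c] = invRms * (dyv * scale[c] - xv * correction);
            grads.dscale[c] += dyv * xv * invRms;
            grads.dbias[c] += dyv;
        }
    }
    return grads;
}
```

Under these assumptions, dbias is a plain column-wise sum of dy, dscale is the column-wise sum of dy times the normalized input, and dx combines the direct term with a per-row correction that accounts for invRms depending on every element of the row.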

Test Plan

Build hipDNN and run the relevant unit tests with ./bin/hipdnn_test_sdk_tests --gtest_filter="*RMSNormBwd*".

Test Result

All relevant unit tests pass successfully on an MI210.

  • ninja check - full test suite

Submission Checklist

@saikubairkota saikubairkota requested a review from a team as a code owner May 15, 2026 07:21
@saikubairkota saikubairkota self-assigned this May 15, 2026
@saikubairkota saikubairkota added the organization: streamhpc contributors from streamhpc label May 15, 2026

codecov-commenter commented May 15, 2026

Codecov Report

❌ Patch coverage is 83.48624% with 54 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...ities/cpu_graph_executor/detail/RMSNormBwdPlan.hpp | 70.30% | 14 Missing and 16 partials ⚠️ |
| ...ipdnn_test_sdk/utilities/CpuFpReferenceRMSNorm.hpp | 89.92% | 9 Missing and 3 partials ⚠️ |
| ...u_graph_executor/detail/RMSNormBwdSignatureKey.hpp | 88.57% | 8 Missing and 4 partials ⚠️ |

❌ Your project status has failed because the head coverage (77.83%) is below the target coverage (80.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #7494      +/-   ##
===========================================
+ Coverage    61.93%   61.95%   +0.02%     
===========================================
  Files         2084     2086       +2     
  Lines       357263   357576     +313     
  Branches     54006    54043      +37     
===========================================
+ Hits        221253   221521     +268     
- Misses      117207   117231      +24     
- Partials     18803    18824      +21     
| Flag | Coverage Δ | *Carryforward flag |
|---|---|---|
| TensileLite | 25.95% <ø> (ø) | Carriedforward from 06b571e |
| hipBLAS | 90.65% <ø> (ø) | Carriedforward from 06b571e |
| hipBLASLt | 41.27% <ø> (ø) | Carriedforward from 06b571e |
| hipCUB | 82.21% <ø> (ø) | Carriedforward from 06b571e |
| hipDNN | 86.89% <83.49%> (-0.01%) ⬇️ | |
| hipFFT | 50.00% <ø> (ø) | Carriedforward from 06b571e |
| hipRAND | 76.12% <ø> (ø) | Carriedforward from 06b571e |
| hipSOLVER | 69.24% <ø> (ø) | Carriedforward from 06b571e |
| hipSPARSE | 85.42% <ø> (ø) | Carriedforward from 06b571e |
| rocBLAS | 48.09% <ø> (ø) | Carriedforward from 06b571e |
| rocFFT | 52.07% <ø> (ø) | Carriedforward from 06b571e |
| rocRAND | 57.04% <ø> (ø) | Carriedforward from 06b571e |
| rocSOLVER | 77.83% <ø> (ø) | Carriedforward from 06b571e |
| rocSPARSE | 72.68% <ø> (ø) | Carriedforward from 06b571e |

*This pull request uses carry forward flags.

| Files with missing lines | Coverage Δ |
|---|---|
| ...s/cpu_graph_executor/CpuReferenceGraphExecutor.hpp | 82.14% <100.00%> (+0.44%) ⬆️ |
| .../cpu_graph_executor/detail/PlanBuilderRegistry.hpp | 100.00% <ø> (ø) |
| ...graph_executor/detail/PlanRegistrySignatureKey.hpp | 100.00% <ø> (ø) |
| ...ipdnn_test_sdk/utilities/CpuFpReferenceRMSNorm.hpp | 90.71% <89.92%> (+4.81%) ⬆️ |
| ...u_graph_executor/detail/RMSNormBwdSignatureKey.hpp | 88.57% <88.57%> (ø) |
| ...ities/cpu_graph_executor/detail/RMSNormBwdPlan.hpp | 70.30% <70.30%> (ø) |

... and 2 files with indirect coverage changes

@EwanC EwanC requested review from BalintCsala and EwanC May 15, 2026 12:06
@saikubairkota saikubairkota force-pushed the streamhpc/saikubairkota/rmsnorm-backward-cpu-ref branch from 3ccd5f5 to 7dc6f79 Compare May 18, 2026 08:02
Contributor

@EwanC EwanC left a comment

LGTM

@saikubairkota saikubairkota force-pushed the streamhpc/saikubairkota/rmsnorm-backward-cpu-ref branch from 7dc6f79 to 66fda72 Compare May 21, 2026 07:08
Member

@BalintCsala BalintCsala left a comment

LGTM

@saikubairkota saikubairkota force-pushed the streamhpc/saikubairkota/rmsnorm-backward-cpu-ref branch 4 times, most recently from 77761aa to 88871d4 Compare May 26, 2026 10:08
@adickin-amd
Contributor

It looked like CI hit some infra errors on your PR. I merged in develop to get it to run again.

Contributor

@adickin-amd adickin-amd left a comment

LGTM. The miopen-provider failure is unrelated to these changes. The code coverage issue also looks unrelated. I've brought up the code coverage issue with hipdnn-core to discuss whether we can exclude mocks from coverage.

@saikubairkota saikubairkota force-pushed the streamhpc/saikubairkota/rmsnorm-backward-cpu-ref branch 3 times, most recently from 0e2136d to 2d62162 Compare May 27, 2026 16:31
@saikubairkota saikubairkota force-pushed the streamhpc/saikubairkota/rmsnorm-backward-cpu-ref branch from 2d62162 to 89bafdb Compare May 28, 2026 06:52
@saikubairkota saikubairkota merged commit c7d49aa into develop May 29, 2026
32 checks passed
@saikubairkota saikubairkota deleted the streamhpc/saikubairkota/rmsnorm-backward-cpu-ref branch May 29, 2026 07:09
saikubairkota added a commit that referenced this pull request May 29, 2026
…channel last support (#7702)

**Caution**: This PR should be merged only after [this
PR](#7494) is merged.

## Motivation

This PR implements the RMSNorm backward kernels and RMSNorm channel-last
support for both forward and backward operations in the hip kernel
provider.

## Technical Details

- Adds the RMSNorm backward kernels and makes relevant changes in
`RMSnormBwdPlan` to compile and launch the kernels.
- Adds channel last support for both `RMSnormFwd` and `RMSnormBwd`
operations.
- Adds/updates unit tests and integration tests to test the changes
introduced in this PR.

## Test Plan

Build the plugin and run the unit and integration tests with `ninja
check`.

## Test Result

All unit and integration tests pass successfully on an MI210.

## Submission Checklist

- [ ] Look over the contributing guidelines at
https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.