Make MoE dispatch/MLP expert-axis batch sharding configurable (fix Mixtral EP throughput) #4179

Open
gulsumgudukbay wants to merge 4 commits into AI-Hypercomputer:main from ROCm:fix-moe-expert-parallel-sharding

Address review: peel 'expert' from MoE dispatch/MLP batch dim instead…

d16d561
Google CLA / cla/google succeeded Jun 18, 2026 in 8s

✅ All contributors are covered under a CLA with Google

See https://cla.developers.google.com/ for more info about Google's Contributor License Agreement (CLA).

ℹ️ Googlers: Go here to view more details and manage scans for this pull request.

Details

The following contributors were found for this pull request:

d16d561 Author: @gulsumgudukbay <gu****ay@gmail.com>

(Only the first commit for a unique contributor is listed.)