Fetch from nvidia Megatron-LM #5

Open
RaymondLi0 wants to merge 7011 commits into ElementAI:load-iter from NVIDIA:main

@RaymondLi0

No description provided.

tdene and others added 29 commits March 4, 2026 20:47
Co-authored-by: Siddharth Singh <sidsingh@nvidia.com>
Signed-off-by: qiyuw <qiyuw@nvidia.com>
Co-authored-by: qiyuw <qiyuw@nvidia.com>
Co-authored-by: oliver könig <okoenig@nvidia.com>
Co-authored-by: Kunlun Li <94586211+kunlunl@users.noreply.github.com>
…unit tests. (#3524)

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Deepak Narayanan <dnarayanan@nvidia.com>
Signed-off-by: oliver könig <okoenig@nvidia.com>
Signed-off-by: Keshav Santhanam <ksanthanam@nvidia.com>
Signed-off-by: oliver könig <okoenig@nvidia.com>
Signed-off-by: oliver könig <okoenig@nvidia.com>
Signed-off-by: oliver könig <okoenig@nvidia.com>
Signed-off-by: Keshav Santhanam <ksanthanam@nvidia.com>
Signed-off-by: oliver könig <okoenig@nvidia.com>
Signed-off-by: oliver könig <okoenig@nvidia.com>
…`. (#2639)

Co-authored-by: Philip Petrakian <ppetrakian@nvidia.com>
Signed-off-by: oliver könig <okoenig@nvidia.com>
Signed-off-by: Lifu Zhang <lifuz@login-lyris01.lyris.clusters.nvidia.com>
Signed-off-by: Lifu Zhang <lifuz@login-lyris02.lyris.clusters.nvidia.com>
Co-authored-by: Lifu Zhang <lifuz@login-lyris01.lyris.clusters.nvidia.com>
Co-authored-by: Lifu Zhang <lifuz@login-lyris02.lyris.clusters.nvidia.com>
Co-authored-by: Philip Petrakian <ppetrakian@nvidia.com>
Co-authored-by: Philip Petrakian <ppetrakian@nvidia.com>
Signed-off-by: Keshav Santhanam <ksanthanam@nvidia.com>
Co-authored-by: Philip Petrakian <ppetrakian@nvidia.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
Co-authored-by: Philip Petrakian <ppetrakian@nvidia.com>
skyw and others added 30 commits April 2, 2026 23:50
Signed-off-by: Hao Wu <skyw@nvidia.com>
…4114)

Signed-off-by: Keshav Santhanam <ksanthanam@nvidia.com>
Signed-off-by: dimapihtar <dpihtar@gmail.com>
Signed-off-by: dimapihtar <dpykhtar@nvidia.com>
Signed-off-by: dimapihtar <dpihtar@gmail.com>
Signed-off-by: dimapihtar <dpykhtar@nvidia.com>
Signed-off-by: Hao Wu <skyw@nvidia.com>
Co-authored-by: Cory Ye <44509866+cspades@users.noreply.github.com>
Signed-off-by: Akshat Kumar <akshat230405@gmail.com>
…#4084)

Signed-off-by: Deyu Fu <deyuf@nvidia.com>
Co-authored-by: Tom Long <tolong@oci-hsg-cs-001-vscode-02.cm.cluster>
Co-authored-by: yaoyu-33 <yaoyu.094@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Deyu Fu <deyuf@nvidia.com>
…dance (#4035)

Signed-off-by: oliver könig <okoenig@nvidia.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: oliver könig <okoenig@nvidia.com>
Signed-off-by: oliver könig <okoenig@nvidia.com>
Signed-off-by: Maanu Grover <maanug@nvidia.com>
Co-authored-by: Antoni-Joan Solergibert <asolergibert@nvidia.com>
Co-authored-by: Philip Petrakian <ppetrakian@nvidia.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…4140)

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…dlock (#4139)

Signed-off-by: oliver könig <okoenig@nvidia.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Hao Wu <skyw@nvidia.com>
Signed-off-by: meg miranda <mmiranda@nvidia.com>
…ubscriptable`) by not saving a checkpoint after a transient NaN / Inf (#3981)

Co-authored-by: Dmytro Pykhtar <37850217+dimapihtar@users.noreply.github.com>
Signed-off-by: Hao Wu <skyw@nvidia.com>
Signed-off-by: Cory Ye <cye@nvidia.com>
Co-authored-by: Cory Ye <cye@nvidia.com>
Co-authored-by: conver334 <conver334@gmail.com>
Signed-off-by: Faradawn Yang <73060648+faradawn@users.noreply.github.com>
Signed-off-by: Keshav Santhanam <ksanthanam@nvidia.com>
…graphs) (#4085)

Signed-off-by: Keshav Santhanam <ksanthanam@nvidia.com>