
Build Automodel compiled dependencies in CI image#15737

Merged
pzelasko merged 34 commits into main from codex/automodel-compiled-deps
Jun 3, 2026

Conversation

@pzelasko
Collaborator

Summary

  • Add a compiled-deps wheel stage to docker/Dockerfile.ci for NeMo Automodel packages.
  • Build and install TransformerEngine, DeepEP V1, nv-grouped-gemm, causal-conv1d, mamba-ssm, and flash-attn into CI Dockerfile images.
  • Add GPU_TARGET with default h100plus and a100 option; A100 applies a DeepEP V1 patch to disable NVSHMEM.
  • Keep CI resource settings and Docker runtime resource limits unchanged.
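
The GPU_TARGET behavior described above can be sketched roughly as follows. This is not the logic from docker/Dockerfile.ci, only an illustration: the function name `resolve_gpu_target` and the `disable_nvshmem` key are hypothetical, and the only details taken from the summary are the two target names, the h100plus default, and the fact that the a100 target applies the DeepEP V1 patch that disables NVSHMEM.

```python
import os

# Target names and the default come from the PR summary.
# The settings dictionary is an illustrative assumption, not the real build config.
KNOWN_TARGETS = {
    "h100plus": {"disable_nvshmem": False},
    "a100": {"disable_nvshmem": True},  # a100 applies the DeepEP V1 patch
}
DEFAULT_TARGET = "h100plus"


def resolve_gpu_target(env=None):
    """Return (target, settings) for GPU_TARGET, rejecting unknown values."""
    env = os.environ if env is None else env
    target = env.get("GPU_TARGET", DEFAULT_TARGET)
    if target not in KNOWN_TARGETS:
        raise ValueError(
            f"Unsupported GPU_TARGET={target!r}; expected one of {sorted(KNOWN_TARGETS)}"
        )
    return target, KNOWN_TARGETS[target]
```

Failing early on an unknown value, rather than silently falling back to the default, would keep a mistyped build argument from producing an image built for the wrong GPU family.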

Validation

  • bash -n over Dockerfile heredoc snippets
  • git diff --check
  • docker buildx build --call=check -f docker/Dockerfile.ci .
  • Local constrained builds/import smoke tests were run before the final review cleanup for both GPU_TARGET=h100plus and GPU_TARGET=a100.
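
An import smoke test of the kind mentioned above might look like the sketch below. It is not the test added in this PR. The helper name `failed_imports` is hypothetical, and the import names in `COMPILED_MODULES` are assumptions inferred from the package names in the summary; they may differ from what the actual smoke test checks.

```python
import importlib

# Import names are assumptions inferred from the package names in the PR summary.
COMPILED_MODULES = [
    "transformer_engine",
    "deep_ep",
    "grouped_gemm",
    "causal_conv1d",
    "mamba_ssm",
    "flash_attn",
]


def failed_imports(module_names):
    """Try to import each module and return {name: error message} for failures."""
    failures = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception as exc:  # compiled extensions can fail with more than ImportError
            failures[name] = f"{type(exc).__name__}: {exc}"
    return failures
```

Inside a built image, calling `failed_imports(COMPILED_MODULES)` and exiting non-zero when the result is non-empty would be enough to flag a wheel that installed but cannot be loaded, for example because of a missing shared library.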

chtruong814 and others added 28 commits May 13, 2026 13:40
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
This reverts commit 8c5a48e.

Signed-off-by: Charlie Truong <chtruong@nvidia.com>
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>
Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>
Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>
Use decoder vocab size when generating synthetic TDT transcript labels so duration outputs from the joint are not sampled as labels.

Move CUDA graph compile exception types into cuda_python_utils per review feedback.

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>
@pzelasko pzelasko requested a review from a team as a code owner May 29, 2026 16:57
@copy-pr-bot

copy-pr-bot Bot commented May 29, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

chtruong814
chtruong814 previously approved these changes Jun 1, 2026
Collaborator

@chtruong814 chtruong814 left a comment

Yeah, I guess if we have conditional patches for what has to build from source, we probably need to do it the way you have. I can take a look later to see if we can push any more into the pyproject and uv lock. But this should be fine as well.

pzelasko added 2 commits June 1, 2026 17:04
Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>
Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>
@pzelasko
Collaborator Author

pzelasko commented Jun 2, 2026

/ok to build 16710b8

Comment thread tests/functional_tests/speechlm_automodel_compiled_deps_smoke.py Dismissed
@pzelasko
Collaborator Author

pzelasko commented Jun 2, 2026

/ok to test 16710b8

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>
@github-actions github-actions Bot removed the ASR label Jun 2, 2026
@pzelasko
Collaborator Author

pzelasko commented Jun 2, 2026

/ok to test 87e557e

@github-actions
Contributor

github-actions Bot commented Jun 2, 2026

[🤖]: Hi @pzelasko 👋,

We wanted to let you know that a CICD pipeline for this PR just finished successfully.

So it might be time to merge this PR or get some approvals.

@pzelasko pzelasko merged commit 5b7cfcd into main Jun 3, 2026
253 of 258 checks passed
@pzelasko pzelasko deleted the codex/automodel-compiled-deps branch June 3, 2026 00:38

4 participants