[https://nvbugs/6281014][fix] fix the repeated cute.compile and simplify the test #15331

Open
JadoTu wants to merge 2 commits into NVIDIA:main from JadoTu:bugfix_6281014

JadoTu commented Jun 13, 2026

Collaborator

Summary by CodeRabbit

  • Tests
    • Streamlined test suite by simplifying matrix parameters and configurations.
    • Updated GPU system test mappings for improved efficiency.
    • Removed deprecated test cases and associated waiver entries.

Description

  1. Enable CUDA graph padding to avoid repeated cute.compile calls.
  2. Delete some tests to speed up CI, since Qwen3-Next is not the main focus.
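
Why padding can reduce recompilation is easier to see with a toy model. The sketch below is not TensorRT-LLM code: `padded_batch_size`, `CompileCache`, and `count_compiles` are hypothetical names, and it assumes (an inference from the description above, not something the PR states) that a compile is triggered once per distinct batch size reaching the kernel. With padding, each batch size is rounded up to the nearest captured graph size, so far fewer distinct sizes are seen.

```python
import bisect


def padded_batch_size(batch_size, captured_sizes):
    """Round batch_size up to the nearest captured size (hypothetical helper).

    Sizes larger than every captured size are returned unchanged.
    """
    sizes = sorted(captured_sizes)
    idx = bisect.bisect_left(sizes, batch_size)
    return sizes[idx] if idx < len(sizes) else batch_size


class CompileCache:
    """Toy stand-in for a JIT compile cache keyed on batch size (assumption)."""

    def __init__(self):
        self.compiled = {}
        self.compile_calls = 0

    def get(self, shape_key):
        # A cache miss models one expensive compile call.
        if shape_key not in self.compiled:
            self.compile_calls += 1
            self.compiled[shape_key] = f"kernel_for_{shape_key}"
        return self.compiled[shape_key]


def count_compiles(batch_sizes, captured_sizes, enable_padding):
    """Count how many compiles a stream of batch sizes would trigger."""
    cache = CompileCache()
    for bs in batch_sizes:
        key = padded_batch_size(bs, captured_sizes) if enable_padding else bs
        cache.get(key)
    return cache.compile_calls


if __name__ == "__main__":
    seen = [1, 2, 3, 5, 6, 7, 9, 13]
    captured = [1, 2, 4, 8, 16]
    print(count_compiles(seen, captured, enable_padding=False))
    print(count_compiles(seen, captured, enable_padding=True))
```

In this toy run the eight distinct batch sizes collapse onto the five captured sizes once padding is on, so the compile count drops from 8 to 5. The real saving depends on how the runtime keys its compile cache.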

PR Checklist

Please review the following before submitting your PR:

  • PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.

  • PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.

  • Test cases are provided for new code paths (see test instructions)

  • If PR introduces API changes, an appropriate PR label is added - either api-compatible or api-breaking. For api-breaking, include BREAKING in the PR title.

  • Any new dependencies have been scanned for license and vulnerabilities

  • CODEOWNERS updated if ownership changes

  • Documentation updated as needed

  • Update tava architecture diagram if there is a significant design change in PR.

  • The reviewers assigned automatically/manually are appropriate for the PR.

  • Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

Signed-off-by: jiant <107457950+JadoTu@users.noreply.github.com>
JadoTu requested review from a team as code owners June 13, 2026 05:44

coderabbitai Bot commented Jun 13, 2026

Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: efbecc0b-18e9-4894-a2ba-dbf07240f4b2

📥 Commits

Reviewing files that changed from the base of the PR and between bb32597 and 9157ed3.

📒 Files selected for processing (6)
  • tests/integration/defs/accuracy/references/mmlu.yaml
  • tests/integration/defs/accuracy/test_llm_api_pytorch.py
  • tests/integration/test_lists/qa/llm_function_core.txt
  • tests/integration/test_lists/test-db/l0_gb200_multi_gpus.yml
  • tests/integration/test_lists/test-db/l0_rtx_pro_6000.yml
  • tests/integration/test_lists/waives.txt
💤 Files with no reviewable changes (5)
  • tests/integration/test_lists/test-db/l0_gb200_multi_gpus.yml
  • tests/integration/defs/accuracy/references/mmlu.yaml
  • tests/integration/test_lists/waives.txt
  • tests/integration/test_lists/test-db/l0_rtx_pro_6000.yml
  • tests/integration/test_lists/qa/llm_function_core.txt

📝 Walkthrough

The PR simplifies test coverage for TestQwen3NextInstruct::test_nvfp4 by reducing parametrized test variants to two configurations, adjusting CUDA graph padding, removing MMLU evaluation, dropping NVFP4 accuracy references, and narrowing CI test matrices to match the streamlined test.

Changes

Qwen3NextInstruct NVFP4 Test Simplification

Layer / File(s) | Summary
Test implementation and accuracy reference updates: tests/integration/defs/accuracy/test_llm_api_pytorch.py, tests/integration/defs/accuracy/references/mmlu.yaml | test_nvfp4 parametrization reduced to tp1_block_reuse and tp4ep4_adp_on variants; cuda_graph_config sets enable_padding=True; MMLU evaluation removed leaving GSM8K only; NVFP4 accuracy reference (85.08) removed from Qwen3 model data.
Test list and CI matrix updates: tests/integration/test_lists/qa/llm_function_core.txt, tests/integration/test_lists/test-db/l0_gb200_multi_gpus.yml, tests/integration/test_lists/test-db/l0_rtx_pro_6000.yml, tests/integration/test_lists/waives.txt | Removed obsolete test_nvfp4 variants from QA allowlist, GB200 pre-merge matrix, and RTX 6000 post-merge tests; dropped waiver for tp1-cutlass variant to align with simplified test coverage.
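
The shape of the simplified test matrix summarized above can be sketched as plain data. This is an illustration only, not the contents of test_llm_api_pytorch.py. The `Variant` class, `build_run_config`, and every dictionary key are hypothetical. The parallelism and feature values are guesses decoded from the variant ids (tp1_block_reuse, tp4ep4_adp_on). Only the two ids, `enable_padding=True`, and the GSM8K-only evaluation come from the walkthrough.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """One parametrized case; field meanings are assumptions read off the id."""
    test_id: str
    tp_size: int
    ep_size: int
    attention_dp: bool
    block_reuse: bool


# The two variants the walkthrough says remain after the simplification.
VARIANTS = [
    Variant("tp1_block_reuse", tp_size=1, ep_size=1,
            attention_dp=False, block_reuse=True),
    Variant("tp4ep4_adp_on", tp_size=4, ep_size=4,
            attention_dp=True, block_reuse=False),
]


def build_run_config(variant):
    """Assemble a hypothetical run configuration for one variant."""
    return {
        "tensor_parallel_size": variant.tp_size,
        "expert_parallel_size": variant.ep_size,
        "attention_dp": variant.attention_dp,
        "block_reuse": variant.block_reuse,
        # Padding is on for every variant, per the walkthrough.
        "cuda_graph_config": {"enable_padding": True},
        # MMLU was dropped; only GSM8K is evaluated.
        "tasks": ["GSM8K"],
    }


if __name__ == "__main__":
    for v in VARIANTS:
        print(v.test_id, build_run_config(v))
```

Keeping the variants as a small explicit list makes it easy to check that the CI test lists reference only ids that still exist.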

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Possibly related PRs

  • NVIDIA/TensorRT-LLM#15061: Both PRs modify tests/integration/test_lists/waives.txt by removing specific TestQwen3NextInstruct::test_nvfp4 waiver entries.
  • NVIDIA/TensorRT-LLM#14278: Both PRs directly affect GSM8K evaluation in integration tests; this PR removes MMLU and keeps only GSM8K, while the related PR adjusts the parse_gsm8k_output accuracy extraction.

Suggested reviewers

  • tburt-nv
  • xinhe-nv
  • nv-guomingz
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name | Status | Explanation | Resolution
Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold.
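
For readers unfamiliar with this kind of check, a docstring-coverage figure can be approximated as the share of function and class definitions that carry a docstring. The sketch below uses Python's standard `ast` module. How CodeRabbit actually computes its number is not stated here, so `docstring_coverage` and the choice of which definitions to count are assumptions.

```python
import ast


def docstring_coverage(source):
    """Return the percentage of functions and classes in source with a docstring.

    Returns 100.0 when there is nothing to document.
    """
    tree = ast.parse(source)
    definitions = [
        node for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
    ]
    if not definitions:
        return 100.0
    documented = sum(
        1 for node in definitions if ast.get_docstring(node) is not None
    )
    return 100.0 * documented / len(definitions)


if __name__ == "__main__":
    sample = (
        "def documented():\n"
        "    \"\"\"Has a docstring.\"\"\"\n"
        "    return 1\n"
        "\n"
        "def undocumented():\n"
        "    return 2\n"
    )
    coverage = docstring_coverage(sample)
    print(f"{coverage:.2f}% (threshold 80.00%)")
```

Under this definition, a test file whose functions have no docstrings scores 0.00%, which matches the warning reported for this PR.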
✅ Passed checks (4 passed)
Check name | Status | Explanation
Title check | ✅ Passed | The title directly addresses the main changes: fixing repeated cute.compile through CUDA graph padding and simplifying tests by removing Qwen3-Next variants.
Description check | ✅ Passed | The description explains the two main reasons for changes (CUDA graph padding, test reduction) and confirms the PR checklist, covering the essential context needed for understanding the PR.
Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request.

Signed-off-by: Jian Tu <107457950+JadoTu@users.noreply.github.com>

JadoTu commented Jun 13, 2026

Collaborator Author

/bot run --disable-fail-fast

@tensorrt-cicd

Collaborator

PR_Github #54030 [ run ] triggered by Bot. Commit: c6158a8 Link to invocation

@tensorrt-cicd

Collaborator

PR_Github #54030 [ run ] completed with state SUCCESS. Commit: c6158a8
/LLM/main/L0_MergeRequest_PR pipeline #43114 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation


JadoTu commented Jun 13, 2026

Collaborator Author

/bot run

@tensorrt-cicd

Collaborator

PR_Github #54062 [ run ] triggered by Bot. Commit: c6158a8 Link to invocation

@tensorrt-cicd

Collaborator

PR_Github #54062 [ run ] completed with state FAILURE. Commit: c6158a8
/LLM/main/L0_MergeRequest_PR pipeline #43146 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

2 participants