Skip to content

test(api-core): re-enable the three rack-handler tests against current behavior#2735

Merged
chet merged 1 commit into
NVIDIA:mainfrom
chet:gh-issue-2715
Jun 22, 2026
Merged

test(api-core): re-enable the three rack-handler tests against current behavior#2735
chet merged 1 commit into
NVIDIA:mainfrom
chet:gh-issue-2715

Conversation

@chet

@chet chet commented Jun 22, 2026

Copy link
Copy Markdown
Contributor

The three rack state-controller tests surfaced by #2694 were #[ignore]d because they assert an old state-machine vocabulary the handlers no longer use -- not because of any handler bug (confirmed: created.rs/discovering.rs are correct, corroborated by adjacent passing tests). They now pin current behavior and run again:

  • test_expected_incomplete_device_counts_stays → asserts Wait (was DoNothing).
  • test_discovering_waits_for_compute_ready → renamed test_discovering_waits_when_compute_not_ready, asserts Wait (was Err).
  • test_expected_more_discovered_than_expected_transitions → now seeds two hosts against the single-compute profile so over-discovery genuinely exceeds expected, keeping the Transition(Discovering) assertion; dropped the dead mac1 line + stale comment.

No production change. Verified: cargo test -p carbide-api-core --lib -- rack_state_controller::handler42 passed, 0 ignored; clippy --locked --all-targets --all-features + nightly rustfmt clean.

Addresses #2715.

…t behavior

Three rack state-controller tests were `#[ignore]`d because they asserted an old
state-machine vocabulary the handlers no longer use -- not because of any handler
bug. They now pin the handlers' actual, intended behavior and run again.

`created.rs` returns `Wait` (not `DoNothing`) while device counts are below the
profile's expectation and transitions once they meet or exceed it;
`discovering.rs` waits for compute to become ready rather than faulting on a
missing host. So `test_expected_incomplete_device_counts_stays` now asserts
`Wait`; `test_discovering_waits_for_compute_ready` becomes
`test_discovering_waits_when_compute_not_ready` and asserts `Wait`; and
`test_expected_more_discovered_than_expected_transitions` now seeds two hosts
against the single-compute profile so it genuinely exercises the over-discovery
transition it always claimed to test.

No production change -- the handlers were already correct, corroborated by the
adjacent passing tests. Addresses NVIDIA#2715.

Signed-off-by: Chet Nichols III <chetn@nvidia.com>
@chet

chet commented Jun 22, 2026

Copy link
Copy Markdown
Contributor Author

@coderabbitai PTAL, thanks!

@copy-pr-bot

copy-pr-bot Bot commented Jun 22, 2026

Copy link
Copy Markdown

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@coderabbitai

coderabbitai Bot commented Jun 22, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Important

Review skipped

No new commits to review since the last review.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 67322df3-ccf3-413b-b414-de2309713ebb

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review

Walkthrough

Three previously ignored rack state-controller handler tests are re-enabled by removing their #[ignore] attributes. Their assertions are updated to expect StateHandlerOutcome::Wait in place of prior error or stale outcomes, and the over-discovery test receives a reworked fixture that seeds two managed host records to correctly exceed the expected compute minimum.

Changes

Rack State-Controller Test Fixes

Layer / File(s) Summary
Re-enable incomplete device counts test
crates/api-core/src/tests/rack_state_controller/handler.rs
Removes #[ignore] from test_expected_incomplete_device_counts_stays and updates the assertion to StateHandlerOutcome::Wait { .. } with "wait in Created" messaging.
Re-enable over-discovery test with fixture rework
crates/api-core/src/tests/rack_state_controller/handler.rs
Removes #[ignore] from test_expected_more_discovered_than_expected_transitions, seeds two managed host fixtures under the "Single" rack profile to exceed the expected compute minimum, and removes a stale ordering-branch comment.
Rename and fix Discovering-state wait test
crates/api-core/src/tests/rack_state_controller/handler.rs
Renames the test to test_discovering_waits_when_compute_not_ready, removes #[ignore], clarifies the rack setup intent by documenting absence of Ready/Assigned compute states, and replaces the result.is_err() assertion with StateHandlerOutcome::Wait { .. }.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly related issues

Possibly related PRs

  • NVIDIA/infra-controller#2716: Modifies the same rack state-controller test file with analogous changes to #[ignore] removal and expected transition assertion updates.
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately describes the primary change: re-enabling three previously ignored rack-handler tests with updated behavior assertions.
Description check ✅ Passed The description comprehensively details the rationale, specific test changes, and validation results, clearly relating to the changeset.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai

coderabbitai Bot commented Jun 22, 2026

Copy link
Copy Markdown
Contributor

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== PR Details ==="
gh pr view 2735 --repo NVIDIA/infra-controller --json title,body,files,additions,deletions,state

echo ""
echo "=== PR Diff ==="
gh pr diff 2735 --repo NVIDIA/infra-controller

Length of output: 7570



@chet: The changes are clean and well-reasoned. Here is my assessment:

Overall: Approved — test-only, no production risk.


Per-test review

Test Assessment
test_expected_incomplete_device_counts_stays ✅ Straightforward DoNothing → Wait correction. Error message updated to match.
test_expected_more_discovered_than_expected_transitions ✅ Significant improvement. The old path called db_rack::update on a connection that had already gone out of meaningful scope and never actually seeded compute hosts — the test was asserting nothing real. Two new_host() calls now genuinely exercise the over-discovery threshold. The explicit drop(txn) before those calls is correct.
test_discovering_waits_when_compute_not_ready ✅ The rename is more precise. Switching from result.is_err() to .await? + matches!(outcome, Wait) is a tighter contract: if the handler faults, the test now fails with an error rather than silently passing as a false positive.

One minor observation

In test_discovering_waits_when_compute_not_ready, the assertion diagnostic uses:

"Discovering should wait when compute hosts are not yet ready, got {:?}",
std::mem::discriminant(&outcome)

std::mem::discriminant yields an opaque integer, not a variant name. If StateHandlerOutcome derives Debug, replacing this with {outcome:?} would give a more actionable failure message. Non-blocking — the test logic itself is correct.

[approve]

@chet

chet commented Jun 22, 2026

Copy link
Copy Markdown
Contributor Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 22, 2026

Copy link
Copy Markdown
Contributor
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@chet chet marked this pull request as ready for review June 22, 2026 06:35
@chet chet requested a review from a team as a code owner June 22, 2026 06:35
@chet

chet commented Jun 22, 2026

Copy link
Copy Markdown
Contributor Author

@coderabbitai full review

@coderabbitai

coderabbitai Bot commented Jun 22, 2026

Copy link
Copy Markdown
Contributor
✅ Action performed

Full review finished.

@chet

chet commented Jun 22, 2026

Copy link
Copy Markdown
Contributor Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 22, 2026

Copy link
Copy Markdown
Contributor
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@github-actions

Copy link
Copy Markdown

🔍 Container Scan Summary

Service Total Critical High Medium Low Other
boot-artifacts-aarch64 3 0 0 3 0 0
boot-artifacts-x86_64 3 0 0 3 0 0
forge-admin-cli-x86_64 264 6 23 99 6 130
machine-validation-runner 704 34 183 258 35 194
machine_validation 704 34 183 258 35 194
nvmetal-carbide 704 34 183 258 35 194
TOTAL 2382 108 572 879 111 712

Per-CVE detail lives in the per-service grype-* artifacts (JSON + SARIF). Severity counts only — no CVE IDs published here.

@chet chet merged commit 4acacdb into NVIDIA:main Jun 22, 2026
55 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants