Skip to content

ci: run strict-no-cover in scripts/test to catch stale pragmas locally#2305

Merged
maxisbey merged 3 commits intomainfrom
fix/local-strict-no-cover
Mar 17, 2026
Merged

ci: run strict-no-cover in scripts/test to catch stale pragmas locally#2305
maxisbey merged 3 commits intomainfrom
fix/local-strict-no-cover

Conversation

@maxisbey
Copy link
Contributor

@maxisbey maxisbey commented Mar 17, 2026

Adds strict-no-cover as a final step in ./scripts/test so stale # pragma: no cover annotations are caught locally before pushing. Updates CLAUDE.md with accurate testing guidance and documents the three coverage-pragma types.

Motivation and Context

PR #2302 passed ./scripts/test (100% coverage, all green) but failed CI in two ways:

  1. strict-no-cover flagged lines 61–63 of src/mcp/server/stdio.py — the new test executed them, but they still carried # pragma: no cover. coverage report saw 100% because the pragma excludes the line; strict-no-cover saw that a "never covered" line was in fact executed.
  2. Python 3.14 Windows branch coverage — a nested async with in the test produced a missing ->exit arc, a known coverage.py quirk.

CLAUDE.md claimed ./scripts/test "matches CI exactly." It didn't: CI runs strict-no-cover as a separate step after coverage report (Linux only), and CI runs the full 3.10–3.14 × {ubuntu, windows} matrix. The first gap is closed here; the second is documented.

CI does not invoke scripts/test — it inlines the coverage commands in shared.yml — so this change is purely local and does not run strict-no-cover twice.

How Has This Been Tested?

Reproduced the #2302 failure mode: added # pragma: no cover to a line that tests/server/test_stdio.py executes, then ran both paths.

Full path (./scripts/test):

  • Before: coverage report → 100%, exit 0. Stale pragma undetected.
  • After: coverage report → 100%, then strict-no-cover❎ 1 lines wrongly marked, exit 1. ~21s wall time (vs ~20s before).

Fast targeted path (documented in CLAUDE.md):

uv run --frozen coverage erase
uv run --frozen coverage run -m pytest tests/server/test_stdio.py
uv run --frozen coverage combine
uv run --frozen coverage report --include='src/mcp/server/stdio.py' --fail-under=0
UV_FROZEN=1 uv run --frozen strict-no-cover

~4s, catches the stale pragma, no false positives on partial runs (strict-no-cover only reports excluded ∩ executed — fewer executed lines just means fewer chances to catch, never a wrong flag).

uv.lock pollution fix: strict-no-cover internally spawns subprocess.run(['uv', 'run', 'coverage', 'json', ...]) — no --frozen. On machines with registry overrides this rewrites uv.lock. Setting UV_FROZEN=1 propagates to the child process. Verified uv.lock stays clean through a full run.

Breaking Changes

None. ./scripts/test takes ~1s longer and exits nonzero when a stale pragma is found.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • Documentation update

Checklist

  • I have read the MCP Documentation
  • My code follows the repository's style guidelines
  • New and existing tests pass locally
  • I have added appropriate error handling
  • I have added or updated documentation as needed

Additional context

The # pragma: no branch quirk for nested async with is documented as what-it-is, not what-to-do-about-it; the existing ~100 annotations in the tree are the pattern reference.

PR #2302 passed ./scripts/test (100% coverage) but failed CI when
strict-no-cover found lines marked '# pragma: no cover' that the new
test actually executed. The claim in CLAUDE.md that ./scripts/test
'matches CI exactly' was false.

Adds strict-no-cover as a final step in scripts/test. The tool
internally spawns 'uv run coverage json' without --frozen, which
rewrites uv.lock on machines with registry overrides; UV_FROZEN=1
propagates to that subprocess.

Also converts session.py:426 from 'no cover' to 'lax no cover'. The
except-handler during connection cleanup is a genuine race (stream may
or may not already be closed depending on timing), so it is
nondeterministically covered. strict-no-cover would have flagged it
intermittently on high-core machines.

CLAUDE.md now documents the fast targeted path (single test file +
strict-no-cover, ~4s, no false positives on partial runs), the three
pragma types, and the known coverage.py ->exit arc quirk with nested
async-with that only the CI matrix catches.
@maxisbey maxisbey marked this pull request as draft March 17, 2026 18:19
The session.py:426 flag appeared exactly once across 10 runs, and that
one run was the only one with a confounding test edit in place. CI has
never flagged this line; 8 clean local runs never flagged it either.
The original pragma was correct.
Copy link

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM — straightforward CI/DX improvement with documentation updates and a single pragma annotation change.

Extended reasoning...

Overview

This PR adds strict-no-cover as a final step in scripts/test, updates CLAUDE.md with accurate testing guidance and coverage-pragma documentation, and changes one pragma in src/mcp/shared/session.py from # pragma: no cover to # pragma: lax no cover. The session.py change is purely an annotation change (no logic modification) to prevent flaky failures from a nondeterministically-covered exception handler.

Security risks

None. The changes are limited to a shell script (CI tooling), a documentation file, and a comment annotation. No production code paths, authentication, or data handling are affected.

Level of scrutiny

Low scrutiny warranted. This is a developer-experience improvement: the script change adds one line to run an existing tool, the documentation update is comprehensive and accurate, and the pragma change is well-justified in the PR description (race condition in subprocess coverage on high-core machines).

Other factors

The two bug reports found are both minor: coverage.json missing from .gitignore is a nit, and the unquoted $@ is pre-existing. Neither affects the correctness of this PR. The PR description is thorough, explaining the motivation (CI/local parity gap from PR #2302), testing methodology, and the UV_FROZEN=1 rationale for preventing uv.lock pollution.

Drop the subprocess-flake bullet and the 'add this pragma' prescription
for no-branch. Scope lax-no-cover to platform-specific coverage only.
@maxisbey maxisbey marked this pull request as ready for review March 17, 2026 19:26
Copy link

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM — straightforward CI script addition and documentation update.

Extended reasoning...

Overview

This PR adds a single line to scripts/test (UV_FROZEN=1 uv run --frozen strict-no-cover) and updates CLAUDE.md with accurate testing guidance including coverage pragma documentation. Two files changed, both non-production.

Security risks

None. The changes are to a local developer test script and a documentation file. No production code, auth, or data handling is affected.

Level of scrutiny

Low scrutiny appropriate. This is a developer tooling improvement — adding a linting step to a local test script and correcting documentation that previously overclaimed CI parity. The UV_FROZEN=1 environment variable propagation is a thoughtful detail to prevent uv.lock mutation.

Other factors

My two previous inline comments (missing coverage.json in .gitignore and unquoted $@) are both marked resolved. The new nit about the CI matrix description omitting the lowest-direct dimension is a minor documentation accuracy issue — the text already correctly states the script is "Not identical to CI," which is the important message. The PR is well-motivated by a concrete CI failure (PR #2302) and thoroughly tested as described in the PR body.

Comment on lines +32 to +33
- Full check: `./scripts/test` (~23s). Runs coverage + `strict-no-cover` on the
default Python. Not identical to CI: CI also runs 3.10–3.14 × {ubuntu, windows},
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 The CI matrix description says "3.10–14 × {ubuntu, windows}" (10 entries), but .github/workflows/shared.yml defines a three-dimensional matrix: python-version × dep-resolution × os (5×2×2 = 20 entries), omitting the {locked, lowest-direct} dimension. Consider updating to "CI also runs 3.10–14 × {ubuntu, windows} × {locked, lowest-direct}" — the lowest-direct strategy is especially worth mentioning since those failures cannot be reproduced locally with --frozen.

Extended reasoning...

What the bug is

CLAUDE.md lines 32-33 state:

Not identical to CI: CI also runs 3.10–14 × {ubuntu, windows},
and some branch-coverage quirks only surface on specific matrix entries.

This implies the CI matrix has 5 Python versions × 2 OSes = 10 entries. However, the actual CI matrix in .github/workflows/shared.yml (lines 49-56) defines three dimensions:

matrix:
  python-version: ["3.10", "3.11", "3.12", "3.13", "3.14"]
  dep-resolution:
    - name: lowest-direct
      install-flags: "--upgrade --resolution lowest-direct"
    - name: locked
      install-flags: "--frozen"
  os: [ubuntu-latest, windows-latest]

That is 5 × 2 × 2 = 20 matrix entries, not 10.

Why the omission matters

The lowest-direct resolution strategy (--upgrade --resolution lowest-direct) tests compatibility with minimum acceptable dependency versions. This is a meaningfully different CI dimension because:

  1. A dependency feature used by MCP might not exist in the lowest-direct resolved version, causing a CI failure that cannot be reproduced locally where --frozen uses the lockfile (latest) versions.
  2. A developer seeing failures only on lowest-direct matrix entries would have no context from CLAUDE.md to understand why — the documentation only mentions Python versions and OS as CI dimensions.

Step-by-step proof

  1. Developer reads CLAUDE.md line 32-33: "CI also runs 3.10–14 × {ubuntu, windows}".
  2. Developer infers the CI matrix is 10 entries varying only by Python version and OS.
  3. Developer pushes a change that uses a dependency feature added in a recent version.
  4. CI fails on lowest-direct matrix entries where the older dependency version lacks that feature.
  5. Developer is confused because the failure does not correlate with any Python version or OS — the two dimensions CLAUDE.md told them about.

Impact

This is a minor documentation inaccuracy. The text already represents a significant improvement over the previous claim that ./scripts/test "matches CI exactly," and the sentence does correctly state that scripts/test is "Not identical to CI." The missing dimension is unlikely to cause real confusion in most cases, but including it would make the documentation precisely accurate and help developers diagnose lowest-direct-specific CI failures.

Suggested fix

Change line 33 from:

CI also runs 3.10–14 × {ubuntu, windows},

to:

CI also runs 3.10–14 × {ubuntu, windows} × {locked, lowest-direct},

@maxisbey maxisbey merged commit ff50351 into main Mar 17, 2026
34 checks passed
@maxisbey maxisbey deleted the fix/local-strict-no-cover branch March 17, 2026 19:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants