ci: run strict-no-cover in scripts/test to catch stale pragmas locally by maxisbey · Pull Request #2305 · modelcontextprotocol/python-sdk

maxisbey · 2026-03-17T18:17:10Z

Adds strict-no-cover as a final step in ./scripts/test so stale # pragma: no cover annotations are caught locally before pushing. Updates CLAUDE.md with accurate testing guidance and documents the three coverage-pragma types.

Motivation and Context

PR #2302 passed ./scripts/test (100% coverage, all green) but failed CI in two ways:

strict-no-cover flagged lines 61–63 of src/mcp/server/stdio.py — the new test executed them, but they still carried # pragma: no cover. coverage report saw 100% because the pragma excludes the line; strict-no-cover saw that a "never covered" line was in fact executed.
Python 3.14 Windows branch coverage — a nested async with in the test produced a missing ->exit arc, a known coverage.py quirk.

CLAUDE.md claimed ./scripts/test "matches CI exactly." It didn't: CI runs strict-no-cover as a separate step after coverage report (Linux only), and CI runs the full 3.10–3.14 × {ubuntu, windows} matrix. The first gap is closed here; the second is documented.

CI does not invoke scripts/test — it inlines the coverage commands in shared.yml — so this change is purely local and does not run strict-no-cover twice.

How Has This Been Tested?

Reproduced the #2302 failure mode: added # pragma: no cover to a line that tests/server/test_stdio.py executes, then ran both paths.

Full path (./scripts/test):

Before: coverage report → 100%, exit 0. Stale pragma undetected.
After: coverage report → 100%, then strict-no-cover → ❎ 1 lines wrongly marked, exit 1. ~21s wall time (vs ~20s before).

Fast targeted path (documented in CLAUDE.md):

uv run --frozen coverage erase
uv run --frozen coverage run -m pytest tests/server/test_stdio.py
uv run --frozen coverage combine
uv run --frozen coverage report --include='src/mcp/server/stdio.py' --fail-under=0
UV_FROZEN=1 uv run --frozen strict-no-cover

~4s, catches the stale pragma, no false positives on partial runs (strict-no-cover only reports excluded ∩ executed — fewer executed lines just means fewer chances to catch, never a wrong flag).

uv.lock pollution fix: strict-no-cover internally spawns subprocess.run(['uv', 'run', 'coverage', 'json', ...]) — no --frozen. On machines with registry overrides this rewrites uv.lock. Setting UV_FROZEN=1 propagates to the child process. Verified uv.lock stays clean through a full run.

Breaking Changes

None. ./scripts/test takes ~1s longer and exits nonzero when a stale pragma is found.

Types of changes

Bug fix (non-breaking change which fixes an issue)
Documentation update

Checklist

I have read the MCP Documentation
My code follows the repository's style guidelines
New and existing tests pass locally
I have added appropriate error handling
I have added or updated documentation as needed

Additional context

The # pragma: no branch quirk for nested async with is documented as what-it-is, not what-to-do-about-it; the existing ~100 annotations in the tree are the pattern reference.

PR #2302 passed ./scripts/test (100% coverage) but failed CI when strict-no-cover found lines marked '# pragma: no cover' that the new test actually executed. The claim in CLAUDE.md that ./scripts/test 'matches CI exactly' was false. Adds strict-no-cover as a final step in scripts/test. The tool internally spawns 'uv run coverage json' without --frozen, which rewrites uv.lock on machines with registry overrides; UV_FROZEN=1 propagates to that subprocess. Also converts session.py:426 from 'no cover' to 'lax no cover'. The except-handler during connection cleanup is a genuine race (stream may or may not already be closed depending on timing), so it is nondeterministically covered. strict-no-cover would have flagged it intermittently on high-core machines. CLAUDE.md now documents the fast targeted path (single test file + strict-no-cover, ~4s, no false positives on partial runs), the three pragma types, and the known coverage.py ->exit arc quirk with nested async-with that only the CI matrix catches.

The session.py:426 flag appeared exactly once across 10 runs, and that one run was the only one with a confounding test edit in place. CI has never flagged this line; 8 clean local runs never flagged it either. The original pragma was correct.

claude

LGTM — straightforward CI/DX improvement with documentation updates and a single pragma annotation change.

Extended reasoning...

Overview

This PR adds strict-no-cover as a final step in scripts/test, updates CLAUDE.md with accurate testing guidance and coverage-pragma documentation, and changes one pragma in src/mcp/shared/session.py from # pragma: no cover to # pragma: lax no cover. The session.py change is purely an annotation change (no logic modification) to prevent flaky failures from a nondeterministically-covered exception handler.

Security risks

None. The changes are limited to a shell script (CI tooling), a documentation file, and a comment annotation. No production code paths, authentication, or data handling are affected.

Level of scrutiny

Low scrutiny warranted. This is a developer-experience improvement: the script change adds one line to run an existing tool, the documentation update is comprehensive and accurate, and the pragma change is well-justified in the PR description (race condition in subprocess coverage on high-core machines).

Other factors

The two bug reports found are both minor: coverage.json missing from .gitignore is a nit, and the unquoted $@ is pre-existing. Neither affects the correctness of this PR. The PR description is thorough, explaining the motivation (CI/local parity gap from PR #2302), testing methodology, and the UV_FROZEN=1 rationale for preventing uv.lock pollution.

scripts/test

Drop the subprocess-flake bullet and the 'add this pragma' prescription for no-branch. Scope lax-no-cover to platform-specific coverage only.

claude

LGTM — straightforward CI script addition and documentation update.

Extended reasoning...

Overview

This PR adds a single line to scripts/test (UV_FROZEN=1 uv run --frozen strict-no-cover) and updates CLAUDE.md with accurate testing guidance including coverage pragma documentation. Two files changed, both non-production.

Security risks

None. The changes are to a local developer test script and a documentation file. No production code, auth, or data handling is affected.

Level of scrutiny

Low scrutiny appropriate. This is a developer tooling improvement — adding a linting step to a local test script and correcting documentation that previously overclaimed CI parity. The UV_FROZEN=1 environment variable propagation is a thoughtful detail to prevent uv.lock mutation.

Other factors

My two previous inline comments (missing coverage.json in .gitignore and unquoted $@) are both marked resolved. The new nit about the CI matrix description omitting the lowest-direct dimension is a minor documentation accuracy issue — the text already correctly states the script is "Not identical to CI," which is the important message. The PR is well-motivated by a concrete CI failure (PR #2302) and thoroughly tested as described in the PR body.

claude · 2026-03-17T19:45:52Z

CLAUDE.md

+     - Full check: `./scripts/test` (~23s). Runs coverage + `strict-no-cover` on the
+       default Python. Not identical to CI: CI also runs 3.10–3.14 × {ubuntu, windows},


🟡 The CI matrix description says "3.10–14 × {ubuntu, windows}" (10 entries), but .github/workflows/shared.yml defines a three-dimensional matrix: python-version × dep-resolution × os (5×2×2 = 20 entries), omitting the {locked, lowest-direct} dimension. Consider updating to "CI also runs 3.10–14 × {ubuntu, windows} × {locked, lowest-direct}" — the lowest-direct strategy is especially worth mentioning since those failures cannot be reproduced locally with --frozen.

Extended reasoning...

What the bug is

CLAUDE.md lines 32-33 state:

Not identical to CI: CI also runs 3.10–14 × {ubuntu, windows},
and some branch-coverage quirks only surface on specific matrix entries.

This implies the CI matrix has 5 Python versions × 2 OSes = 10 entries. However, the actual CI matrix in .github/workflows/shared.yml (lines 49-56) defines three dimensions:

matrix: python-version: ["3.10", "3.11", "3.12", "3.13", "3.14"] dep-resolution: - name: lowest-direct install-flags: "--upgrade --resolution lowest-direct" - name: locked install-flags: "--frozen" os: [ubuntu-latest, windows-latest]

That is 5 × 2 × 2 = 20 matrix entries, not 10.

Why the omission matters

The lowest-direct resolution strategy (--upgrade --resolution lowest-direct) tests compatibility with minimum acceptable dependency versions. This is a meaningfully different CI dimension because:

A dependency feature used by MCP might not exist in the lowest-direct resolved version, causing a CI failure that cannot be reproduced locally where --frozen uses the lockfile (latest) versions.

A developer seeing failures only on lowest-direct matrix entries would have no context from CLAUDE.md to understand why — the documentation only mentions Python versions and OS as CI dimensions.

Step-by-step proof

Developer reads CLAUDE.md line 32-33: "CI also runs 3.10–14 × {ubuntu, windows}".

Developer infers the CI matrix is 10 entries varying only by Python version and OS.

Developer pushes a change that uses a dependency feature added in a recent version.

CI fails on lowest-direct matrix entries where the older dependency version lacks that feature.

Developer is confused because the failure does not correlate with any Python version or OS — the two dimensions CLAUDE.md told them about.

Impact

This is a minor documentation inaccuracy. The text already represents a significant improvement over the previous claim that ./scripts/test "matches CI exactly," and the sentence does correctly state that scripts/test is "Not identical to CI." The missing dimension is unlikely to cause real confusion in most cases, but including it would make the documentation precisely accurate and help developers diagnose lowest-direct-specific CI failures.

Suggested fix

Change line 33 from:

CI also runs 3.10–14 × {ubuntu, windows},

to:

CI also runs 3.10–14 × {ubuntu, windows} × {locked, lowest-direct},

maxisbey marked this pull request as draft March 17, 2026 18:19

claude bot reviewed Mar 17, 2026

View reviewed changes

scripts/test Show resolved Hide resolved

scripts/test Show resolved Hide resolved

docs: tighten coverage-pragma guidance

daf837c

Drop the subprocess-flake bullet and the 'add this pragma' prescription for no-branch. Scope lax-no-cover to platform-specific coverage only.

maxisbey marked this pull request as ready for review March 17, 2026 19:26

felixweinberger approved these changes Mar 17, 2026

View reviewed changes

claude bot reviewed Mar 17, 2026

View reviewed changes

maxisbey merged commit ff50351 into main Mar 17, 2026
34 checks passed

maxisbey deleted the fix/local-strict-no-cover branch March 17, 2026 19:53

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ci: run strict-no-cover in scripts/test to catch stale pragmas locally#2305

ci: run strict-no-cover in scripts/test to catch stale pragmas locally#2305
maxisbey merged 3 commits intomainfrom
fix/local-strict-no-cover

maxisbey commented Mar 17, 2026 •

edited

Loading

Uh oh!

claude bot left a comment

Uh oh!

Uh oh!

Uh oh!

claude bot left a comment

Uh oh!

claude bot Mar 17, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

		- Full check: `./scripts/test` (~23s). Runs coverage + `strict-no-cover` on the
		default Python. Not identical to CI: CI also runs 3.10–3.14 × {ubuntu, windows},

Conversation

maxisbey commented Mar 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Motivation and Context

How Has This Been Tested?

Breaking Changes

Types of changes

Checklist

Additional context

Uh oh!

claude bot left a comment

Choose a reason for hiding this comment

Overview

Security risks

Level of scrutiny

Other factors

Uh oh!

Uh oh!

Uh oh!

claude bot left a comment

Choose a reason for hiding this comment

Overview

Security risks

Level of scrutiny

Other factors

Uh oh!

claude bot Mar 17, 2026

Choose a reason for hiding this comment

What the bug is

Why the omission matters

Step-by-step proof

Impact

Suggested fix

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

maxisbey commented Mar 17, 2026 •

edited

Loading