Tighten the gate: add ASYNC/LOG/TID/etc. ruff rules + remaining mypy strict flags #153
…strict flags

The codebase already satisfies several ruff rule categories and the rest of mypy's --strict, so enabling them is forward-looking, zero-churn enforcement rather than a mass autofix:

- ruff select += ASYNC (blocking calls in the streaming/agent asyncio code), LOG/G (logging anti-patterns), DTZ (naive datetimes), FLY (static join -> f-string), ICN/SLOT (import/__slots__ hygiene), ISC (implicit string concat, the missing-comma-in-a-list bug; ISC001 left to the formatter), and TID with ban-relative-imports="all" so every intra-package import stays absolute and the import-linter contracts stay unambiguous.
- mypy src now runs full --strict except disallow_untyped_calls (jiwer ships no stubs, so wer.py's RemovePunctuation() call would force a net-new # type: ignore the escape-hatch gate rejects). The added flags catch incomplete defs, unchecked untyped bodies, untyped decorators, Any-subclassing, and stale config. Tests relax the untyped-body flags (mock plumbing) but keep the rest.

The full scripts/check.sh gate passes.
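To illustrate the kind of blocking call the ASYNC rules are aimed at, here is a minimal sketch. The worker and ticker functions are hypothetical and not taken from the codebase; the point is only that `time.sleep` inside a coroutine stalls every other task on the loop, while `await asyncio.sleep` yields.

```python
import asyncio
import time


async def blocking_worker(log: list[str]) -> None:
    log.append("worker start")
    time.sleep(0.05)  # blocks the whole event loop; nothing else can run
    log.append("worker end")


async def cooperative_worker(log: list[str]) -> None:
    log.append("worker start")
    await asyncio.sleep(0.05)  # yields control so other tasks can run
    log.append("worker end")


async def ticker(log: list[str]) -> None:
    log.append("tick")


async def run(worker) -> list[str]:
    # Schedule the worker first, then the ticker, and record the interleaving.
    log: list[str] = []
    await asyncio.gather(worker(log), ticker(log))
    return log


print(asyncio.run(run(blocking_worker)))     # ['worker start', 'worker end', 'tick']
print(asyncio.run(run(cooperative_worker)))  # ['worker start', 'tick', 'worker end']
```

In the blocking variant the ticker cannot run until the worker has finished, which is the starvation pattern a lint rule can catch before it reaches streaming code.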
… check

Modeled on denoland/deno's tools/lint.js, which (a) uses per-crate clippy.toml to fence raw std methods to designated crates and (b) fails if any test .out file is unreferenced.

- ruff banned-api (TID251) fences raw `subprocess` and `os.environ`/`os.getenv` to the modules that legitimately own them (allowlisted via per-file-ignores): process spawning goes through procs.py, env reads through the config/env layer. A new module reaching for either now trips the gate, so the spread is a visible, reviewable allowlist edit. os.putenv/os.unsetenv are banned outright (they desync os.environ). Tests and scripts/ are exempt; the AST matcher leaves the os.environ snippets inside the code_gen --show-code string templates alone.
- scripts/unused_fixtures_gate.py (wired into check.sh) fails on an orphaned .ambr snapshot (no matching test module) or an API fixture no test references. The unit suite runs under xdist, which disables syrupy's own unused-snapshot detection, so this closes that blind spot statically with no extra test run.

The full scripts/check.sh gate passes.
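The orphaned-snapshot half of that gate can be sketched as a purely static directory walk. This is not the contents of scripts/unused_fixtures_gate.py, which is not shown here; the function name `orphaned_snapshots` is hypothetical, and the sketch assumes syrupy's default layout, where `tests/__snapshots__/test_foo.ambr` belongs to `tests/test_foo.py`.

```python
from pathlib import Path


def orphaned_snapshots(tests_root: Path) -> list[Path]:
    """Return .ambr snapshot files that have no matching test module."""
    orphans: list[Path] = []
    for snap in sorted(tests_root.rglob("*.ambr")):
        if snap.parent.name != "__snapshots__":
            continue  # only consider files in syrupy-style snapshot directories
        # Assumed layout: <dir>/__snapshots__/test_foo.ambr <-> <dir>/test_foo.py
        module = snap.parent.parent / f"{snap.stem}.py"
        if not module.is_file():
            orphans.append(snap)
    return orphans
```

A gate script built on this would print each orphan and exit non-zero when the list is non-empty, so no test run is needed to detect a stale snapshot.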
- ruff T10 (flake8-debugger): a forgotten breakpoint()/pdb left in shipped code, the debugger counterpart to the T20 print ban. Zero current violations.
- codespell (Kubernetes' verify-spelling, generalized): spell-checks code, comments, and docs. Run via `uvx` in check.sh and as a pre-commit hook, so it needs no uv.lock entry; config + ignore-words in [tool.codespell].
- check-case-conflict + detect-private-key (pre-commit-hooks): cross-OS filename collisions (we ship a macOS bottle) and a literal-private-key guard (defense-in-depth alongside gitleaks).
- scripts/docs_consistency_gate.py (curl's "every option is documented", generalized): fails if REFERENCE.md/README.md drift from the code: every env var and exit code must be documented, every `assembly …` example must name a real command.
- scripts/docstring_coverage_gate.py: public-API docstring-coverage ratchet, an interrogate stand-in (interrogate can't parse the codebase's PEP 695 generics). Floored at the current 64% so it ratchets up without forcing a backfill.
- brew audit --strict on the shipped Formula/assembly.rb (Homebrew's own CI check), self-skipping where brew isn't installed.

Refleak double-run (CPython's regrtest -R) was evaluated and skipped: it needs a debug build's sys.gettotalrefcount; the achievable part (ResourceWarning) is already enforced via filterwarnings=error.

The full scripts/check.sh gate passes.
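A docstring-coverage ratchet of the kind described can be approximated with the standard `ast` module. The sketch below is an assumption about the general approach, not the code in scripts/docstring_coverage_gate.py; `docstring_coverage` and `meets_floor` are hypothetical names, and "public" is simplified to "name does not start with an underscore".

```python
import ast


def docstring_coverage(source: str) -> tuple[int, int]:
    """Return (documented, total) for public functions and classes in source."""
    documented = total = 0
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            continue
        if node.name.startswith("_"):
            continue  # simplified notion of "private"
        total += 1
        if ast.get_docstring(node) is not None:
            documented += 1
    return documented, total


def meets_floor(documented: int, total: int, floor_percent: float) -> bool:
    """True when coverage is at or above the floor (an empty module passes)."""
    return total == 0 or 100 * documented / total >= floor_percent


SAMPLE = '''
def documented():
    """Has a docstring."""

def undocumented():
    pass

def _private():
    pass

class Widget:
    """Documented class."""
    def method(self):
        pass
'''

print(docstring_coverage(SAMPLE))  # (2, 4)
```

Setting the floor to the measured value means existing gaps are tolerated while any new undocumented public name lowers the ratio and fails the gate.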
    for cmd, sub in _CMD_RE.findall(doc.read_text(encoding="utf-8")):
        if cmd not in top:
            errors.append(f"{doc.name}: `assembly {cmd}` names an unknown command")
        elif sub and cmd in groups and sub not in groups[cmd]:
`_command_ref_errors` accepts `assembly <leaf-command> <extra>` as valid because subcommand validation runs only for grouped commands, so invalid command examples can slip through this gate.
Suggested change:

    -     elif sub and cmd in groups and sub not in groups[cmd]:
    +     elif sub and cmd not in groups:
    +         errors.append(f"{doc.name}: `assembly {cmd} {sub}` names an unknown subcommand")
    +     elif sub and sub not in groups[cmd]:
✨ AI Reasoning
The command-reference check is intended to ensure documented command examples are real. However, when a documented example has two tokens, the logic only validates the second token if the first token is a command group. If the first token is a leaf command, the extra token is never treated as invalid. That means invalid examples can pass the gate even though they should be rejected by the script's own stated behavior.
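The difference the suggestion makes can be seen in a self-contained sketch. Several parts are assumptions: the real `_CMD_RE` is not shown in the review, so the pattern below is hypothetical; the function takes plain text instead of a file path; and the body of the final branch is assumed to append the same "unknown subcommand" message.

```python
import re

# Hypothetical pattern: matches "`assembly <cmd>" with an optional second token.
_CMD_RE = re.compile(r"`assembly ([a-z][\w-]*)(?: ([a-z][\w-]*))?")


def command_ref_errors(
    name: str, text: str, top: set[str], groups: dict[str, set[str]]
) -> list[str]:
    errors: list[str] = []
    for cmd, sub in _CMD_RE.findall(text):
        if cmd not in top:
            errors.append(f"{name}: `assembly {cmd}` names an unknown command")
        elif sub and cmd not in groups:
            # Leaf command followed by an extra token: previously accepted.
            errors.append(f"{name}: `assembly {cmd} {sub}` names an unknown subcommand")
        elif sub and sub not in groups[cmd]:
            errors.append(f"{name}: `assembly {cmd} {sub}` names an unknown subcommand")
    return errors


doc = "Run `assembly login`, `assembly config set`, `assembly login now`, `assembly bogus`."
for err in command_ref_errors("README.md", doc, {"login", "config"}, {"config": {"set", "get"}}):
    print(err)
```

One thing to weigh before adopting the change: with a pattern like this one, a positional argument after a leaf command would also be reported, so the outcome depends on how the real `_CMD_RE` tokenizes examples.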
Merging origin/main brought in tests/test_jsonshape.py, which builds a naive datetime as a deterministic fixture — legitimate in the test suite (TZ is pinned in conftest and time-machine controls the clock), but it tripped the new DTZ rule. Tests are already exempt from the other production-correctness lints (S101, PLR2004, …); add DTZ to that list rather than weaken the test or add an inline noqa the escape-hatch gate would reject. DTZ still guards aai_cli.
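For context on what DTZ protects in production code, a short sketch of the naive/aware distinction follows. The values are illustrative and unrelated to the fixture in tests/test_jsonshape.py.

```python
from datetime import datetime, timezone

# Naive: no tzinfo attached, the kind of construction DTZ flags.
naive = datetime(2024, 1, 1, 12, 0)
# Aware: the timezone is explicit, so the instant is unambiguous.
aware = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)

print(naive.tzinfo)       # None
print(aware.isoformat())  # 2024-01-01T12:00:00+00:00

try:
    naive < aware  # mixing the two kinds is an error at runtime
except TypeError as exc:
    print(f"TypeError: {exc}")
```

In a test suite with a pinned TZ and a controlled clock the naive value is deterministic, which is why exempting tests does not reintroduce the ambiguity the rule exists to prevent.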
The codebase already satisfies several ruff rule categories and the rest of
mypy's --strict, so enabling them is forward-looking, zero-churn enforcement
rather than a mass autofix:

- ruff select += ASYNC (blocking calls in the streaming/agent asyncio code),
  LOG/G (logging anti-patterns), DTZ (naive datetimes), FLY (static join ->
  f-string), ICN/SLOT (import/__slots__ hygiene), ISC (implicit string concat,
  the missing-comma-in-a-list bug; ISC001 left to the formatter), and TID with
  ban-relative-imports="all" so every intra-package import stays absolute and
  the import-linter contracts stay unambiguous.
- mypy src now runs full --strict except disallow_untyped_calls (jiwer ships no
  stubs, so wer.py's RemovePunctuation() call would force a net-new # type:
  ignore the escape-hatch gate rejects). The added flags catch incomplete defs,
  unchecked untyped bodies, untyped decorators, Any-subclassing, and stale
  config. Tests relax the untyped-body flags (mock plumbing) but keep the rest.

The full scripts/check.sh gate passes.