docs: extend test-anchoring to the capability matrix + file residual issues by hyperpolymath · Pull Request #625 · hyperpolymath/affinescript

hyperpolymath · 2026-06-21T13:03:08Z

Follow-up to #622, doing the two things offered there.

1. Test-anchor the feature matrix (not just soundness)

#622 made docs/SOUNDNESS.adoc test-anchored + CI-gated. This applies the same discipline to the feature-readiness matrix:

docs/CAPABILITY-MATRIX.adoc → new == Test anchors section mapping each load-bearing component (lexer, parser/AST/name-resolution, type checker, effects, borrow checker, QTT/affine, interpreter, traits incl. tracking: CORE-04 traits residual — associated-type substitution, where-clause supertraits, coherence checking #559 coherence, stdlib, typed-wasm, codegen golden, JS-host backends, Solo core) to the test(s) that exercise it.
tools/check-capability-anchors.sh (wired into just guard + CI): fails the build if that section disappears or if any test path the matrix names goes missing — so a works/enforced row can't outlive the test that backs it. Verified it bites (rc=1 on a planted missing anchor) and then reverted.

Each gate now owns one doc: check-soundness-ledger.sh → the soundness ledger, check-capability-anchors.sh → the feature matrix. Same anti-rot principle, symmetric.

2. File the residuals as tracked issues

The "Still open" items in the ledger were prose; now they're tracked:

tracking: interpreter non-tail single-shot resume returns the resumed value (shallow continuation) — needs delimited-continuation/CPS rewrite #623 — interpreter non-tail single-shot resume returns the resumed value (shallow continuation; needs a CPS rewrite of eval, blocked on OCaml 4.14 / jsoo — owner-steer).
tracking: Lean and Why3 (experimental) backends drop return control flow — early returns silently mis-emitted #624 — Lean/Why3 experimental backends drop return control flow (ExprReturn (Some e) -> gen_expr e).
tracking: CORE-04 traits residual — associated-type substitution, where-clause supertraits, coherence checking #559 (already open, the coherence tracker) — updated by comment, not duplicated: concrete-overlap coherence landed in docs: make soundness status a single, test-anchored source of truth #622; residual narrowed to generic-subsumption overlap.

docs/SOUNDNESS.adoc is updated to cite #623/#624 on the residual rows, and its metatheory note now reflects the grown Coq formal/ track (#620/#621: K-1, K-1-let, F-1) rather than just the original #620 seed — keeping the ledger itself fresh.

Verification

All four guard gates green (check-no-extension-ts, check-doc-truthing, check-soundness-ledger, check-capability-anchors). No code touched — docs + one shell gate + justfile/CI wiring.

🤖 Generated with Claude Code

https://claude.ai/code/session_01BbxKhXQwTvVgkYDgBMLJoa

Generated by Claude Code

…issues Follow-up to #622. Applies the same "claims bound to executable tests, gated in CI" discipline to the *feature*-readiness matrix (not just the soundness ledger), and wires in the now-tracked residuals. - docs/CAPABILITY-MATRIX.adoc: add a "== Test anchors" section mapping each load-bearing component (lexer, parser, type checker, effects, borrow checker, QTT, interpreter, traits, stdlib, typed-wasm, codegen, JS-host backends, Solo core) to the test(s) that exercise it. - tools/check-capability-anchors.sh (wired into `just guard` + CI): fails the build if the Test-anchors section disappears or if any test path the matrix names goes missing — so a "works"/"enforced" row cannot outlive its test. Verified it bites (non-zero on a missing anchor). - docs/SOUNDNESS.adoc: wire the residual rows to their new tracking issues (#623 interp non-tail resume, #624 Lean/Why3 return-drop) and refresh the metatheory note to reflect the grown Coq formal/ track (#620/#621: K-1, K-1-let, F-1). Residuals filed as tracked issues so they aren't just prose: - #623 — interpreter non-tail single-shot resume (shallow continuation; needs CPS rewrite, owner-steer). - #624 — Lean/Why3 experimental backends drop `return` control flow. - #559 (already open) updated by comment: concrete-overlap coherence landed; residual narrowed to generic-subsumption overlap. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01BbxKhXQwTvVgkYDgBMLJoa

hyperpolymath · 2026-06-21T13:14:50Z

Closing in favour of a single combined PR (owner's call: fold the capability-anchors work together with the soundness-ledger hardening into one change).

The hardening PR will carry all of this PR's content — the == Test anchors section + tools/check-capability-anchors.sh — plus the three new gate properties (content-binding manifest, stamp-enforcement, table-completeness/pin-liveness via a real xfail harness) and the reconciled ledger rewrite. Nothing here is dropped; it's superseded by the bigger change so the ledger rewrite and the gate land atomically (the rewritten ledger claims enforcement that must exist in the same commit).

Superseding PR to follow.

Generated by Claude Code

…ility anchors Fold of the capability-matrix test-anchoring (closed #625) and the soundness-gate hardening into one change. Makes docs/SOUNDNESS.adoc's claims true by building the mechanism the prose-ahead ledger already promises. Gate tools/check-soundness-ledger.sh: 2 -> 5 enforced properties (anchors exist, back-links, content-binding via tools/soundness-anchors.sha256 + --reseal, stamp-enforcement, xfail pin-liveness). New xfail harness test/xfail/, new fixture stub_backend_return_dropped.affine, capability-matrix Test-anchors section + tools/check-capability-anchors.sh. CI build job fetch-depth: 0 for the stamp diff. Ground-truth corrections (compiler wins): #559 generic-subsumption IS detected (positive test); stub-return uses #624; interp non-tail uses #623. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01BbxKhXQwTvVgkYDgBMLJoa

… xfail pin-liveness (#631) Makes `docs/SOUNDNESS.adoc` keep every promise it makes. The ledger on `main` is "prose ahead of mechanism" (it claims content-binding / stamp-enforcement / pinned xfails, but the gate enforced only 2 of those). This builds the missing mechanism, and folds in the closed-#625 capability-matrix anchoring. ## The five properties (each maps to a function in the gate) | # | Property | Function | Provenance | |---|----------|----------|-----------| | 1 | Anchors exist | `check_anchors_exist` | Jonathan's #622 design (kept) | | 2 | Back-links | `check_backlinks` | Jonathan's #622 design (kept) | | 3 | **Content-binding** | `check_content_binding` + `tools/soundness-anchors.sha256` + `--reseal` | **new** | | 4 | **Stamp-enforcement** | `check_stamp` | **new** | | 5 | **Pin-liveness (xfail)** | `check_pins` + `test/xfail/test_xfail_pins.ml` | **new** | `## What this gate enforces` is documented at the top of the script. Everything **fails closed**. ## Ground-truth correction (compiler wins) Running the compiler showed **#559 generic-subsumption is already detected/rejected** (`impl[T] Greet for Box[T]` vs `impl Greet for Box[Int]` → "Trait coherence violation"). So the ledger's `open (tracked)` "not yet detected" was stale **in the dangerous direction**. Corrected to `fixed` with a positive test; the stale `test_e2e.ml` comment fixed. → one fewer xfail pin than the spec assumed. Also: the stub-return row uses **#624** (the real tracker); #560 is *variable-string wasm ops*, unrelated — this change supplies the pin #628 couldn't (the fixture/test now exist). Stamp re-pointed to `dd6c19e` (a real main-ancestor; the old `d55e22c` was squash-orphaned). Metatheory note updated for the new `formal/` proofs (#620–#627). ## Self-tests — each new check watched failing ``` SELF-TEST 1 — Property 3 (mutate a fixture by one token): ERROR (property 3): anchor content drift vs tools/soundness-anchors.sha256 ... SELF-TEST 2 — Property 4 (un-advanced/orphaned stamp + soundness change): ERROR (property 4): stamp d55e22c is not an ancestor of HEAD; re-point :ground-truth-sha: ... SELF-TEST (5a) — Property 5 (pinned row names a missing pin): ERROR (property 1): test anchor not defined: test_stub_backend_return_DELETED FATAL: anchor test:test_stub_backend_return_DELETED: expected exactly one defining file, found 0 (fail closed) SELF-TEST (5b) — Property 5 (an xfail pin flips to XPASS): ALARM (property 5): pin test_resume_nontail_xfail is PASSING — the hole may be fixed. Open docs/SOUNDNESS.adoc and update the row to 'fixed' (do NOT just silence the pin). ``` Full suite green (534 tests; xfail harness reports both pins `XFAIL-OK`), all four guard gates green, `dune build`/`dune runtest` green at `dd6c19e`. ## Claims I could not make fully mechanical (named, not silently softened) 1. **Content-binding scope.** Fixtures + pinned-test *bodies* are digest-bound (11/12 anchors); the one SUITE-file anchor (`#553` → `test/test_borrow_polonius.ml`) is existence+stamp-checked only — a whole-file hash is too coarse. The ledger sentence was tightened to say exactly this. 2. **Stamp "advanced-in-this-change" detection** is robust for the normal *branch-off-fresh-main* workflow (and the orphaned-stamp case fails closed, self-test 2). It has a known edge in a *multi-commit-since-stamp* history (stamp bumped in an earlier commit, soundness changed again later without re-bump could read as "advanced"); decision-2's full "diff-on-main" freshness check is not separately implemented. Flagged for your call. ## CI `build` job now checks out `fetch-depth: 0` so property 4 can resolve the stamp; the xfail harness is in `.ocamlformat-ignore` (authored without ocamlformat available). 🤖 Generated with [Claude Code](https://claude.com/claude-code) https://claude.ai/code/session_01BbxKhXQwTvVgkYDgBMLJoa --- _Generated by [Claude Code](https://claude.ai/code/session_01BbxKhXQwTvVgkYDgBMLJoa)_ Co-authored-by: Claude <noreply@anthropic.com>

hyperpolymath marked this pull request as ready for review June 21, 2026 13:03

hyperpolymath enabled auto-merge (squash) June 21, 2026 13:07

hyperpolymath disabled auto-merge June 21, 2026 13:14

hyperpolymath closed this Jun 21, 2026

hyperpolymath mentioned this pull request Jun 21, 2026

Harden the soundness-ledger gate: content-binding, stamp-enforcement, xfail pin-liveness #631

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

docs: extend test-anchoring to the capability matrix + file residual issues#625

docs: extend test-anchoring to the capability matrix + file residual issues#625
hyperpolymath wants to merge 1 commit into
mainfrom
claude/capability-test-anchors

hyperpolymath commented Jun 21, 2026

Uh oh!

hyperpolymath commented Jun 21, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

hyperpolymath commented Jun 21, 2026

1. Test-anchor the feature matrix (not just soundness)

2. File the residuals as tracked issues

Verification

Uh oh!

hyperpolymath commented Jun 21, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants