From fb27a7f6c64e2a90ca8c1ee4b83921d981a36cff Mon Sep 17 00:00:00 2001 From: Claude Date: Sun, 21 Jun 2026 13:02:49 +0000 Subject: [PATCH] docs: extend test-anchoring to the capability matrix + file residual issues MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Follow-up to #622. Applies the same "claims bound to executable tests, gated in CI" discipline to the *feature*-readiness matrix (not just the soundness ledger), and wires in the now-tracked residuals. - docs/CAPABILITY-MATRIX.adoc: add a "== Test anchors" section mapping each load-bearing component (lexer, parser, type checker, effects, borrow checker, QTT, interpreter, traits, stdlib, typed-wasm, codegen, JS-host backends, Solo core) to the test(s) that exercise it. - tools/check-capability-anchors.sh (wired into `just guard` + CI): fails the build if the Test-anchors section disappears or if any test path the matrix names goes missing — so a "works"/"enforced" row cannot outlive its test. Verified it bites (non-zero on a missing anchor). - docs/SOUNDNESS.adoc: wire the residual rows to their new tracking issues (#623 interp non-tail resume, #624 Lean/Why3 return-drop) and refresh the metatheory note to reflect the grown Coq formal/ track (#620/#621: K-1, K-1-let, F-1). Residuals filed as tracked issues so they aren't just prose: - #623 — interpreter non-tail single-shot resume (shallow continuation; needs CPS rewrite, owner-steer). - #624 — Lean/Why3 experimental backends drop `return` control flow. - #559 (already open) updated by comment: concrete-overlap coherence landed; residual narrowed to generic-subsumption overlap. Co-Authored-By: Claude Opus 4.8 Claude-Session: https://claude.ai/code/session_01BbxKhXQwTvVgkYDgBMLJoa --- .github/workflows/ci.yml | 7 +++ docs/CAPABILITY-MATRIX.adoc | 32 ++++++++++++++ docs/SOUNDNESS.adoc | 22 +++++----- justfile | 1 + tools/check-capability-anchors.sh | 73 +++++++++++++++++++++++++++++++ 5 files changed, 125 insertions(+), 10 deletions(-) create mode 100755 tools/check-capability-anchors.sh diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index 41163b47..3baf2f15 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -97,6 +97,13 @@ jobs: # PROOF-NEEDS / NAVIGATION / CLAUDE.md) stops linking back to it. # See tools/check-soundness-ledger.sh. run: ./tools/check-soundness-ledger.sh + - name: Capability-matrix test-anchor gate + # docs/CAPABILITY-MATRIX.adoc anchors each feature-readiness claim to an + # executable test ("== Test anchors"). This gate fails if that section + # disappears or if any test it names goes missing — so a "works" status + # row cannot outlive the test that backs it. + # See tools/check-capability-anchors.sh. + run: ./tools/check-capability-anchors.sh - name: Check formatting run: opam exec -- dune build @fmt lint: diff --git a/docs/CAPABILITY-MATRIX.adoc b/docs/CAPABILITY-MATRIX.adoc index d8bcd215..3aa92f7c 100644 --- a/docs/CAPABILITY-MATRIX.adoc +++ b/docs/CAPABILITY-MATRIX.adoc @@ -225,6 +225,38 @@ db-theory triplet — `Sqlite.affine` (#522 / #524 / #525), `Transaction.affine` group-by + aggregation primitives) — are the current authoring frontier. A few primitives remain `extern` builtins. +== Test anchors + +Every enforcement claim in the tables above is anchored to an executable test, +mirroring the link:SOUNDNESS.adoc[soundness ledger]. This makes the claims +*falsifiable*: `tools/check-capability-anchors.sh` (wired into `just check` + +CI) extracts every `test/…` path named below and fails the build if any has +gone missing — so a `works` / `enforced` row cannot keep its status after its +test is deleted or renamed without this section (and CI) noticing. + +[cols="2,4",options="header"] +|=== +|Component (row above) |Anchoring test(s) + +|Lexer |`test/test_lexer.ml` +|Parser / AST / Name resolution |`test/test_e2e.ml`, `test/test_qualified_paths.ml` +|Type checker |`test/test_e2e.ml`, `test/test_slash_effect_row.ml` +|Effect system |`test/test_effects.ml`, `test/test_effect_sites.ml`; fixtures `test/e2e/fixtures/handle_return_arm.affine`, `test/e2e/fixtures/handle_resume_tail.affine` +|Borrow checker |`test/test_borrow_polonius.ml`; fixtures `test/e2e/fixtures/borrow_callee_returned_borrow_uam.affine`, `test/e2e/fixtures/borrow_callee_value_return_ok.affine` +|Quantity / affine (QTT) |`test/test_e2e.ml`; fixtures `test/e2e/fixtures/linear_arrow_violation.affine`, `test/e2e/fixtures/affine_violation.affine`, `test/e2e/fixtures/erased_violation.affine` +|Interpreter |`test/test_e2e.ml`, `test/test_solo_cesk.ml`, `test/test_module_mut.ml` +|Traits (incl. #559 coherence) |`test/test_e2e.ml` (Trait-Coherence suite) +|Stdlib (AOT + laws) |`test/test_stdlib_aot.ml`, `test/test_stdlib_laws.ml` +|typed-wasm isolation |`test/test_tw_isolation.ml` +|Codegen (golden snapshots) |`test/test_golden.ml` +|Deno / JS host backends |`test/test_deno_builtins_consistency.ml`, `test/test_int_div_js.ml` +|Solo core (executable metatheory) |`test/test_solo_cesk.ml` +|=== + +NOTE: this anchors *feature-readiness* claims. *Soundness-hole* anchors live in +their own ledger (link:SOUNDNESS.adoc[SOUNDNESS.adoc]), gated by +`tools/check-soundness-ledger.sh`. + == What AffineScript is NOT (anti-over-claim) * Not "production-ready". Alpha. CORE-01 (#177) closed 2026-05-30. The diff --git a/docs/SOUNDNESS.adoc b/docs/SOUNDNESS.adoc index 61b6811a..70c472ca 100644 --- a/docs/SOUNDNESS.adoc +++ b/docs/SOUNDNESS.adoc @@ -120,7 +120,7 @@ the reverse. This file is the structural fix: tree-walking continuation has already unwound the bind chain. This is the one genuinely-still-silent shape, on the *interpreter* path only. It needs a CPS rewrite of `eval` (blocked: OCaml 4.14 has no native effect handlers and the - interpreter must stay `js_of_ocaml`-compatible — owner-steer item). + interpreter must stay `js_of_ocaml`-compatible — owner-steer item). Tracked: #623. |*residual (pinned)* |`test/e2e/fixtures/handle_resume_nontail.affine` + `test_resume_nontail_known_shallow` (asserts the wrong-but-known value; flips @@ -170,11 +170,12 @@ the reverse. This file is the structural fix: The implementation holes above are closed, but honesty requires naming what is *not* a guarantee: -* *Interpreter non-tail resume* (#555 residual, pinned) — see the ledger row. +* *Interpreter non-tail resume* (#555 residual, pinned; tracked as #623) — see + the ledger row. * *Stub backends drop `return`.* The Lean and Why3 *experimental* code generators drop `return` statements wholesale — a broader codegen-honesty gap - than #555, flagged but not yet fenced. Treat all non-reference backends as - experimental (see `docs/CAPABILITY-MATRIX.adoc`). + than #555, flagged but not yet fenced (tracked as #624). Treat all + non-reference backends as experimental (see `docs/CAPABILITY-MATRIX.adoc`). * *Generic-subsumption coherence* (#559 follow-up) — see the ledger row. == Closed holes are not proofs @@ -182,12 +183,13 @@ The implementation holes above are closed, but honesty requires naming what is Closing an *implementation* hole is not the same as having *metatheory*. The soundness *arguments* for the holes above (`docs/academic/proofs/*.adoc`, the comments in `lib/borrow.ml`) remain `prose`, and the `Solo` core fragment's -`progress` / `preservation` are still `?todo`. Mechanisation has *started* — a -Wave-0 codegen-preservation seed (`formal/K1_CodegenPreservation.v`, axiom-free -Coq/Rocq, minimal nat/bool fragment) landed via #620 — but it does not yet cover -any hole in this ledger. The proof obligations, their rigour tiers, and their -(mostly `prose` / `absent`) status are catalogued in `docs/PROOF-NEEDS.adoc` -(umbrella issue #513). The one-line distinction: +`progress` / `preservation` are still `?todo`. Mechanisation has *started* — an +axiom-free Coq/Rocq `formal/` track (codegen-preservation: K-1 +`K1_CodegenPreservation.v`, K-1-with-`let` `K1Let_CodegenPreservation.v`, and +the F-1 transformer-preservation composition; #620/#621) — but it does not yet +cover any hole in this ledger. The proof obligations, their rigour tiers, and +their (mostly `prose` / `absent`) status are catalogued in +`docs/PROOF-NEEDS.adoc` (umbrella issue #513). The one-line distinction: [quote] The compiler now fails loud on these shapes and the residuals are pinned by diff --git a/justfile b/justfile index 7806d3e8..e823641f 100644 --- a/justfile +++ b/justfile @@ -99,6 +99,7 @@ guard: ./tools/check-no-extension-ts.sh ./tools/check-doc-truthing.sh ./tools/check-soundness-ledger.sh + ./tools/check-capability-anchors.sh # Re-baseline the doc-truthing over-claim ledger after a deliberate, legitimate # change (e.g. a new dated roadmap milestone). Commit the .allow diff. diff --git a/tools/check-capability-anchors.sh b/tools/check-capability-anchors.sh new file mode 100755 index 00000000..02fc932b --- /dev/null +++ b/tools/check-capability-anchors.sh @@ -0,0 +1,73 @@ +#!/usr/bin/env bash +# SPDX-License-Identifier: MPL-2.0 +# Copyright (c) 2026 Jonathan D.A. Jewell +# +# Anti-staleness gate for the capability matrix's test anchors +# (docs/CAPABILITY-MATRIX.adoc "== Test anchors"). +# +# The matrix is the authoritative feature-readiness doc. Its "Test anchors" +# section pins each enforcement claim ("works" / "enforced" / "partial") to the +# executable test(s) that exercise it — the same test-anchoring discipline the +# soundness ledger uses (tools/check-soundness-ledger.sh). This gate makes those +# anchors falsifiable: it extracts every test path the matrix names and fails the +# build if any has gone missing, so a feature cannot keep a green status row +# after its test is deleted or renamed without this section (and CI) noticing. +# +# Checks: +# 1. The matrix exists and carries a "== Test anchors" section. +# 2. Every test/*.ml and test/e2e/fixtures/*.affine path named anywhere in the +# matrix actually exists on disk. +# +# Usage: ./tools/check-capability-anchors.sh +# Wired into: just check (via the `guard` recipe) and CI (.github/workflows/ci.yml). +# Run from anywhere; it cd's to the repo root itself. + +set -euo pipefail + +cd "$(dirname "$0")/.." + +MATRIX="docs/CAPABILITY-MATRIX.adoc" + +fail=0 +note() { printf '%s\n' "$*" >&2; } + +# --- 1. The matrix exists and carries its Test-anchors section --------------- +if [ ! -f "$MATRIX" ]; then + note "ERROR: the capability matrix is missing: $MATRIX" + exit 1 +fi +if ! grep -q "^== Test anchors" "$MATRIX"; then + note "ERROR: $MATRIX lost its '== Test anchors' section." + note " That section anchors each feature-readiness claim to an" + note " executable test. Restore it (see the soundness ledger for the" + note " same pattern: docs/SOUNDNESS.adoc)." + fail=1 +fi + +# --- 2. Every test path the matrix names actually exists --------------------- +missing=0 +while IFS= read -r path; do + [ -z "$path" ] && continue + if [ ! -f "$path" ]; then + if [ "$missing" -eq 0 ]; then + note "ERROR: $MATRIX names test anchors that no longer exist:" + missing=1 + fail=1 + fi + note " - $path" + fi +done < <(grep -oE 'test/[A-Za-z0-9_./-]+\.(ml|affine)' "$MATRIX" | LC_ALL=C sort -u) +if [ "$missing" -eq 1 ]; then + note " Either restore the test or update the matrix to its new anchor" + note " in the same change. A renamed/deleted test must not leave a" + note " feature-readiness claim unmoored." +fi + +if [ "$fail" -ne 0 ]; then + note "" + note "Capability-anchor guard failed. Feature-readiness claims in" + note "$MATRIX are drifting from the tests that back them." + exit 1 +fi + +echo "OK: capability anchors intact — Test-anchors section present + every named test exists."