feat(formal): stand up Coq formal/ track + mechanize the K-1 Wave-0 seed by hyperpolymath · Pull Request #620 · hyperpolymath/affinescript

hyperpolymath · 2026-06-21T11:19:04Z

What

Wave 0 of the docs/PROOF-NEEDS.adoc programme, on the keystone obligation K-1 (codegen → typed-WASM semantic-preservation). Stands up the formal/ directory (the dir #513 names) and lands a fully mechanized, axiom-free Coq proof for a minimal fragment.

Prover: Coq/Rocq 8.18 — chosen for K-1 because the typed-WASM target semantics interoperate with typed-wasm and ephapax, both of which use Coq Semantics.v.

The proof — `formal/K1_CodegenPreservation.v`

A complete compiler-correctness theorem with no Admitted, no axioms (Print Assumptions → "Closed under the global context"):

Definition K1_preservation : Prop :=
  forall e v, seval e = Some v -> wexec (compile e) [] = Some [obs v].

Theorem k1_preservation_holds : K1_preservation.   (* proven *)

i.e. whenever the source big-step evaluates to v, the compiled stack-machine code run on the empty operand stack yields exactly the corresponding wasm value [obs v]. The fragment (nat/bool · add/and → a little stack machine standing in for typed-WASM) is deliberately tiny; the real AST + real typed-WASM operational semantics remain the open obligation, expanded later the way solo-core's Duet/Ensemble tracks expand Solo. This mirrors how invariant-path/proofs/SameCube.agda grounds F-2 with a real proof rather than a hole.
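To make the shape of the statement concrete, here is a minimal executable sketch of the same kind of fragment. It is not the Coq development: the tuple encoding, the name `compile_expr`, the choice of `obs` (nat to itself, bool to 0/1) and the bounded enumeration in `small_exprs` are all assumptions made for illustration. It tests the property on small expressions, where the Coq theorem proves it for all of them.

```python
# Hypothetical model of the fragment: nat/bool literals with add/and.
# Expressions are tuples: ("nat", n), ("bool", b), ("add", l, r), ("and", l, r).

def seval(e):
    """Source big-step evaluation; returns a tagged value or None if stuck."""
    tag = e[0]
    if tag in ("nat", "bool"):
        return e  # literals are already values
    l, r = seval(e[1]), seval(e[2])
    if l is None or r is None:
        return None
    if tag == "add" and l[0] == "nat" and r[0] == "nat":
        return ("nat", l[1] + r[1])
    if tag == "and" and l[0] == "bool" and r[0] == "bool":
        return ("bool", l[1] and r[1])
    return None  # ill-typed operands: no source value


def obs(v):
    """Assumed observation map from source values to stack-machine integers."""
    return v[1] if v[0] == "nat" else int(v[1])


def compile_expr(e):
    """Postorder compilation to a flat instruction list."""
    tag = e[0]
    if tag in ("nat", "bool"):
        return [("push", obs(e))]
    return compile_expr(e[1]) + compile_expr(e[2]) + [(tag,)]


def wexec(code, stack):
    """Run the stack machine; returns the final stack or None on underflow."""
    stack = list(stack)
    for ins in code:
        if ins[0] == "push":
            stack.append(ins[1])
            continue
        if len(stack) < 2:
            return None
        b, a = stack.pop(), stack.pop()
        stack.append(a + b if ins[0] == "add" else a & b)
    return stack


def preserved(e):
    """The K-1 shape: if seval e = v, then wexec (compile e) [] = [obs v]."""
    v = seval(e)
    return v is None or wexec(compile_expr(e), []) == [obs(v)]


def small_exprs(depth):
    """Enumerate all expressions up to the given depth over a few literals."""
    leaves = [("nat", 0), ("nat", 2), ("bool", True), ("bool", False)]
    if depth == 0:
        return leaves
    sub = small_exprs(depth - 1)
    return leaves + [(op, l, r) for op in ("add", "and") for l in sub for r in sub]
```

Calling `all(preserved(e) for e in small_exprs(2))` checks the implication on every expression of depth at most two, including ill-typed ones, for which the hypothesis `seval e = Some v` fails and the property holds vacuously.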

Coq `.v` policy carve-out (the point you raised)

.v is shared by Coq, Verilog, and the estate-banned V-lang (→ Zig). Coq proof scripts are neither V-lang nor Verilog, so this PR makes the distinction explicit to keep automated sweeps from flagging them:

.hypatia-ignore — explicit formal/*.v exemption from cicd_rules/vlang_detected (+ banned_language_file). Coq ships no v.mod, so vmod_detected never fires.
.claude/CLAUDE.md — new "Formal-methods Coq .v (NOT V-lang)" note: documents the estate path_allow_prefixes carve-out for Coq proof scripts, the no-Admitted/no-axiom rule, the Coq-vs-Idris2 prover split (Coq here for typed-wasm interop; solo-core stays Idris2), and "do not migrate/delete these as V-lang."
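The effect of the carve-out can be sketched as a small path predicate. This is an illustration only: the real `.hypatia-ignore` syntax and the Hypatia rule implementation are not shown in this PR, so the function names and the `exempt_dirs` parameter below are hypothetical.

```python
from pathlib import PurePosixPath


def is_exempt(path, exempt_dirs=("formal",)):
    """True for .v files sitting directly in an exempted directory (assumed rule)."""
    p = PurePosixPath(path)
    return p.suffix == ".v" and str(p.parent) in exempt_dirs


def vlang_detected(path, exempt_dirs=("formal",)):
    """Flag a .v file as V-lang unless the carve-out covers it."""
    p = PurePosixPath(path)
    return p.suffix == ".v" and not is_exempt(path, exempt_dirs)
```

Under these assumptions `formal/K1_CodegenPreservation.v` is not flagged, while a `.v` file anywhere else in the tree still is.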

Track scaffolding

README.adoc, _CoqProject, justfile (check recipe type-checks and asserts the proof is axiom-free), .gitignore for Coq artifacts.

Docs synced

PROOF-NEEDS.adoc: K-1 prose → partial; formal/ now exists; Wave-0 row marked in progress.
FRG-PROFILE.adoc: the "no formalisation directory" honest-gap is met (grade stays E — D needs type-preservation/progress for the affine calculus, a theorem distinct from this codegen-preservation seed).

How to check

just -f formal/justfile check        # or:  cd formal && coqc K1_CodegenPreservation.v

Requires Coq 8.18+. Verified locally: compiles clean, Print Assumptions closed.

🤖 Generated with Claude Code

https://claude.ai/code/session_01KPG9mEQXFyA3k7NWAzMNMr


… seed

Wave 0 of the PROOF-NEEDS.adoc programme, on the keystone obligation K-1 (codegen -> typed-WASM semantic-preservation).

Prover: Coq/Rocq 8.18, chosen for typed-wasm / ephapax interop (both Coq `Semantics.v`).

formal/K1_CodegenPreservation.v proves, with NO `Admitted` and NO axioms (`Print Assumptions`: "Closed under the global context"), a complete compiler-correctness theorem for a minimal AffineScript fragment:

    Definition K1_preservation : Prop :=
      forall e v, seval e = Some v -> wexec (compile e) [] = Some [obs v].

    Theorem k1_preservation_holds : K1_preservation.   (* proven *)

i.e. source big-step eval ⇒ the compiled stack-machine code yields the corresponding wasm value. The fragment (nat/bool · add/and → a little stack machine) is deliberately tiny; the real AST + real typed-WASM semantics remain the open obligation, expanded later the way solo-core's Duet/Ensemble tracks expand Solo. This mirrors how SameCube.agda grounds F-2 with a real proof rather than a hole.

Coq `.v` policy carve-out (the `.v` extension is shared with the banned V-lang and with Verilog; Coq is neither):

- `.hypatia-ignore`: explicit `formal/*.v` exemption from `cicd_rules/vlang_detected` (+ banned_language_file), so no sweep can mis-flag Coq as V-lang. Coq has no `v.mod` → `vmod_detected` never fires.
- `.claude/CLAUDE.md`: new "Formal-methods Coq `.v` (NOT V-lang)" note documenting the carve-out, the no-Admitted/no-axiom rule, and that these files must not be migrated/deleted as V-lang.

Track scaffolding: README.adoc, _CoqProject, justfile (`check` recipe type-checks and asserts the proof is axiom-free), .gitignore for Coq artifacts.

Docs synced: PROOF-NEEDS.adoc K-1 prose→partial + `formal/` now exists; FRG-PROFILE.adoc "no formalisation directory" gap met (grade stays E; D needs type-preservation/progress for the affine calculus, distinct from this codegen-preservation seed).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01KPG9mEQXFyA3k7NWAzMNMr

github-actions · 2026-06-21T11:19:57Z

🔍 Hypatia Security Scan

Findings: 41 issues detected

Severity	Count
🔴 Critical	2
🟠 High	23
🟡 Medium	16

⚠️ Action Required: Critical security issues found!

View findings

[
  {
    "reason": "Action denoland/setup-deno@v2 needs attention",
    "type": "unpinned_action",
    "file": "publish-jsr.yml",
    "action": "pin_sha",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in instant-sync.yml",
    "type": "secret_action_without_presence_gate",
    "file": "instant-sync.yml",
    "action": "peter-evans/repository-dispatch",
    "rule_module": "workflow_audit",
    "severity": "high"
  },
  {
    "reason": "Shell execution -- validate input before passing to shell (1 occurrences, CWE-78)",
    "type": "js_exec_sync",
    "file": "/home/runner/work/affinescript/affinescript/packages/affinescript-cli/mod.js",
    "action": "flag",
    "rule_module": "code_safety",
    "severity": "high"
  },
  {
    "reason": "Shell execution -- validate input before passing to shell (2 occurrences, CWE-78)",
    "type": "js_exec_sync",
    "file": "/home/runner/work/affinescript/affinescript/packages/affine-vscode/mod.js",
    "action": "flag",
    "rule_module": "code_safety",
    "severity": "high"
  },
  {
    "reason": "Shell execution -- validate input before passing to shell (1 occurrences, CWE-78)",
    "type": "js_exec_sync",
    "file": "/home/runner/work/affinescript/affinescript/affinescript-vite/src/affine-plugin-improved.js",
    "action": "flag",
    "rule_module": "code_safety",
    "severity": "high"
  },
  {
    "reason": "expect() in hot path (32 occurrences, CWE-754)",
    "type": "expect_in_hot_path",
    "file": "/home/runner/work/affinescript/affinescript/affinescriptiser/src/codegen/wasm_gen.rs",
    "action": "flag",
    "rule_module": "code_safety",
    "severity": "medium"
  },
  {
    "reason": "expect() in hot path (29 occurrences, CWE-754)",
    "type": "expect_in_hot_path",
    "file": "/home/runner/work/affinescript/affinescript/affinescriptiser/src/codegen/affine_gen.rs",
    "action": "flag",
    "rule_module": "code_safety",
    "severity": "medium"
  },
  {
    "reason": "unsafe block -- requires SAFETY comment (2 occurrences, CWE-676)",
    "type": "unsafe_block",
    "file": "/home/runner/work/affinescript/affinescript/runtime/src/panic.rs",
    "action": "flag",
    "rule_module": "code_safety",
    "severity": "medium"
  },
  {
    "reason": "unsafe block -- requires SAFETY comment (1 occurrences, CWE-676)",
    "type": "unsafe_block",
    "file": "/home/runner/work/affinescript/affinescript/runtime/src/alloc.rs",
    "action": "flag",
    "rule_module": "code_safety",
    "severity": "medium"
  },
  {
    "reason": "unsafe block -- requires SAFETY comment (3 occurrences, CWE-676)",
    "type": "unsafe_block",
    "file": "/home/runner/work/affinescript/affinescript/runtime/src/ffi.rs",
    "action": "flag",
    "rule_module": "code_safety",
    "severity": "medium"
  }
]

Powered by Hypatia Neurosymbolic CI/CD Intelligence

…622)

## Why

A reader (human or agent) asking "is AffineScript sound / what's broken?" could open any of ~6 docs and get a **stale answer in the dangerous direction**. The known holes `#554/#555/#556/#558/#559` were fixed, fenced, or removed across 2026-05/06, but their status was duplicated across the capability matrix, a `CLAUDE.md` survey block, `STATE-*` snapshots, the wiki, etc., all surfaces that drift independently. (This PR was prompted by exactly that: a stale answer sourced from `CAPABILITY-MATRIX.adoc` + the `CLAUDE.md` survey.)

This fixes the **content and the layout** so it can't silently rot again.

## What

**New single source of truth: `docs/SOUNDNESS.adoc`**

- The one place soundness-hole status lives. **Test-anchored**: every row names the fixture/test that proves it.
- Carries a freshness stamp (`:ground-truth-sha:` + date). Ground-truthed against a green `dune build` / `dune runtest` at `d55e22c`.
- Honest about residuals (interpreter non-tail resume, Lean/Why3 `return`-drop, #559 generic overlap) and about the implementation-vs-proof distinction.

**New anti-staleness gate: `tools/check-soundness-ledger.sh`** (wired into `just guard` + CI)

- Fails the build if the ledger loses its primacy declaration or freshness stamp, if any **anchor fixture goes missing**, or if a status surface stops linking back to the ledger. This binds prose to executable truth.
- Verified it *bites*: returns non-zero and names the offender on a missing anchor.

**Corrected every live status surface** to ground-truth + made them defer to the ledger: `README`, `CAPABILITY-MATRIX` (borrow/effects/refinement/traits rows + anti-over-claim bullet + See-also), `PROOF-NEEDS` (holes block + P-9/P-10), `NAVIGATION`, `reference/COMPILER-CAPABILITIES`, `TECH-DEBT` (CORE-04/05), the wiki (README + traits + dependent-types), `STATE.a2ml`, agent debt, and the `CLAUDE.md` survey. Dated snapshot `STATE-2026-06-11` is **capped with a superseded banner**, not rewritten.

## Ground-truth recorded (verified in source + green suite)

| Issue | Was documented as | Actually |
|---|---|---|
| #554 | open use-after-move | **fixed**: rejected `MoveWhileBorrowed` |
| #555 | silently mis-lowered | **fenced loud** on every compiled backend; 1 pinned interp residual |
| #556 | silent sync fallback | **fixed**: fails loud |
| #558 | parse-only/unenforced | **removed** in v1; `assume(...)` rejected at parse |
| #559 | coherence unchecked | **fixed** for concrete overlaps (wired in `typecheck.ml`) |
| #553 | "0% implemented" | M1–M3, **test-only/unwired** |

Closing these *implementation* holes is **not** the same as *proving* soundness; the metatheory is still prose (one Wave-0 Coq seed from #620 noted), per `PROOF-NEEDS.adoc`.

## Note for review

- The `.claude/CLAUDE.md` change is **body-only** (the stale survey → a deferral to the ledger). I did **not** add or alter its license header; that's the owner-gated act flagged in the soundness handoff.
- No code touched; docs + one shell gate + CI/justfile wiring. `dune build` and `dune runtest` are green at `d55e22c`.
- `AFFIRMATION.adoc` (a parked, dated attestation) was deliberately left untouched.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

https://claude.ai/code/session_01BbxKhXQwTvVgkYDgBMLJoa

---

_Generated by [Claude Code](https://claude.ai/code/session_01BbxKhXQwTvVgkYDgBMLJoa)_

Co-authored-by: Claude <noreply@anthropic.com>

… xfail pin-liveness (#631)

Makes `docs/SOUNDNESS.adoc` keep every promise it makes. The ledger on `main` is "prose ahead of mechanism" (it claims content-binding / stamp-enforcement / pinned xfails, but the gate enforced only 2 of those). This builds the missing mechanism, and folds in the closed-#625 capability-matrix anchoring.

## The five properties (each maps to a function in the gate)

| # | Property | Function | Provenance |
|---|----------|----------|-----------|
| 1 | Anchors exist | `check_anchors_exist` | Jonathan's #622 design (kept) |
| 2 | Back-links | `check_backlinks` | Jonathan's #622 design (kept) |
| 3 | **Content-binding** | `check_content_binding` + `tools/soundness-anchors.sha256` + `--reseal` | **new** |
| 4 | **Stamp-enforcement** | `check_stamp` | **new** |
| 5 | **Pin-liveness (xfail)** | `check_pins` + `test/xfail/test_xfail_pins.ml` | **new** |

`## What this gate enforces` is documented at the top of the script. Everything **fails closed**.

## Ground-truth correction (compiler wins)

Running the compiler showed **#559 generic-subsumption is already detected/rejected** (`impl[T] Greet for Box[T]` vs `impl Greet for Box[Int]` → "Trait coherence violation"). So the ledger's `open (tracked)` "not yet detected" was stale **in the dangerous direction**. Corrected to `fixed` with a positive test; the stale `test_e2e.ml` comment fixed. → one fewer xfail pin than the spec assumed.

Also: the stub-return row uses **#624** (the real tracker); #560 is *variable-string wasm ops*, unrelated; this change supplies the pin #628 couldn't (the fixture/test now exist). Stamp re-pointed to `dd6c19e` (a real main-ancestor; the old `d55e22c` was squash-orphaned). Metatheory note updated for the new `formal/` proofs (#620–#627).

## Self-tests: each new check watched failing

```
SELF-TEST 1 — Property 3 (mutate a fixture by one token):
ERROR (property 3): anchor content drift vs tools/soundness-anchors.sha256 ...
SELF-TEST 2 — Property 4 (un-advanced/orphaned stamp + soundness change):
ERROR (property 4): stamp d55e22c is not an ancestor of HEAD; re-point :ground-truth-sha: ...
SELF-TEST (5a) — Property 5 (pinned row names a missing pin):
ERROR (property 1): test anchor not defined: test_stub_backend_return_DELETED
FATAL: anchor test:test_stub_backend_return_DELETED: expected exactly one defining file, found 0 (fail closed)
SELF-TEST (5b) — Property 5 (an xfail pin flips to XPASS):
ALARM (property 5): pin test_resume_nontail_xfail is PASSING — the hole may be fixed.
Open docs/SOUNDNESS.adoc and update the row to 'fixed' (do NOT just silence the pin).
```

Full suite green (534 tests; xfail harness reports both pins `XFAIL-OK`), all four guard gates green, `dune build`/`dune runtest` green at `dd6c19e`.

## Claims I could not make fully mechanical (named, not silently softened)

1. **Content-binding scope.** Fixtures + pinned-test *bodies* are digest-bound (11/12 anchors); the one SUITE-file anchor (`#553` → `test/test_borrow_polonius.ml`) is existence+stamp-checked only; a whole-file hash is too coarse. The ledger sentence was tightened to say exactly this.
2. **Stamp "advanced-in-this-change" detection** is robust for the normal *branch-off-fresh-main* workflow (and the orphaned-stamp case fails closed, self-test 2). It has a known edge in a *multi-commit-since-stamp* history (stamp bumped in an earlier commit, soundness changed again later without re-bump could read as "advanced"); decision-2's full "diff-on-main" freshness check is not separately implemented. Flagged for your call.

## CI

`build` job now checks out `fetch-depth: 0` so property 4 can resolve the stamp; the xfail harness is in `.ocamlformat-ignore` (authored without ocamlformat available).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

https://claude.ai/code/session_01BbxKhXQwTvVgkYDgBMLJoa

---

_Generated by [Claude Code](https://claude.ai/code/session_01BbxKhXQwTvVgkYDgBMLJoa)_

Co-authored-by: Claude <noreply@anthropic.com>
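The content-binding idea in property 3 (digest each anchor, compare against a sealed manifest, fail closed on any mismatch) can be sketched as follows. This is not the gate script: the real check lives in `tools/check-soundness-ledger.sh`, and the in-memory `files` mapping, the function names `seal` and `drifted_anchors`, and the manifest shape here are assumptions for illustration.

```python
import hashlib


def seal(files):
    """Build a manifest mapping each anchor name to the SHA-256 of its content.

    `files` maps anchor name -> bytes (a stand-in for reading fixtures from disk).
    """
    return {name: hashlib.sha256(data).hexdigest() for name, data in files.items()}


def drifted_anchors(files, manifest):
    """Return the sorted anchors whose content no longer matches the manifest.

    A missing anchor counts as drift, so the check fails closed.
    """
    bad = []
    for name, want in manifest.items():
        data = files.get(name)
        if data is None or hashlib.sha256(data).hexdigest() != want:
            bad.append(name)
    return sorted(bad)
```

A gate built on this would exit non-zero whenever `drifted_anchors` returns a non-empty list, and a reseal step would regenerate the manifest only as a deliberate act after the ledger row has been reviewed.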

hyperpolymath marked this pull request as ready for review June 21, 2026 11:25

hyperpolymath merged commit ac6e574 into main Jun 21, 2026
16 checks passed

hyperpolymath deleted the claude/lucid-cray-4a22dp branch June 21, 2026 11:25

hyperpolymath mentioned this pull request Jun 21, 2026

docs: make soundness status a single, test-anchored source of truth #622

Merged

This was referenced Jun 21, 2026

docs: extend test-anchoring to the capability matrix + file residual issues #625

Closed

Harden the soundness-ledger gate: content-binding, stamp-enforcement, xfail pin-liveness #631

Merged

hyperpolymath merged 1 commit into main from claude/lucid-cray-4a22dp
