Prove TextMate-generator completeness (#51) by johnsoncodehk · Pull Request #55 · johnsoncodehk/monogram

johnsoncodehk · 2026-06-20T05:36:27Z

Closes #51. Proves src/gen-tm.ts complete: every TextMate-representable highlighting obligation a grammar induces is emitted and reachable. The dual of the soundness ledger (KNOWN-GAPS.md, which finds wrong paints) — this proves there are no missing ones.

What now holds

The generator's input is a closed, finite algebra (RuleExpr / TokenPattern), so "every obligation" is enumerable. Completeness reduces to three mechanically-checked layers:

closure — toRuleExpr, the token-pattern compiler, and the shared collectLiterals backbone are total over their unions, so no supported combinator shape is silently dropped;
coverage — every non-skip token has a discharge path (the token census), and every content/keyword obligation leaf is painted (2433/2433 across the six grammars);
reachability — every emitted repository key is reachable from the root patterns or a declared export surface (#expression / canonicalRepoNames / aliasScopes); zero dead keys.

Fixed-denominator ledger: 3137/3137 = 100%, 0 open completeness gaps.

The gap it found and fixed

The Layer-A audit surfaced one latent silent-drop — getTypeParamElementKeywords ignored sep, dropping a keyword nested in a sep-list within a type-parameter element. Fixed by recursing into sep.element (not / ref stay omitted on purpose). The six shipped grammars are byte-identical (the drop is latent), with a biting regression guard in the checker.

The frontier is not an impossibility

Three "TextMate can't express this" claims — variable-width lookbehind for the cast/arrow value test, the balanced-paren arrow confirm, and a regex after a control-flow head — were adversarially attacked and refuted: each is expressible in vscode-oniguruma (verified), and the fixed-width forms gen-tm emits are deliberate Onigmo-portability choices. They are a soundness-precision axis, not a completeness gap.

Deliverable

COMPLETENESS.md — the proof spine (contract, three layers, frontier, ledger), modelled on TOTAL-PARSING.md.
test/tm-completeness.ts — mechanises it; npm run completeness[:check], wired into npm run check as a gate.

Distinct from the README corpus metrics (empirical oracle agreement) and the soundness ledger (scope-gap / gap-ledger) — this is the formal derivation property. Full gate suite: 39/39.

Add a formal completeness proof for src/gen-tm.ts: for every grammar expressible through the public src/api.ts combinators, every TextMate-representable highlighting obligation is emitted and reachable. This is the dual of the soundness ledger (KNOWN-GAPS.md, which finds WRONG paints) — here we prove there are no MISSING ones. The proof rests on the generator's input being a closed, finite algebra (RuleExpr / TokenPattern), which makes "every obligation" enumerable and reduces completeness to three mechanically-checked layers: - closure — toRuleExpr, the token-pattern compiler, and the shared collectLiterals backbone are total over their unions, so no supported combinator shape is silently dropped; - coverage — every non-skip token has a discharge path (the token census), and every content/keyword obligation leaf is painted (2433/2433 across the six grammars), on a fixed denominator; - reachability — every emitted repository key is reachable from the root patterns or a declared export surface (#expression / canonicalRepoNames / aliasScopes); zero dead keys. The Layer-A audit surfaced one latent silent-drop: getTypeParamElementKeywords omitted `sep`, so a keyword in a sep-list within a type-parameter element lost its keyword role inside `<…>`. Fixed by recursing into sep.element (`not`/`ref` stay omitted on purpose — a forbidden word / a constraint type's own keywords must not be hoisted); the six shipped grammars are byte-identical (the drop is latent), and the checker carries a biting regression guard. Three sites that looked like TextMate impossibilities — variable-width lookbehind for the cast/arrow value test, the balanced-paren arrow confirm, and a regex after a control-flow head — were attacked and refuted: each is expressible in vscode-oniguruma (verified), and the fixed-width forms gen-tm emits are deliberate Onigmo-portability choices. They are a soundness-precision axis, not a completeness gap. COMPLETENESS.md is the proof spine; test/tm-completeness.ts mechanises it (npm run completeness[:check]) and joins `npm run check` as a gate.

A passing checker means nothing if the checker is blind. Add test/tm-mutation.ts: it injects a catalogue of known gaps into the emitted grammar (drop a key / all of a token's includes, neuter a scope to the bare root, add a dead key, a dangling include, mis-scope a token to a wrong role, reorder two disambiguation patterns) and records which detector layer kills each — measuring the detector's power instead of asserting it. Measured: every PRESENCE / REACHABILITY gap is killed corpus-free (16/16, 12/16 by reachability / token-census / the new flat-token neuter check); WRONG-ROLE gaps are caught only by a differential witness (presence ≠ correctness); ORDERING gaps are a measured blind spot — TextMate is order-sensitive and pattern rank lives in the emitted artifact, not the grammar algebra, so no corpus-free structural check reaches it. This is the honest boundary, now empirical: the structural proof covers presence + reachability; ordering / correctness are the soundness axis, reached only by evaluation. The over-claim of an a-priori "no gap can hide" over the whole gap space is dropped — COMPLETENESS.md states the bounded, measured claim. The harness motivated one detector strengthening: tokenCensus now flags a flat token NEUTERED to the bare root scope (an entry that exists but paints inert document text), moving that gap class from differential-only to corpus-free. Wired as a meta-gate in `npm run check`.

The keyword/operator obligation was the one class still checked by the grammar-derived corpus (leaf coverage). Replace it with a structural, a-priori discharge: `literalDischarge` confirms every alphabetic literal the grammar consumes (collectLiterals over every rule + the prec/led tables) appears, as a scoped word, in some REACHABLE pattern whose scope is a keyword family — a finite scan of the emitted artifact, no corpus. 248/248 across the six grammars; non-vacuous (stripping `class` from its patterns reports it undischarged). Completeness is now a decidable structural check end to end — token discharge (census, incl. neuter) + keyword-literal discharge + repository reachability, 952/952 = 100%, no corpus. The leaf-coverage corpus pass is demoted to a redundant differential cross-check on the soundness axis; `tm-mutation`'s structural layer now also kills a dropped/neutered keyword corpus-free. COMPLETENESS.md draws the line correctly: COMPLETENESS (present + reachable + scoped) is decidable — finite G, finite gen-tm(G), an obligation taxonomy bounded by TextMate's finite construct kinds, per-token discharge by structural identity (the flat `match` IS the token's own pattern) — and ∀ G by structural induction over the finite combinator algebra. SOUNDNESS (do the present constructs paint correctly on all inputs — wrong-role, pattern ordering) is the undecidable residual (CFG vs regex-stack-machine over infinite input). The earlier "a-priori completeness is unavailable" was an over-concession: that was soundness's wall, mistaken for completeness's.

An adversarial gap-hunt (5 agents over the classes the structural checker does NOT cover) found that the getTypeParamElementKeywords `sep` drop had un-fixed siblings in the same family: - detectTypeParamConstraintKeywords.scanConstraint omitted `sep`, so a type-parameter CONSTRAINT keyword reached through a `&`/`,`-separated list was not collected — the .tsx generic-arrow⇄JSX no-comma disambiguation lost its constraint signal (mis-scoping the header as a JSX tag). - detectDeclarations.containsBlockRef omitted `sep`, so a declaration whose brace body is reached through a `sep` was not seen as having a body — its #declaration-body member-scoping region was dropped. Both recurse into `sep.element` now, mirroring the prior fix. Byte-identical on all six shipped grammars (latent: every shipped grammar writes constraints as `opt('extends', Type)` and block bodies as direct refs). The hunt found 8 latent completeness gaps total; all were VERIFIED latent — tokenizing the witnesses against the shipped grammars shows ternary, calls, trailing-comma type params, and every prec operator are correctly scoped, so the 0-gap soundness ledger is real, not corpus-blind. The remaining gaps are detector SHAPE-FRAGILITY (fixed-offset window matching in detectTernary / detectCallExpression / isAngleBracketSepRule misses equivalent factorings), an unimplemented `rawBlock` config region, and punctuation-in-region — tracked for follow-up; none triggers on a shipped grammar.

The fixed-window detector fragility the gap-hunt surfaced was a symptom; the root cause is that the shape-detectors pattern-match RAW RuleExpr shapes, so an equivalent factoring of the same construct (an `opt`-tail, a separate args rule, a trailing comma) is a different shape and slips them. Adding a `sep` arm per detector only widens the symptom patch — the next factoring still slips. Fix the structural condition: match the NORMALISED form. expandAlts already canonicalises alt-split / opt-tail / group / quantifier into the same flat adjacency; route the three fragile detectors through it, plus FIRST-set for the one ref-hidden case: - detectTernary, isAngleBracketSepRule → expandAlts (opt-tail and trailing-comma factorings now reduce to the matched adjacency; `sep` stays opaque so the sep node survives). - detectCallExpression → FIRST(next) instead of a literal `(`, so the args may be the inline `(` OR a separate `CallArgs` rule referenced after the callee. Byte-identical on all six shipped grammars (they already write the canonical factoring); the three latent gaps are now emitted for their equivalent forms (verified: #ternary-expression / #function-call / #declaration-type-params). And the detector is no longer blind to this class: a new shape-robustness gate (test/tm-completeness.ts) asserts each construct emits its region for EVERY equivalent factoring — it bites (3 failures) without the normalization. So the class is now caught structurally, not just by an adversarial hunt.

A systematic shape-fragility audit (5 agents running the real emitter over equivalent factorings of each detector's construct) found the fixed-window fragility is broad, not three isolated cases. Fix four more by routing through the normalized form, same pattern as detectTernary: - detectConditionalType — ran its 7-window on raw r.body; now over expandAlts, so an `opt`-tail / grouped / alt-split conditional `?:` is detected. - getTypeParamElementKeywords — early-out demanded an exact 3-item `'<' sep '>'` body; now scans the expandAlts branches for that adjacency, so a trailing `opt(',')` or alt-wrapped type-param list still hoists its constraint keyword. - detectDeclarations.isBlockRule — matched only a raw-seq `{ … }` body; now EVERY expandAlts branch must be `{ … }`-bounded (`.every`, not `.some`, so a `Type` that is only SOMETIMES an object literal is not mis-read as a brace declaration — `.some` regressed `type X = …` to #declaration-body). - detectJsx hasElementShape — matched `<`+ref only in a raw seq; now over the expandAlts branches, so an opt/alt/group factoring of the element qualifies. Byte-identical on all six shipped grammars (verified the `.every` form after a `.some` first attempt changed typescript/tsx output), and each fix is verified to detect its previously-dropped factoring. Seven detectors are now shape-robust; the YAML region detectors and the expression group are the remaining batch.

Reading each expression detector myself (the audit agent's output was malformed): five matched a construct on the raw r.body, so an equivalent factoring slips them. Route them through expandAlts, same pattern as detectTernary/detectCallExpression: - detectBareArrowParam — `ref '=>'`; an opt-tail arrow (`[x, opt('=>', body)]`) was dropped (verified: variable.parameter now emitted for that factoring). - detectPropertyAccess — `'.'`/`'?.'` before a token ref. - detectParenArrowParams + detectArrowParamDelims — the deliberate pair that read the same arrow param-list production; routed identically so they still cannot disagree. - detectDirectParamKeywords — keyword directly before `(`; also recurse `sep`. detectConstructorKeywords already expands (no change). Byte-identical on all six shipped grammars. Twelve detectors are now shape-robust. The YAML region detectors are deferred to a semantics-aware pass: their fixed shapes encode DELIBERATE YAML semantics (detectFold stops at a leaf token and does NOT follow a rule-ref by design — an Indent+rule-ref is a SIBLING node, not a fold), so the audit's heuristic "follow the ref" fix would break the fold-vs-sibling meaning. Not every fixed shape is fragility — some are intent.

) Worked the YAML region detectors semantics-aware (not the audit's heuristic). Key finding: the audit's "route through expandAlts" fix is WRONG for these — they match STRUCTURAL nodes (a `(Newline item)*` quantifier, config-keyed bracket pairs), and expandAlts EXPANDS the very quantifier they depend on. - detectBlockSequence — the one genuine positional rigidity: it matched the `[item, (Newline item)*]` pattern only at items[0]/items[1]. Now scanned pairwise (any adjacent k), so a leading element before the sequence does not hide it. NOT routed through expandAlts (that would expand the quantifier). yaml byte-identical; the full yaml gate group (depth-witnesses, deepest-sibling, compact-nest-sites, flow-sites, blockscalar-depth, issue12) stays green. - detectFold — its visit is ALREADY pairwise; its refsLeaf stopping at a leaf token (not following a rule-ref) is the DELIBERATE fold-vs-sibling distinction (an Indent+rule-ref is a sibling node). No change. - detectExplicitKey — the indicator at items[0] is intrinsic (an explicit-key entry is headed by `?`); the inner already unwraps a quantifier. No change. - detectFlowCollections — topLits is positional-agnostic and unwraps quantifier/group/sep, deliberately stopping at alt/ref to read THIS rule's own structure. No change. So the fixed-window fragility class is closed: 13 detectors made shape-robust where the rigidity was accidental; the rest is deliberate YAML semantics that the heuristic over-flagged. Not every fixed shape is fragility.

getTypeParamElementKeywords collected only KEYWORD literals, so a literal the grammar declares a scope for that is PUNCTUATION (e.g. a `&` scoped punctuation.separator) inside a type-parameter element lost its scope inside `<…>`. Collect any scope-declared literal now, and emit it with the right boundary (`\b` for words, none for punctuation — `\b&\b` never matches). `=` and `,` are skipped in the loop since the dedicated handlers emit them, so the six shipped grammars stay byte-identical (verified the `=` double-emit and excluded it); a scoped `&` in a type-param element is now scoped (verified). This closes the last clean completeness gap from the hunt. The remaining two are not clean completeness gaps: the symbolic-operator case is an ORDERING concern (a short overridden op can shadow a longer non-overridden one across the separate #operator-overrides / #operators patterns — the provably-hard ordering axis, and the op IS scoped, just shadowed), and `rawBlock` is a declared-but-unimplemented IndentConfig field (no shipped grammar sets it; implementing the verbatim region speculatively, with no adopter to verify against, is deferred).

An adversarial review of the "proven no gaps" conclusion found real holes; this acts on them. - Markup co-blindness (the one fixable defect): tokenCensus discharged EVERY markup token via `if (g.markup) bump('markup-region')` with zero verification, so a markup token whose declared scope generateMarkupTm does not model (e.g. a `<?…?>` processing instruction) was reported discharged while the engine painted it the bare root. Now a markup token with an explicit `scope` must have that scope actually emitted, or it is an orphan (verified: the PI counterexample is now caught; html stays 7/7). - COMPLETENESS.md corrections, all overclaims the review caught: - "per-token discharge by structural identity (match IS tokenPatternSource)" was false — the identifier match is widened to identPattern, and the census checks PRESENCE (a reachable non-root-scoped entry), not regex identity. Reworded to presence-not-identity. - "∀ G by structural induction" conflated the algebra CLOSURE (which is ∀ G) with the per-grammar DISCHARGE (executed on the shipped set). Separated. - "ordering is undecidable" was the project's own impossibility-without-proof trap: for a fixed G the pattern list is finite and the winner is a finite index read (gen-tm even sorts it deterministically). It is "not reachable by a corpus-free structural fold," a measurement limit, not undecidability. Residual, honestly stated: the shape class is MEASURED not proven — the robustness gate covers 3 of ~20 detectors, mutation testing mutates only flat tokens (not shape regions), and detectAngleBracketCast keeps a latent ref-factored fragility (type-cast fires on no shipped grammar). 81/81 · 40/40.

johnsoncodehk added 10 commits June 20, 2026 13:35

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Prove TextMate-generator completeness (#51)#55

Prove TextMate-generator completeness (#51)#55
johnsoncodehk wants to merge 10 commits into
masterfrom
tm-completeness-proof

johnsoncodehk commented Jun 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

johnsoncodehk commented Jun 20, 2026

What now holds

The gap it found and fixed

The frontier is not an impossibility

Deliverable

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant