indent: un-overload YAML semantics off generic flags/literals (#44) #52
Merged
A non-YAML indentation grammar inherited three YAML behaviors derived
from flags/literals that mean something else, with no opt-out short of
mis-declaring the grammar. Detach each onto its own explicit, mode-neutral
IndentConfig field that defaults OFF; yaml.ts opts into each.
(A) Flow `:` key/value separator carve-out was derived from the `string`
flag (`stringTokenNames`), silently enlisting every string-region token.
New `flowSeparatorAfterTokens: string[]` names the membership explicitly
(carve-out OFF when empty); `string: true` keeps its region-scoping /
auto-close-derivation jobs without dragging a token into separator
emission. PR #41's wholesale `flowColonSeparator` boolean is removed —
an empty list is the neutral-off it provided, without re-overloading.
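
A minimal sketch of the membership rule in (A), not the actual lexer code. The interfaces `IndentConfigSketch` and `TokenDef`, the function `colonIsFlowSeparator`, and the token name `DoubleQuoted` are hypothetical and exist only for illustration; the always-included flow-close delimiters are left out for brevity.

```typescript
// Hypothetical, trimmed-down config shape: only the field discussed in (A).
interface IndentConfigSketch {
  // Token types after which a flow `:` counts as a key/value separator.
  // An empty list (the default) turns the carve-out off.
  flowSeparatorAfterTokens: string[];
}

// Hypothetical token definition; `string: true` keeps only its scoping job.
interface TokenDef {
  name: string;
  string?: boolean;
}

// True when a `:` that follows `prev` should be emitted as a separator.
function colonIsFlowSeparator(prev: TokenDef, cfg: IndentConfigSketch): boolean {
  // Membership is by explicit name; the `string` flag is deliberately ignored.
  return cfg.flowSeparatorAfterTokens.includes(prev.name);
}

const quoted: TokenDef = { name: "DoubleQuoted", string: true };
const neutral: IndentConfigSketch = { flowSeparatorAfterTokens: [] };
const yamlLike: IndentConfigSketch = { flowSeparatorAfterTokens: ["DoubleQuoted"] };

console.log(colonIsFlowSeparator(quoted, neutral));  // false
console.log(colonIsFlowSeparator(quoted, yamlLike)); // true
```

The point of the sketch is that a `string: true` token is enlisted only when a grammar names it, which is what lets a non-YAML grammar keep the flag without the separator behavior.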
(B) Plain-scalar continuation folding was derived from `blockPattern`,
giving YAML folding to any block-pattern token. New `foldTokens:
string[]` names the fold participants explicitly (folding OFF when
empty); the last-named token is the catch-all continuation type. A
grammar can now carry a `blockPattern` token without inheriting the fold.
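
The rule in (B) could be sketched roughly as follows. This is an illustration under assumptions, not the real implementation: `FoldConfigSketch`, `participatesInFold`, `continuationType`, and the token names `PlainKey` and `PlainScalar` are all hypothetical.

```typescript
// Hypothetical config shape carrying only the field discussed in (B).
interface FoldConfigSketch {
  // Token types that take part in plain-scalar continuation folding.
  // An empty list (the default) turns folding off.
  foldTokens: string[];
}

// True when a token type is a named fold participant.
function participatesInFold(tokenType: string, cfg: FoldConfigSketch): boolean {
  return cfg.foldTokens.includes(tokenType);
}

// The last-named token is the catch-all continuation type; null when folding is off.
function continuationType(cfg: FoldConfigSketch): string | null {
  return cfg.foldTokens[cfg.foldTokens.length - 1] ?? null;
}

const foldOff: FoldConfigSketch = { foldTokens: [] };
const foldOn: FoldConfigSketch = { foldTokens: ["PlainKey", "PlainScalar"] };

console.log(continuationType(foldOff));               // null
console.log(continuationType(foldOn));                // "PlainScalar"
console.log(participatesInFold("PlainKey", foldOn));  // true
console.log(participatesInFold("PlainKey", foldOff)); // false
```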
(C) `keyValueSeparator` was honored by gen-tm but the lexer hardcoded `:`
(and `-`/`?`) in its key-line sniffs, a latent parser/highlighter split.
Route every lexer key-separator sniff through `indent.keyValueSeparator`
(via a shared `keySepAt` helper) and every compact-indicator sniff
through `compactIndicators`, so the lexer and gen-tm share one source of
truth for the separator, whatever its configured value.
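
To illustrate (C), here is one way a `keySepAt`-style sniff might read the configured separator instead of a literal `:`. The signature shown is an assumption (the commit names the helper but not its shape), and `SepConfigSketch` and `findKeySep` are hypothetical; the real sniff presumably applies further context checks.

```typescript
// Hypothetical config shape carrying only the field discussed in (C).
interface SepConfigSketch {
  keyValueSeparator: string;
}

// Assumed signature: true when the configured separator starts at `pos`.
function keySepAt(line: string, pos: number, cfg: SepConfigSketch): boolean {
  const sep = cfg.keyValueSeparator;
  return sep.length > 0 && line.startsWith(sep, pos);
}

// Index of the first key separator on a line, or -1 if there is none.
function findKeySep(line: string, cfg: SepConfigSketch): number {
  for (let i = 0; i < line.length; i++) {
    if (keySepAt(line, i, cfg)) return i;
  }
  return -1;
}

const colonCfg: SepConfigSketch = { keyValueSeparator: ":" };
const equalsCfg: SepConfigSketch = { keyValueSeparator: "=" };

console.log(findKeySep("name: value", colonCfg));   // 4
console.log(findKeySep("name = value", equalsCfg)); // 5
console.log(findKeySep("name: value", equalsCfg));  // -1
```

Because every sniff goes through the one helper, changing `keyValueSeparator` changes what the lexer treats as structural without touching the call sites.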
Deferred: (D) the §6.1 tab-in-indentation errors and the value/item-position
classification (seq-item `-` vs explicit-key `?`) still hardcode a few YAML
indicators; cleanly splitting them needs `startsBlockStructuralNode`'s
property/flow/alias indicator set parameterized — a larger sub-task, noted
in-code at each site.
yaml.ts opts in field-by-field (flowSeparatorAfterTokens + foldTokens) and
tokenizes byte-identically: `npm run gen` produces zero generated-file diff
across yaml + ts/js/jsx/tsx/html. test/indent-extensions.ts gains toy
non-YAML grammars proving each un-overload (a `string:true` token that keeps
its `:name` after values; a `blockPattern` token that does not fold; a
`keyValueSeparator:'='` grammar whose lexer treats `=` as structural).
79711e8 to 489343e
Closes #44.

Decouples the generic indent core from YAML-specific behavior that previously rode on general-purpose token flags and hardcoded literals, so a non-YAML indentation grammar no longer inherits YAML semantics it never asked for. Each behavior is now an explicit, mode-neutral `IndentConfig` field that defaults to off; `yaml.ts` opts in field-by-field and tokenizes byte-identically.

## What changed

- `flowSeparatorAfterTokens: string[]` (replaces `flowColonSeparator: boolean`): explicit membership, by token type, for the flow `:` key/value separator carve-out, instead of deriving it from the `string` flag. A grammar can now flag a token `string: true` (for string-region scoping / auto-close delimiter derivation) without being dragged into separator emission. Flow-close delimiters are always part of the carve-out; off unless a token is named.
- `foldTokens: string[]`: explicit membership for plain-scalar continuation folding, instead of deriving it from `blockPattern`. A token can carry a `blockPattern` without inheriting YAML's plain-scalar fold. Off unless declared.
- `keyValueSeparator` is now the single source of truth for the separator glyph in both the lexer's key-line sniffs and the highlighter. They previously disagreed (the lexer hardcoded `:`, `gen-tm` read the field), a latent parser↔highlighter split. Compact-indicator sniffs route through `compactIndicators` likewise.

## Breaking change

`flowColonSeparator` (added in #41) is removed. The carve-out is now off by default, so a grammar that set `flowColonSeparator: false` simply drops the field; a grammar that relied on the YAML carve-out names its quoted-key tokens in `flowSeparatorAfterTokens`. A migration for the one known adopter (NMBL) accompanies this change.

## Proof

- `npm run gen` produces zero generated-file diff: yaml + ts/js/jsx/tsx/html highlighters/parsers are byte-identical.
- `tm-diagnostics` (it requires the `vendor/RedCMD` submodule and throws before tokenizing).
- `core/indent-extensions` is extended with toy non-YAML grammars that assert on the actual token stream: a `string: true` token keeps its `:name` after a value (no auto-enlist), a `blockPattern` token does not fold, and a non-`:` `keyValueSeparator` is honored by the lexer.

## Deferred

The `§6.1` tab-in-indentation errors and the value/item-position classification still hardcode a few YAML indicators (`&`/`!`/`[`/`{`/`*`, sequence `-`, explicit-key `?`); a clean split needs the indicator set parameterized, noted in #44 as the larger sub-task.
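
For the removal of `flowColonSeparator` described under the breaking change, a config migration might look roughly like the sketch below. It is not the NMBL migration that accompanies the change. `LegacyConfig`, `MigratedConfig`, `migrateFlowSeparator`, and the token name `QuotedKey` are hypothetical, and the sketch assumes the old boolean was on unless explicitly set to `false`.

```typescript
// Hypothetical pre-change shape: the boolean added in #41.
interface LegacyConfig {
  flowColonSeparator?: boolean;
}

// Hypothetical post-change shape: explicit membership by token type.
interface MigratedConfig {
  flowSeparatorAfterTokens: string[];
}

// Maps the removed boolean onto the new list.
// Assumption: the old carve-out applied unless the field was explicitly `false`.
function migrateFlowSeparator(
  old: LegacyConfig,
  quotedKeyTokens: string[],
): MigratedConfig {
  const reliedOnCarveOut = old.flowColonSeparator !== false;
  return {
    // An opted-out grammar drops the field (empty list); others name their tokens.
    flowSeparatorAfterTokens: reliedOnCarveOut ? [...quotedKeyTokens] : [],
  };
}

const optedOut = migrateFlowSeparator({ flowColonSeparator: false }, ["QuotedKey"]);
const reliedOn = migrateFlowSeparator({}, ["QuotedKey"]);

console.log(optedOut.flowSeparatorAfterTokens); // []
console.log(reliedOn.flowSeparatorAfterTokens); // ["QuotedKey"]
```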