fix: resolve Slack directives in section/mrkdwn (verbatim + non-verbatim) #67
Open
programad wants to merge 3 commits into
…batim)
Slack directives like `<@u123>`, `<#C123>`, `<!subteam^S1|@team>`, `<!channel|here|everyone>`, and `<!date^…>` now fire the matching hooks (`hooks.user`, `hooks.channel`, `hooks.usergroup`, `hooks.atHere` / `atChannel` / `atEveryone`, `hooks.date`) in both `verbatim: true` and `verbatim: false` modes, matching the behaviour of the rich_text path.
- Verbatim mode no longer early-returns. It splits the input by directive boundaries and renders each segment, preserving non-directive text literally (no markdown sugar expansion; that's the point of verbatim).
- Non-verbatim mode now masks fenced code, inline code, and directive atoms before the URL-rewrite / asterisk-doubling regex pass, then restores them before Yozora parses. Directives can't be mangled by the pre-pass, and directive-shaped text inside code stays literal.
- Broadcast tokenizer recognises `<!here>` / `<!everyone>` / `<!channel>` natively, alongside the existing `@here` / `@everyone` / `@channel`.
- Yozora's `autolink` and `autolink-extension` are unmounted; bare URL autolinking is handled by the existing `<X>` / `<X|Y>` regex rewrite, and the autolink-extension was stealing directives like `<!subteam^S1|@team>` because of the embedded `@`.
- mrkdwn sub-element hook payloads now include `style: undefined` to match the rich_text path shape (`{ id, name, style }`).
- `<@U…|fallback>` and `<!subteam^S…|@team>` now split on `|` for data lookup + fallback display name (channel mention already did this).
- `&amp;` is decoded alongside `&gt;` / `&lt;` in text-object payloads so escaped ampersands don't leak into hrefs or visible text.
Adds a vitest test suite (the repo previously had none) covering the directive × verbatim matrix, code-span suppression, escaped entities, hook payload shape, data-map resolution, and non-directive regression.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
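The mask / transform / restore idea in the commit above can be sketched roughly as follows. This is a minimal illustration, not the library's code: the names `mask`, `transform`, `restore`, and `preparse`, the `PROTECTED` pattern, and the NUL-delimited placeholder format are all assumptions, and `transform` is only a stand-in for the real URL-rewrite / asterisk-doubling pass.

```typescript
// Hypothetical sketch: every name and pattern here is an illustrative assumption.
// Matches fenced code, inline code, and Slack directive atoms (<@...>, <#...>, <!...>).
const PROTECTED = /`{3}[\s\S]*?`{3}|`[^`\n]+`|<[@#!][^>\n]*>/g;

// Replace each protected span with a placeholder and remember the original text.
function mask(input: string): { text: string; atoms: string[] } {
  const atoms: string[] = [];
  const text = input.replace(PROTECTED, (match) => {
    atoms.push(match);
    return `\u0000${atoms.length - 1}\u0000`; // NUL-delimited index placeholder
  });
  return { text, atoms };
}

// Put the original spans back in place of their placeholders.
function restore(text: string, atoms: string[]): string {
  return text.replace(/\u0000(\d+)\u0000/g, (_m, i: string) => atoms[Number(i)] ?? "");
}

// Stand-in for the regex pre-pass: rewrite <url|label> and <url> to markdown
// links, then double every asterisk so Slack-style *bold* becomes **bold**.
function transform(text: string): string {
  return text
    .replace(/<(https?:\/\/[^|>\s]+)\|([^>]+)>/g, "[$2]($1)")
    .replace(/<(https?:\/\/[^|>\s]+)>/g, "[$1]($1)")
    .replace(/\*/g, "**");
}

// The pre-pass only ever sees text with code and directives masked out.
function preparse(input: string): string {
  const { text, atoms } = mask(input);
  return restore(transform(text), atoms);
}
```

Because the regex pass never sees the masked spans, a directive or a code span cannot be rewritten by accident. The sketch assumes the input contains no NUL characters; a real implementation would need a placeholder that cannot collide with user text.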
…avior
Empirically tested in Slack's section/mrkdwn renderer (side-by-side
Block Kit Builder, both verbatim modes). Findings:
- Verbatim:true does NOT suppress markdown formatting (`*bold*`,
`_italic_`, `~strike~`, `` `code` ``) — Slack renders these the same
in both modes.
- Verbatim:true does NOT suppress code spans or angle-bracket URLs.
- The ONE thing verbatim:true does suppress in section/mrkdwn (that
affects this library) is bare-form `@here` / `@channel` / `@everyone`
(without the `<!…>` brackets). Slack interpolates them as chips in
verbatim:false and renders them as plain text in verbatim:true.
Implementation changes:
- Drop the separate "split by directive boundaries, render rest as
literal text" verbatim path. Both modes now flow through the same
pipeline.
- Build two yozora parser instances. The verbatim instance constructs
SlackBroadcastTokenizer with `matchTypedBroadcast: false` so bare
`@here` etc. stay as plain text. Bracket-form `<!here>` etc. still
resolve in both modes.
- Delete `directives.tsx` (no longer needed). Inline DIRECTIVE_PATTERN
helpers into `preparse.ts`.
Tests:
- Flip the two tests that asserted the wrong verbatim behavior
("does NOT bold *text* in verbatim" → "renders *bold* in verbatim
(matches Slack)", same for `<URL>` autolinking).
- Rename code-span suppression block to call out the known divergence
(Slack resolves directives inside code spans; this library doesn't —
tracked as a follow-up since it needs custom tokenization).
- Add per-broadcast-target tests for the new typed-broadcast suppression.
- Add a DOM-equality test confirming verbatim and non-verbatim produce
identical output for non-typed-broadcast content.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
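The effect of the `matchTypedBroadcast` flag described above can be illustrated with a small standalone matcher. This is a sketch under assumptions: `findBroadcasts` and its regexes are hypothetical and do not use yozora's tokenizer API; only the option name `matchTypedBroadcast` comes from the commit message.

```typescript
// Hypothetical standalone matcher; not the real SlackBroadcastTokenizer.
type BroadcastTarget = "here" | "channel" | "everyone";

interface BroadcastOptions {
  // When false, bare "@here" style text is left alone (the verbatim: true case).
  matchTypedBroadcast: boolean;
}

function findBroadcasts(text: string, opts: BroadcastOptions): BroadcastTarget[] {
  // Bracket form, e.g. <!here> or <!here|label>: matched in both modes.
  const bracket = "<!(here|channel|everyone)(?:\\|[^>]*)?>";
  // Typed form, e.g. @here, not preceded by a word character, "<" or "!".
  const typed = "(?<![\\w<!])@(here|channel|everyone)\\b";
  const pattern = new RegExp(
    opts.matchTypedBroadcast ? `${bracket}|${typed}` : bracket,
    "g",
  );

  const found: BroadcastTarget[] = [];
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(text)) !== null) {
    // Group 1 is set by the bracket form, group 2 by the typed form.
    found.push((m[1] ?? m[2]) as BroadcastTarget);
  }
  return found;
}
```

With `matchTypedBroadcast: true` the input `ping @here and <!channel>` yields both targets; with `matchTypedBroadcast: false` only the bracket-form `<!channel>` is reported, which mirrors the two parser instances described in the commit.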
User-typed text in mrkdwn payloads contains Slack-escaped `<`, `>`, `&` chars as `&lt;`, `&gt;`, `&amp;`. Decoding these in `text_object.tsx` BEFORE markdown_parser was incorrect: a literal `<@u123>` typed by a user (delivered as `&lt;@u123&gt;`) decoded to `<@u123>` and got tokenized as a real user mention, firing `hooks.user` and producing a chip when Slack itself would render literal `<@u123>` text.
Slack's own renderer tokenizes the raw payload first and decodes entities only at render time (confirmed empirically in section/mrkdwn side-by-side). This change matches that:
- Remove entity decoding from text_object.tsx.
- Decode in the leaf renderers that emit user-visible text: Text, HTML, InlineCode (and its plain-code path), the fenced Code element, and Link (for the href; label children go through Text).
- New tests cover escaped `&lt;@u123&gt;`, `&lt;#C123&gt;`, `&lt;!here&gt;` staying literal, `&amp;` `&lt;` `&gt;` decoding in plain text, and entity decoding inside inline and fenced code.
The `&gt;` → "> " trailing-space hack (which existed to make blockquote detection survive entity escaping) is removed. Slack itself does not render blockquotes in section/mrkdwn, so user-typed `> quote` (delivered as `&gt; quote`) no longer renders as a blockquote, which better matches Slack's behavior.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
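The ordering described in this commit (tokenize the raw payload first, decode entities only in the leaf that emits text) can be sketched as below. It is a simplified illustration, not the library's renderer: the `MrkdwnNode` type, `tokenize`, `decodeEntities`, and `render` are hypothetical names, only user mentions are tokenized, and output is a plain string rather than React elements.

```typescript
// Hypothetical, simplified model of "tokenize first, decode at render time".
type MrkdwnNode =
  | { type: "user"; id: string }
  | { type: "text"; value: string };

// Decode Slack's three escapes. "&amp;" goes last so "&amp;lt;" becomes the
// literal text "&lt;" rather than "<".
function decodeEntities(s: string): string {
  return s.replace(/&lt;/g, "<").replace(/&gt;/g, ">").replace(/&amp;/g, "&");
}

// Tokenize the RAW payload: only real "<@ID>" atoms become user nodes.
function tokenize(raw: string): MrkdwnNode[] {
  const nodes: MrkdwnNode[] = [];
  const re = /<@([A-Z0-9]+)>/g;
  let last = 0;
  let m: RegExpExecArray | null;
  while ((m = re.exec(raw)) !== null) {
    if (m.index > last) nodes.push({ type: "text", value: raw.slice(last, m.index) });
    nodes.push({ type: "user", id: m[1] ?? "" });
    last = re.lastIndex;
  }
  if (last < raw.length) nodes.push({ type: "text", value: raw.slice(last) });
  return nodes;
}

// Leaf rendering: entity decoding happens here, after tokenization.
function render(raw: string): string {
  return tokenize(raw)
    .map((n) => (n.type === "user" ? `[user:${n.id}]` : decodeEntities(n.value)))
    .join("");
}
```

With this ordering, `&lt;@U456&gt;` never reaches the tokenizer in its decoded form, so it renders as the literal text `<@U456>` while a real `<@U123>` still becomes a mention.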
Bug
`section/mrkdwn` text and other mrkdwn-typed text didn't resolve Slack-formed directive atoms through their matching hooks:
- `verbatim: true`: the parser early-returned the raw text wrapped in a `<div>`, so `<@U123>` etc. rendered as literal text and no hooks fired.
- `verbatim: false`: the existing yozora directive tokenizers mostly fired, but only because the URL-rewrite regex pass was incidentally protected by `isValidURL()` (directive strings aren't URLs). The protection was fragile, the regex pass corrupted directive-shaped content inside code spans, and the mrkdwn hook payload shape (`{ id, name }`) didn't match the rich_text path's shape (`{ id, name, style }`).
- `text_object.tsx` decoded `&lt;` / `&gt;` / `&amp;` before invoking the parser, so user-typed `<@U123>` (delivered as `&lt;@U123&gt;`) was decoded to a real-looking `<@U123>` and then resolved as a user mention, contradicting Slack's renderer, which keeps escaped sequences literal.
Fix
Both `verbatim: true` and `verbatim: false` now flow through a single mask-transform-restore pipeline. Fenced code, inline code, and directive atoms are masked into placeholders before the URL-rewrite / asterisk-doubling regex pass, then restored before yozora parses. The hook payloads now match the rich_text path's shape.
Specific changes:
- The broadcast tokenizer recognises `<!here>` / `<!everyone>` / `<!channel>` natively (in addition to `@here` / `@everyone` / `@channel`).
- Yozora's `autolink` and `autolink-extension` tokenizers are unmounted. Bare URL autolinking is already handled by the existing `<X>` / `<X|Y>` regex rewrite, and `autolink-extension` was stealing directives like `<!subteam^S1|@team>` because of the embedded `@`.
- mrkdwn sub-element hook payloads now include `style: undefined` so the shape matches the rich_text path's `{ id, name, style }`.
- `<@U…|fallback>` and `<!subteam^S…|@team>` now split on `|` for data lookup + fallback label (channel mention already did this).
- Entity decoding (`&lt;` / `&gt;` / `&amp;`) is deferred from input pre-processing to the leaf renderers (`Text`, `HTML`, `InlineCode`, fenced `Code`, `Link`). User-typed text like `&lt;@U123&gt;` now survives tokenization as literal entities and decodes to plain `<@U123>` text at render, with no false directive match.
`verbatim`: empirically tested in Slack
The original task spec implied
`verbatim: true` should suppress markdown formatting, bare URL autolinking, and typed `@channel`-style strings, but preserve structured directives. I tested side-by-side in Slack's section/mrkdwn renderer. Inputs compared across both modes:
- Structured directives (`<@U…>`, `<#C…>`, `<!subteam^…>`, `<!here>`, `<!date^…>`)
- Markdown formatting (`*bold*`, `_italic_`, `~strike~`, `` `code` ``)
- Angle-bracket URLs (`<URL>`, `<URL|label>`)
- Directive inside inline code (`` `<@U…>` ``)
- Directive inside fenced code (```` ```\n<@U…>\n``` ````)
- Escaped directive (`&lt;@U123&gt;` in payload): literal `<@U123>` in both modes
- Bare URL (`https://example.com`)
- Typed broadcasts (`@here`, `@channel`, `@everyone`)
- Typed channel name (`#general`)
- Typed user name (`@somebody`)
- Emoji shortcode (`:eyes:`)
- Nested formatting (`*bold _and italic_ here*`)
Net effect for this library: `verbatim: true` only behaves differently from `verbatim: false` in one way that matters here: typed bare `@here` / `@channel` / `@everyone` should render as plain text. (Bare URL autolinking is the other Slack difference, but this library doesn't autolink bare URLs in either mode.) Implementation: two yozora parser instances; the verbatim instance constructs `SlackBroadcastTokenizer` with `matchTypedBroadcast: false`. Bracket-form `<!here>` still resolves in both modes.
Known divergences from Slack (tracked separately)
Slack diverges from CommonMark in a few places where this library still follows CommonMark. These are out of scope for this PR: each needs its own behavioral discussion and shouldn't ride along.
- Directives inside code spans: Slack resolves `` `<@U123>` `` to a chip rendered inside the code-styled span; this library leaves it literal. Needs a custom inline-code tokenizer or an AST walk that splits `inlineCode` node values.
- Blockquotes: Slack does not render a leading `>` as a blockquote in section blocks; this library renders it as a blockquote. Pre-existing, not introduced by this PR. (One adjacent effect of moving entity decoding to render time: user-typed `> quote` delivered as `&gt; quote` no longer triggers a blockquote either, which actually aligns better with Slack.)
Breaking-change risk
None expected. The change only adds behavior:
- The `hooks.user` / `channel` / `usergroup` types already declare `style?` as optional, so adding `style: undefined` is type-compatible.
- Existing markdown (`*bold*`, `_italic_`, blockquotes, code spans, lists) renders exactly as before in `verbatim: false`.
- In `verbatim: true`, the new behavior is strictly more correct against Slack's own renderer.
- Typing `<@U123>` literally no longer produces a false mention chip.
Tests
The repo had no test framework. Added vitest + `@testing-library/react` + `happy-dom`. 50 tests covering:
- Directive resolution in both verbatim modes, including the `|fallback` forms.
- Hook payload shape (`{ id, name, style }`).
- Typed `@here` / `@channel` / `@everyone` fire hooks in `verbatim: false` but not in `verbatim: true`; bracket-form `<!here>` always fires.
- Escaped entities: `&amp;` decoding in link URLs, escaped `&lt;@U123&gt;` / `&lt;#C123&gt;` / `&lt;!here&gt;` stay literal, `&amp;` / `&lt;` / `&gt;` decode in plain text, and decoding works inside inline and fenced code.
Run with `pnpm test`. `pnpm run lint` (tsc) and `pnpm run build` also pass.
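For reviewers who want a concrete picture of the `|fallback` split and the `{ id, name, style }` payload shape mentioned under Fix, here is a minimal sketch. The function `parseUserMention`, the `MentionPayload` type, and the plain `users` lookup map are assumptions made for illustration; the PR's actual data-map and hook signatures may differ.

```typescript
// Hypothetical helper; the real hook payload construction may differ.
interface MentionPayload {
  id: string;
  name: string | undefined;
  style: undefined; // always present, to mirror the rich_text payload shape
}

// Split "<@ID|fallback>" into the ID used for data lookup and the fallback
// label used when the data map has no entry for that ID.
function parseUserMention(
  atom: string,
  users: Record<string, string>,
): MentionPayload | null {
  const m = /^<@([^|>]+)(?:\|([^>]*))?>$/.exec(atom);
  if (m === null) return null;
  const id = m[1] ?? "";
  const fallback = m[2]; // undefined when the atom has no "|label" part
  return { id, name: users[id] ?? fallback, style: undefined };
}
```

In this sketch a data-map hit takes precedence over the fallback label, and an atom with neither produces `name: undefined`. That precedence is a design assumption of the example rather than something the PR text states.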