fix: resolve Slack directives in section/mrkdwn (verbatim + non-verbatim) #67

Open
programad wants to merge 3 commits into themashcodee:main from programad:tirana-v1

@programad programad commented May 19, 2026

Bug

section/mrkdwn text and other mrkdwn-typed text didn't resolve Slack-formed directive atoms through their matching hooks:

  • verbatim: true — the parser early-returned the raw text wrapped in a <div>, so <@U123> etc. rendered as literal text and no hooks fired.
  • verbatim: false — the existing yozora directive tokenizers mostly fired today, but only because the URL-rewrite regex pass was incidentally protected by isValidURL() (directive strings aren't URLs). The protection was fragile, the regex pass corrupted directive-shaped content inside code spans, and the mrkdwn hook payload shape ({ id, name }) didn't match the rich_text path's shape ({ id, name, style }).
  • Escaped entities: text_object.tsx decoded &lt; / &gt; / &amp; before invoking the parser, so user-typed <@U123> (delivered as &lt;@U123&gt;) was decoded to a real-looking <@U123> and then resolved as a user mention. This contradicts Slack's renderer, which keeps escaped sequences literal.

Fix

Both verbatim: true and verbatim: false now flow through a single mask-transform-restore pipeline. Fenced code, inline code, and directive atoms are masked into placeholders before the URL-rewrite / asterisk-doubling regex pass, then restored before yozora parses. The hook payloads now match the rich_text path's shape.
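
As a rough illustration of the mask-transform-restore idea (a minimal sketch, not the code in this PR), the snippet below masks fenced code, inline code, and directive atoms, runs a stand-in regex pre-pass, and then restores the atoms. The names `maskAtoms`, `restoreAtoms`, `transform`, and `prepare`, the patterns, and the private-use placeholder characters are all assumptions made for the example.

```typescript
// Hypothetical sketch: names and patterns are illustrative, not the library's.
type Masked = { text: string; atoms: string[] };

// Order matters: fenced code first, then inline code, then directive atoms
// such as <@U123>, <#C123>, or <!here>.
const ATOM_PATTERN = /```[\s\S]*?```|`[^`\n]+`|<(?:@|#|!)[^<>\n]+>/g;

// Private-use characters are assumed not to occur in real payloads.
const OPEN = "\uE000";
const CLOSE = "\uE001";

function maskAtoms(input: string): Masked {
  const atoms: string[] = [];
  const text = input.replace(ATOM_PATTERN, (match) => {
    atoms.push(match);
    return `${OPEN}${atoms.length - 1}${CLOSE}`;
  });
  return { text, atoms };
}

function restoreAtoms(text: string, atoms: string[]): string {
  const placeholder = new RegExp(`${OPEN}(\\d+)${CLOSE}`, "g");
  return text.replace(placeholder, (_, index: string) => atoms[Number(index)] ?? "");
}

// Stand-in for the regex pre-pass: rewrites <URL|label> into a markdown link
// and doubles single asterisks so a CommonMark parser treats them as bold.
function transform(text: string): string {
  return text
    .replace(/<(https?:\/\/[^|>\s]+)\|([^>]+)>/g, "[$2]($1)")
    .replace(/\*([^*\n]+)\*/g, "**$1**");
}

function prepare(input: string): string {
  const { text, atoms } = maskAtoms(input);
  return restoreAtoms(transform(text), atoms);
}
```

Because the pre-pass only ever sees placeholders, asterisks and directive-shaped text inside code spans pass through untouched, and the directive atoms reach the parser in their original form.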

Specific changes:

  • The broadcast tokenizer now matches <!here> / <!everyone> / <!channel> natively (in addition to @here / @everyone / @channel).
  • Yozora's autolink and autolink-extension tokenizers are unmounted. Bare URL autolinking is already handled by the existing <X> / <X|Y> regex rewrite, and autolink-extension was stealing directives like <!subteam^S1|@team> because of the embedded @.
  • mrkdwn sub-element hook payloads now include style: undefined so the shape matches the rich_text path's { id, name, style }.
  • <@U…|fallback> and <!subteam^S…|@team> now split on | for data lookup + fallback label (channel mention already did this).
  • HTML entity decoding (&lt; / &gt; / &amp;) is deferred from input pre-processing to the leaf renderers (Text, HTML, InlineCode, fenced Code, Link). User-typed text like &lt;@U123&gt; now survives tokenization as literal entities and decodes to plain <@U123> text at render — no false directive match.
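
The last three bullets can be sketched together: tokenize the raw payload first, split `<@U…|fallback>` on `|`, emit a `{ id, name, style }` payload, and decode entities only in the text leaves. This is a minimal sketch under stated assumptions, not the library's tokenizer. `tokenizeMentions`, `decodeEntities`, the `Segment` type, the ID pattern, and the label precedence (resolved name, then fallback, then raw id) are all hypothetical.

```typescript
// Hypothetical sketch: types and helper names are illustrative only.
type UserPayload = { id: string; name: string; style: undefined };
type Segment =
  | { type: "text"; value: string }
  | { type: "user"; payload: UserPayload; label: string };

function decodeEntities(value: string): string {
  // &amp; is decoded last so that "&amp;lt;" becomes "&lt;" rather than "<".
  return value.replace(/&lt;/g, "<").replace(/&gt;/g, ">").replace(/&amp;/g, "&");
}

function tokenizeMentions(
  raw: string,
  users: { [id: string]: string | undefined },
): Segment[] {
  // Matches <@U123> and <@U123|fallback>. Escaped input such as
  // &lt;@U123&gt; contains no real "<", so it can never match here.
  const mention = /<@([A-Z0-9]+)(?:\|([^>]*))?>/g;
  const segments: Segment[] = [];
  let cursor = 0;
  let match: RegExpExecArray | null;
  while ((match = mention.exec(raw)) !== null) {
    if (match.index > cursor) {
      // Text leaf: entities are decoded only now, after tokenization.
      segments.push({ type: "text", value: decodeEntities(raw.slice(cursor, match.index)) });
    }
    const id = match[1] ?? "";
    const fallback: string | undefined = match[2];
    const resolved = users[id];
    segments.push({
      type: "user",
      // Same shape as the rich_text path: { id, name, style }.
      payload: { id, name: resolved ?? id, style: undefined },
      // Assumed precedence: data-map name, then |fallback, then raw id.
      label: resolved ?? fallback ?? id,
    });
    cursor = mention.lastIndex;
  }
  if (cursor < raw.length) {
    segments.push({ type: "text", value: decodeEntities(raw.slice(cursor)) });
  }
  return segments;
}
```

With this ordering, `tokenizeMentions("hi <@U123|bob> &lt;@U999&gt;", { U123: "alice" })` yields one user segment for `U123` and leaves the escaped mention as the plain text `<@U999>`.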

verbatim — empirically tested in Slack

The original task spec implied verbatim: true should suppress markdown formatting, bare URL autolinking, and typed @channel-style strings, but preserve structured directives. I tested side-by-side in Slack's section/mrkdwn renderer:

| Input line | verbatim: true | verbatim: false |
| --- | --- | --- |
| structured directives (<@U…>, <#C…>, <!subteam^…>, <!here>, <!date^…>) | resolved | resolved |
| markdown sugar (*bold*, _italic_, ~strike~, `code`) | formatted | formatted |
| angle-bracket URLs (<URL>, <URL\|label>) | link | link |
| directive inside inline code (`<@U…>`) | resolved (chip rendered inside the code span) | resolved |
| directive inside fenced code (```\n<@U…>\n```) | resolved | resolved |
| escaped entities (&lt;@U123&gt; in payload) | literal <@U123> | literal <@U123> |
| bare URL (https://example.com) | plain text | autolinked |
| typed broadcast (@here, @channel, @everyone) | plain text | chip |
| typed channel (#general) | red highlight | red highlight |
| typed user (@somebody) | plain text | plain text |
| emoji (:eyes:) | rendered | rendered |
| nested formatting (*bold _and italic_ here*) | both | both |

Net effect for this library: verbatim: true only behaves differently from verbatim: false in one way that matters here — typed bare @here / @channel / @everyone should render as plain text. (Bare URL autolinking is the other Slack difference, but this library doesn't autolink bare URLs in either mode.) Implementation: two yozora parser instances; the verbatim instance constructs SlackBroadcastTokenizer with matchTypedBroadcast: false. Bracket-form <!here> still resolves in both modes.
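
The mode split can be pictured with a small matcher that takes the same option. Only the option name `matchTypedBroadcast` comes from this PR. The function `findBroadcasts`, its return type, and the patterns are hypothetical and much simpler than a real yozora tokenizer.

```typescript
// Hypothetical sketch: a plain regex matcher standing in for the tokenizer.
type BroadcastTarget = "here" | "channel" | "everyone";
type BroadcastMatch = { target: BroadcastTarget; start: number; end: number };

function findBroadcasts(
  text: string,
  options: { matchTypedBroadcast: boolean },
): BroadcastMatch[] {
  // Bracket form (<!here>) always matches; typed form (@here) is optional.
  const pattern = options.matchTypedBroadcast
    ? /<!(here|channel|everyone)>|@(here|channel|everyone)\b/g
    : /<!(here|channel|everyone)>/g;
  const matches: BroadcastMatch[] = [];
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(text)) !== null) {
    // Group 1 is set for the bracket form, group 2 for the typed form.
    const target = (m[1] ?? m[2]) as BroadcastTarget;
    matches.push({ target, start: m.index, end: pattern.lastIndex });
  }
  return matches;
}

// One configuration per mode, mirroring the two parser instances.
const nonVerbatimOptions = { matchTypedBroadcast: true };
const verbatimOptions = { matchTypedBroadcast: false };
```

For the input `@here and <!channel>`, the non-verbatim options match both broadcasts, while the verbatim options match only the bracket-form `<!channel>`.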

Known divergences from Slack (tracked separately)

Slack diverges from CommonMark in a few places where this library still follows CommonMark. Out of scope for this PR — they each need their own behavioral discussion and shouldn't ride along:

  • Directives inside inline code — Slack resolves `<@U123>` to a chip rendered inside the code-styled span; this library leaves it literal. Needs a custom inline-code tokenizer or an AST-walk that splits inlineCode node values.
  • Directives inside fenced code — same story, resolves in Slack but stays literal here.
  • Blockquote in section/mrkdwn — Slack ignores leading > in section blocks; this library renders it as a blockquote. Pre-existing, not introduced by this PR. (One adjacent effect of moving entity decoding to render time: user-typed > quote delivered as &gt; quote no longer triggers a blockquote either, which actually aligns better with Slack.)

Breaking-change risk

None expected. The new behavior is strictly additive:

  • The hooks.user / channel / usergroup types already declare style? as optional, so adding style: undefined is type-compatible.
  • Non-directive text (URLs, *bold*, _italic_, blockquotes, code spans, lists) renders exactly as before in verbatim: false.
  • In verbatim: true, the new behavior is strictly more correct against Slack's own renderer.
  • Escaped-entity rendering now matches Slack — a user typing <@U123> literally no longer produces a false mention chip.

Tests

The repo had no test framework. Added vitest + @testing-library/react + happy-dom. 50 tests covering:

  • Per-directive × verbatim-mode matrix for user, channel, usergroup, broadcast (bracket form), date — bare and |fallback forms.
  • Hook payload shape parity with rich_text path ({ id, name, style }).
  • Data-map resolution (resolved name when id is known; raw id otherwise).
  • Typed-broadcast suppression: @here / @channel / @everyone fire hooks in verbatim:false but not in verbatim:true; bracket-form <!here> always fires.
  • DOM-equality test: verbatim:true and verbatim:false produce identical output for non-typed-broadcast content.
  • Code-span suppression — flagged as "known divergence from Slack" in the test description so a future reader knows it's deliberate.
  • Escaped entities — &amp; decoding in link URLs, escaped <@U123> / <#C123> / <!here> stay literal, &amp; / &lt; / &gt; decode in plain text, and decoding works inside inline and fenced code.
  • Regression — bold, italic, inline code, angle-bracket URLs.

Run with pnpm test. pnpm run lint (tsc) and pnpm run build also pass.

…batim)

Slack directives like `<@U123>`, `<#C123>`, `<!subteam^S1|@team>`,
`<!channel|here|everyone>`, and `<!date^…>` now fire the matching hooks
(`hooks.user`, `hooks.channel`, `hooks.usergroup`, `hooks.atHere` /
`atChannel` / `atEveryone`, `hooks.date`) in both `verbatim: true` and
`verbatim: false` modes — matching the behaviour of the rich_text path.

- Verbatim mode no longer early-returns. It splits the input by directive
  boundaries and renders each segment, preserving non-directive text
  literally (no markdown sugar expansion — that's the point of verbatim).
- Non-verbatim mode now masks fenced code, inline code, and directive
  atoms before the URL-rewrite / asterisk-doubling regex pass, then
  restores them before Yozora parses — so directives can't be mangled
  by the pre-pass and directive-shaped text inside code stays literal.
- Broadcast tokenizer recognises `<!here>` / `<!everyone>` / `<!channel>`
  natively, alongside the existing `@here` / `@everyone` / `@channel`.
- Yozora's `autolink` and `autolink-extension` are unmounted; bare URL
  autolinking is handled by the existing `<X>` / `<X|Y>` regex rewrite,
  and the autolink-extension was stealing directives like
  `<!subteam^S1|@team>` because of the embedded `@`.
- mrkdwn sub-element hook payloads now include `style: undefined` to
  match the rich_text path shape (`{ id, name, style }`).
- `<@U…|fallback>` and `<!subteam^S…|@team>` now split on `|` for data
  lookup + fallback display name (channel mention already did this).
- `&amp;` is decoded alongside `&gt;` / `&lt;` in text-object payloads
  so escaped ampersands don't leak into hrefs or visible text.

Adds a vitest test suite (the repo previously had none) covering the
directive × verbatim matrix, code-span suppression, escaped entities,
hook payload shape, data-map resolution, and non-directive regression.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@programad programad marked this pull request as draft May 19, 2026 20:41
@programad programad marked this pull request as ready for review May 19, 2026 22:21
@programad programad requested a review from themashcodee May 20, 2026 00:59
programad and others added 2 commits May 20, 2026 12:34
…avior

Empirically tested in Slack's section/mrkdwn renderer (side-by-side
Block Kit Builder, both verbatim modes). Findings:

- Verbatim:true does NOT suppress markdown formatting (`*bold*`,
  `_italic_`, `~strike~`, `` `code` ``) — Slack renders these the same
  in both modes.
- Verbatim:true does NOT suppress code spans or angle-bracket URLs.
- The ONE thing verbatim:true does suppress in section/mrkdwn (that
  affects this library) is bare-form `@here` / `@channel` / `@everyone`
  (without the `<!…>` brackets). Slack interpolates them as chips in
  verbatim:false and renders them as plain text in verbatim:true.

Implementation changes:
- Drop the separate "split by directive boundaries, render rest as
  literal text" verbatim path. Both modes now flow through the same
  pipeline.
- Build two yozora parser instances. The verbatim instance constructs
  SlackBroadcastTokenizer with `matchTypedBroadcast: false` so bare
  `@here` etc. stay as plain text. Bracket-form `<!here>` etc. still
  resolve in both modes.
- Delete `directives.tsx` (no longer needed). Inline DIRECTIVE_PATTERN
  helpers into `preparse.ts`.

Tests:
- Flip the two tests that asserted the wrong verbatim behavior
  ("does NOT bold *text* in verbatim" → "renders *bold* in verbatim
  (matches Slack)", same for `<URL>` autolinking).
- Rename code-span suppression block to call out the known divergence
  (Slack resolves directives inside code spans; this library doesn't —
  tracked as a follow-up since it needs custom tokenization).
- Add per-broadcast-target tests for the new typed-broadcast suppression.
- Add a DOM-equality test confirming verbatim and non-verbatim produce
  identical output for non-typed-broadcast content.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
User-typed text in mrkdwn payloads contains Slack-escaped `<`, `>`, `&`
chars as `&lt;`, `&gt;`, `&amp;`. Decoding these in `text_object.tsx`
BEFORE markdown_parser was incorrect: a literal `<@U123>` typed by a
user (delivered as `&lt;@U123&gt;`) decoded to `<@U123>` and got
tokenized as a real user mention, firing `hooks.user` and producing a
chip when Slack itself would render literal `<@U123>` text.

Slack's own renderer tokenizes the raw payload first and decodes
entities only at render time — confirmed empirically in section/mrkdwn
side-by-side. This change matches that:

- Remove entity decoding from text_object.tsx.
- Decode in the leaf renderers that emit user-visible text: Text, HTML,
  InlineCode (and its plain-code path), the fenced Code element, and
  Link (for the href; label children go through Text).
- New tests cover escaped `<@U123>`, `<#C123>`, `<!here>` staying
  literal, `&amp;` `&lt;` `&gt;` decoding in plain text, and entity
  decoding inside inline and fenced code.

The `&gt;` → "> " trailing-space hack (which existed to make blockquote
detection survive entity escaping) is removed. Slack itself does not
render blockquotes in section/mrkdwn, so user-typed `> quote` (delivered
as `&gt; quote`) no longer renders as a blockquote — which better
matches Slack's behavior.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>