feat: agent-controlled reply-to via [[reply_to:message_id]] directive#777
feat: agent-controlled reply-to via [[reply_to:message_id]] directive#777
Conversation
Problem: Agents currently cannot reply to a specific message in a thread. All output is sent as plain messages, losing conversational context in busy multi-bot threads. Solution: Two-layer change enabling agents to specify reply targets: 1. Input: SenderContext now includes message_id, so agents know the ID of each incoming message. 2. Output: Agents can prefix their response with directives: [[reply_to:1502606076451885136]] Actual reply content... OAB parses consecutive [[key:value]] lines as a header block, strips them from content, and uses them for platform-specific message delivery (Discord: message_reference). Design decisions evaluated: - Option A (chosen): Inline directives in output text. Minimal change, no ACP protocol modification, forward-compatible (unknown keys ignored). - Option B (deferred): ACP metadata field. Cleaner but requires protocol change across all backends. Appropriate when more directives are needed (ephemeral, components, attachments). Directive format: - Consecutive [[key:value]] lines at output start = header block - First non-[[...]] line = content begins - Unknown keys silently ignored (forward compatible) - reply_to value must be numeric snowflake (validated) - Extensible: future directives like [[ephemeral:true]] can be added Implementation: - adapter.rs: parse_output_directives() + OutputDirectives struct - adapter.rs: ChatAdapter::send_message_with_reply() (default: fallback) - discord.rs: CreateMessage::reference_message() for reply-to - discord.rs: build_sender_context() includes msg.id - slack.rs, cron.rs, gateway.rs: message_id field added to SenderContext
|
Design is clean — agent-controlled reply-to vs OpenClaw/Hermes' platform-level mode is a meaningful differentiator and the directive format leaves room to grow. A few rough edges worth flagging before merge:
|
F1 (streaming path): reply_to only works in send-once mode. This is acceptable because streaming is disabled in multi-bot threads (where reply-to matters most). Added explanatory comment. F2 (CRLF offset): Fixed parse_output_directives to handle both \n and \r\n line endings correctly instead of assuming +1. F3 (API error fallback): send_message_with_reply now catches Discord API errors (unknown message, cross-channel reference) and falls back to plain send_message instead of propagating the error.
When reply_to directive is present and streaming mode created a placeholder, the placeholder is blanked (zero-width space) and the real content is sent as a new reply message. This ensures reply_to has consistent semantics regardless of streaming mode. Behavior: - Streaming + no reply_to: normal edit-in-place (unchanged) - Streaming + reply_to: blank placeholder, send as reply - Send-once + reply_to: send as reply (unchanged)
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Covers: - Directive format spec ([[key:value]] header block) - reply_to directive usage and behavior - SenderContext.message_id for getting message IDs - Multi-agent use case example - Comparison with OpenClaw/Hermes Agent - Future directives roadmap
8 tests covering: - Normal reply_to parsing - No directives (plain content) - Multiple directives (unknown keys ignored) - Invalid reply_to (non-numeric) rejected - Empty reply_to rejected - CRLF line endings handled correctly - Directive-only output (no content) - Non-directive first line stops parsing
Per review: the three sequential edits were wasteful and caused brief content duplication. Simplified to: 1. Single edit to zero-width space (hide placeholder) 2. Send real content as reply No more flicker or ghost content.
- Discord: tracing::warn on reply_to fallback (was silent) - Test: duplicate reply_to (last wins) - Test: CRLF with multiple directives Total directive tests: 10
Was: numeric-only (Discord snowflake) Now: alphanumeric + dots + hyphens + underscores, max 64 chars This allows: - Discord snowflakes: 1502606076451885136 - Slack ts: 1234567890.123456 - UUID-style: 550e8400-e29b-41d4-a716-446655440000 Rejects: whitespace, control chars, empty, >64 chars Added test: parse_slack_ts_format_accepted Updated test: rejection now checks whitespace (not hyphens)
OpenAB PR ScreeningThis is auto-generated by the OpenAB project-screening flow for context collection and reviewer handoff.
Screening report## IntentPR #777 aims to let agents choose which platform message their response should visually reply to, instead of always sending plain channel messages. The user-visible problem is Discord thread readability in busy multi-agent conversations. When several bots and humans interleave messages, an agent response can lose its conversational target. Native Discord replies would make it clear which earlier message the agent is addressing. FeatThis is a feature. It adds agent-controlled reply routing through an output directive: OpenAB parses directive headers from the agent output, strips them from the delivered message, validates Who It ServesPrimary beneficiaries: Discord end users and multi-agent conversation participants. Secondary beneficiaries: agent authors and maintainers building agent workflows where bots respond to humans and to each other inside the same thread. Rewritten PromptImplement agent-controlled reply targeting for platform messages. Add Add a small output-directive parser that recognizes consecutive leading lines in the form Update the adapter send path so Discord can send a native reply when Merge PitchThis is worth advancing because it solves a real readability issue for multi-agent Discord threads without requiring an ACP protocol change. The core behavior is small, user-visible, and fits OpenAB’s agent-centric model. Risk is moderate. The likely reviewer concern is that inline magic directives mix delivery metadata into natural-language output and may leak if parsing is incomplete. The implementation needs crisp parser tests, clear docs, and conservative platform behavior so unsupported platforms ignore reply routing safely. Best-Practice ComparisonRelevant OpenClaw principles: OpenClaw’s explicit delivery routing is directly relevant. This PR moves OpenAB toward more explicit agent-selected delivery behavior rather than platform-default delivery. Retry/backoff, durable job persistence, isolated executions, and gateway-owned scheduling are not central to this feature. Relevant Hermes Agent principles: Hermes’ self-contained prompts for scheduled tasks are partly relevant: agents need enough context, including Key distinction: OpenClaw and Hermes use platform-level reply modes such as replying to the trigger message. This PR proposes agent-directed reply targeting, which is a better fit for multi-agent collaboration where the correct target may be another bot’s message, not the original human prompt. Implementation OptionsOption 1: Conservative parser-only Discord support Keep the directive parser minimal and support only Option 2: Balanced directive framework Introduce a small structured Option 3: Protocol-level ACP metadata Extend the agent communication protocol so responses can include delivery metadata separately from text content. Adapters consume metadata rather than parsing inline directives. This is cleaner long-term but requires broader protocol changes and backend coordination. Option 4: Platform-config reply mode only Add a config setting similar to OpenClaw or Hermes: never reply, reply to first trigger, or reply to all trigger messages. This is easy to reason about but does not solve the multi-agent targeting problem described by the PR. Comparison Table
RecommendationAdvance with the balanced directive framework. It keeps the current PR’s practical advantage: no ACP protocol change and immediate Discord value. At the same time, it gives reviewers a cleaner merge target by making directive parsing explicit, tested, documented, and isolated from adapter delivery code. Before merge discussion, split the review around three questions: parser correctness, Discord delivery behavior, and sender-context compatibility. Defer ACP metadata until there are multiple delivery metadata fields or more than one platform needs rich reply semantics. |
This comment has been minimized.
This comment has been minimized.
If agent output is directive-only (e.g. just [[reply_to:123]] with no actual content), stripped_content would be empty. Discord rejects empty messages, causing silent failures. Fix: if content is empty/whitespace after stripping, fall back to '_(no response)_' — same behavior as when agent returns no text.
Adds delete_message to ChatAdapter trait (default no-op) and implements it for Discord. Streaming + reply_to path now deletes the placeholder entirely instead of editing to zero-width space. No more ghost empty bubbles in Discord threads.
Per review: default no-op would leave placeholder visible if delete fails or adapter doesn't support it. Default now edits to zero-width space (existing behavior), Discord overrides with real delete.
- Replace is_some() + unwrap() with if let Some(ref reply_id) - Allow clippy::too_many_arguments on build_sender_context (8 params)
Helps agent developers diagnose typos like [[reply-to:...]] vs [[reply_to:...]]. Forward compatible: unknown keys still ignored at runtime, just logged for debugging.
Review — Request ChangesDesign is sound and the regressions flagged by @matilda-vtecturbo and @carly-vtecturbo are genuinely fixed (verified the commits + the new 10 unit tests cover the right edges). One blocker before merge: CI is red on clippy. 🔴 Blocker — 4 clippy errors under
|
This comment has been minimized.
This comment has been minimized.
Review: feat: agent-controlled reply-to via [[reply_to:message_id]] directiveVerdict: REQUEST_CHANGES (1 blocking remaining) Summary此 PR 設計方向正確,測試覆蓋充分,文件齊全。Streaming path 的 race condition 已在後續 commits 修復(確認 FindingsTraffic Light🔴 BLOCKING
🟡 NIT
🟢 INFO
Updated: confirmed streaming fix in |
F1: Send reply first, then delete placeholder. If send fails,
placeholder remains visible (no message loss for user).
F2: Parse directives before markdown::convert_tables. Directives
are meta-layer and should be stripped before content transforms.
This comment has been minimized.
This comment has been minimized.
|
🤖 Test from TitanAgent Erwin — verifying GitHub integration. |
wangyuyan-agent
left a comment
There was a problem hiding this comment.
Confirmed both BLOCKINGs resolved in 7951572:
- Parse order: directives now stripped before
convert_tables✅ - Send-before-delete: reply sent first, placeholder deleted after ✅
LGTM. Approving.
WCHungBlake
left a comment
There was a problem hiding this comment.
TitanAgent team review (Mikasa, Eren, Armin, Hange).
PR is in much better shape than v1 — 18 commits of iterative fixes addressed CRLF, Slack ts validation, double-send, empty-content guard, and more. Three functional issues remain that will affect reply_to reliability in production. Two are blockers; one is strongly recommended before merge.
Must fix before merge
B1. Directive parsing position — broken when tool calls exist
src/adapter.rs:616-631:
let final_content = compose_display(&tool_lines, &text_buf, false, tool_display);
// ... error handling ...
let (directives, stripped_content) = parse_output_directives(&final_content);compose_display prepends tool summary lines (✅ N tool(s)...). When the agent uses any tool, final_content no longer starts with [[reply_to:...]] — the parser sees a non-bracket first line and breaks immediately, silently ignoring directives.
Tool use is the common path for agents, especially in multi-agent scenarios where reply_to matters most. This means reply_to fails in production most of the time.
Fix: parse raw text_buf first, then feed stripped text into compose_display:
let (directives, stripped_text) = parse_output_directives(&text_buf);
let final_content = compose_display(&tool_lines, &stripped_text, false, tool_display);
// ... rest unchangedTest: with non-empty tool_lines and text_buf starting with [[reply_to:123]]\nbody, expect directives.reply_to == Some("123") and final content preserving both tool header and body.
B2. [[X]] without colon silently strips agent content
src/adapter.rs:32-58: when the outer [[...]] matches but split_once(':') returns None, content_start still advances. Lines like [[Note]], [[Summary]], [[Thought]] (legitimate agent content using bracketed annotations) are silently consumed.
Rule:
[[key:value]]is a directive header — strip, even ifkeyis unknown (forward-compat for future directives like[[ephemeral:true]]).[[X]]without:is malformed — break parsing, preserve as content.
Fix: advance content_start only when split_once(':') succeeds:
if let Some(inner) = trimmed.strip_prefix("[[").and_then(|s| s.strip_suffix("]]")) {
if let Some((key, value)) = inner.split_once(':') {
match key.trim() { /* existing logic */ }
content_start += line.len();
if content.as_bytes().get(content_start) == Some(&b'\r') { content_start += 1; }
if content.as_bytes().get(content_start) == Some(&b'\n') { content_start += 1; }
} else {
break;
}
} else {
break;
}Test: parse_output_directives("[[Summary]]\nbody") → directives.reply_to == None, content preserved verbatim.
Strongly recommended before merge
S1. Streaming + reply_to: don't delete placeholder on send failure
src/adapter.rs:644-661: comment says "if send fails, placeholder remains (no message loss)" but code uses let _ = adapter.send_message_with_reply(...).await; followed by an unconditional delete_message. If the first send fails, the placeholder is still deleted and the user sees nothing — directly contradicting the comment.
Fix: check first-send result; only delete the placeholder if the first chunk sent successfully. Otherwise preserve the placeholder and tracing::warn!:
let first_send_ok = match chunks.first() {
Some(first_chunk) => match adapter
.send_message_with_reply(&thread_channel, first_chunk, reply_id)
.await
{
Ok(_) => true,
Err(e) => {
tracing::warn!(error = ?e, "reply_to first send failed; preserving placeholder");
false
}
},
None => false,
};
if first_send_ok {
for chunk in chunks.iter().skip(1) {
let _ = adapter.send_message(&thread_channel, chunk).await;
}
if let Err(e) = adapter.delete_message(&msg).await {
tracing::warn!(error = ?e, "delete placeholder failed; placeholder will remain visible");
}
}Subsequent chunk failures are the existing split-message risk — out of scope for this PR.
Doc nits (non-blocking)
D1. docs/output-directives.md — reword reply_to value spec
Current text says "Discord snowflake — numeric only", but the parser already accepts alphanumeric + . - _ (intentional, for Slack ts). Suggested:
Value: platform message ID. Format depends on the target adapter — Discord requires a numeric snowflake; Slack accepts
ts(e.g.1234567890.123456); other adapters may differ. The directive parser validates only that the value is non-empty, ≤64 chars, and contains no whitespace; per-platform format validation happens in each adapter.
D2. Document duplicate-key behavior
The existing parse_duplicate_reply_to_last_wins test fixes "last wins" semantics. Add to the directive spec rules:
- If the same key appears multiple times, the last value wins.
Verified working in latest commit (no action needed)
- Send-once
if/elsestructure (no double-send) - Slack ts validation relaxed (alphanumeric +
.-_) - CRLF line-ending handling
- Empty-content guard with
_(no response)_fallback - Discord adapter fallback
warnon reply error
Nice iteration speed on this PR.
F1: Directives now parsed from raw text_buf BEFORE compose_display,
ensuring tool call output cannot interfere with directive parsing.
F3: Send result is checked — placeholder only deleted if first chunk
sends successfully. On send failure, placeholder remains visible
(no message loss).
This comment has been minimized.
This comment has been minimized.
B2: Lines like [[Note]], [[Summary]], [[Thought]] (no colon) are
legitimate agent content, not directives. Parser now only advances
content_start when split_once(':') succeeds. Without colon → break.
Added test: parse_bracket_without_colon_preserved
This comment has been minimized.
This comment has been minimized.
|
Quick update on commit
Still remaining:
Nice fast turnaround on B1 + S1. |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
| Here is my reply to that specific message. | ||
| ``` | ||
|
|
||
| **Value**: Platform message ID (Discord snowflake — numeric only) |
There was a problem hiding this comment.
D1: parser already accepts alphanumeric + . - _ (per parse_slack_ts_format_accepted test) — describing the directive value as Discord-snowflake-only is misleading. Reword to reflect the actual contract: directive parser is platform-agnostic, per-platform format validation lives in each adapter.
| **Value**: Platform message ID (Discord snowflake — numeric only) | |
| **Value**: platform message ID. Format depends on the target adapter — Discord requires a numeric snowflake; Slack accepts `ts` (e.g. `1234567890.123456`); other adapters may differ. The directive parser validates only that the value is non-empty, ≤64 chars, and contains no whitespace. Per-platform format validation happens inside each adapter (e.g. Discord falls back to plain send if the value isn't a valid u64). |
| - Consecutive `[[key:value]]` lines at the start of output = directive header block | ||
| - First line that doesn't match `[[...]]` = content begins | ||
| - Directives are stripped from the final message (never visible to users) | ||
| - Unknown keys are silently ignored (forward compatible) |
There was a problem hiding this comment.
D2: existing test parse_duplicate_reply_to_last_wins fixes the semantics — document it so callers don't have to read tests to know the contract.
| - Unknown keys are silently ignored (forward compatible) | |
| - Unknown keys are silently ignored (forward compatible) | |
| - If the same key appears multiple times, the last value wins |
|
Update: commit Posted inline Impressive turnaround: 3 functional fixes within ~3 minutes of the review. |
D1: Value spec now correctly describes cross-platform validation
(non-empty, ≤64 chars, no whitespace) instead of 'numeric only'.
D2: Added rule: 'If the same key appears multiple times, last wins.'
Also: clarified [[X]] without colon stops parsing.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
WCHungBlake
left a comment
There was a problem hiding this comment.
Approved. The previously requested blockers are resolved in the latest commits.
Non-blocking follow-ups:
- Consider logging warn-level messages when the first reply send fails or placeholder deletion fails, so ops can diagnose permission/API issues.
- Align docs wording with parser validation: accepted chars are ASCII alphanumeric plus
.,-,_, not merely "no whitespace".
- tracing::warn on reply send failure (ops can diagnose permission issues) - tracing::warn on placeholder delete failure - Docs: 'no whitespace' → 'ASCII alphanumeric plus ., -, _' (matches code)
This comment has been minimized.
This comment has been minimized.
matilda-vtecturbo
left a comment
There was a problem hiding this comment.
LGTM. All six findings from my earlier review are properly addressed:
- Streaming + reply_to — send-first-then-delete pattern, with placeholder preserved on send failure. Clean.
- Empty body after directives — falls back to
_(no response)_. Good. - Unit tests — 12 cases covering CRLF, duplicates, Slack ts format, directive-only output, and
[[X]]without colon. Solid coverage. - Cross-channel
message_id—tracing::warnon send/delete failure, fallback to plain send. Diagnosable. - CRLF byte-offset — explicit
\rthen\nbyte checks. No more drift. - Unknown directive logging —
tracing::debug!with key context. Helpful for typo diagnosis.
Bonus — the parser now correctly preserves [[X]] lines without a colon (good catch by other reviewers, would have broken [[Note]] / [[Summary]] style content). Parsing directives from raw text_buf before compose_display is also the right call — meta-layer should be stripped before content composition.
Nice work iterating through this. 🍷
All Review Findings — Final StatusEvery finding from all reviewers is addressed. CI green. Latest commit: Approvals received ✅
Awaiting
@carly-vtecturbo
@matilda-vtecturbo ✅ APPROVED
@WCHungBlake (TitanAgent) ✅ APPROVED
@canyugs
@wangyuyan-agent
CI: ✅ All checks + 7 smoke tests green Ready for @thepagent code owner approval. |
brettchien
left a comment
There was a problem hiding this comment.
Group review — Jelly (lead) + Falcon, Kirin, Mira
Verdict: ✅ LGTM (no blockers)
PR is already approved by @matilda-vtecturbo and @WCHungBlake; CI is green on f884af7. Sharing four non-blocking observations not yet covered in prior reviews — feel free to follow up or set aside.
🟡 Non-blocking findings (new)
N1. Streaming placeholder briefly shows raw [[reply_to:…]] directive text (src/adapter.rs:645-668)
In streaming mode the placeholder is edit_message-d with raw text_buf every ~1.5s. Until the stream completes and parse_output_directives() strips the directive, the user sees something like [[reply_to:1502606076451885136]]\nHere is… flicker before it is replaced by the final reply.
Minimal fix (~3 lines): in the streaming edit loop, skip pushing the edit while text_buf.starts_with("[[") and no non-directive line has appeared yet. Multi-bot threads already skip streaming, so the hit rate is single-bot threads emitting reply_to — narrow but visible.
N2. Multi-chunk reply_to silently drops chunks[1..] failures (src/adapter.rs:651-665)
In the streaming + reply_to branch, send_ok is only set by the first chunk's send_message_with_reply result. Subsequent chunks use let _ = adapter.send_message(...) which discards the Result. If the first chunk succeeds and chunks 2..N fail, the placeholder is still deleted and the user sees: partial reply + reply link + silently missing content. Triggers when final_content > message_limit (Discord 2000 chars).
Suggestion: track all chunk results, only delete_message if all succeeded; or at minimum tracing::warn! on chunks[1..] failures.
🟢 Doc-only suggestions
D1. Directive injection via agent echo — Trade-off accepted (forward-compat requires stripping unknown keys). docs/output-directives.md could add a brief warning that an agent echoing user content at the very start of its output may inadvertently emit a directive. Mitigation: agent system prompt should instruct "do not lead a response with [[".
D2. Agent prompt template — Suggested wording for agent system prompts teaching [[reply_to:…]]:
When you want to reply to a specific message, prefix your response with
[[reply_to:<message_id>]]on its own line at the very start.<message_id>is the numeric ID provided in the conversation context. Do not include directives mid-message, and do not invent message IDs.
cc @chaodu-agent — no action required to merge from our side; observations for follow-up if useful.
Problem
Agents currently cannot reply to a specific message in a thread. All output is sent as plain messages via
ChannelId::say(), losing conversational context in busy multi-bot threads where multiple conversations interleave.Scenario
In a Discord thread with 5 bots + 1 human:
With this PR, Bot A can reply directly to msg #100 using Discord's native reply UI.
Prior Art & Why OAB Is Better
replyToMode: "off"/"first"/"all"(config)DISCORD_REPLY_TO_MODE: "off"/"first"/"all"(env)[[reply_to:message_id]]output directiveWhy agent-controlled reply-to matters
OpenClaw and Hermes treat reply-to as a platform-level config — the system always replies to the trigger message (or never). The agent has no say.
This breaks down in multi-agent collaboration:
With OpenClaw/Hermes: all bot replies point to msg #100 (the trigger). No way to show Bot B is responding to Bot A, or Bot A is responding to Bot B. The thread becomes an unreadable wall of text.
With OAB: each agent chooses its reply target. The thread has clear visual conversation threads within the thread. Agents are active conversational participants, not passive responders.
This is the fundamental difference: OAB is built for multi-agent collaboration where agents talk to each other, not just single-agent Q&A where one bot answers one human.
Solution: Output Directives (Option A)
Two-layer change:
Input:
SenderContext.message_idAgents now receive the platform message ID of each incoming message, so they know what to reply to.
{"schema":"openab.sender.v1", "sender_id":"845835116920307722", "message_id":"1502606076451885136", ...}Output:
[[key:value]]header blockAgents prefix their response with directives that OAB parses and strips:
OAB sends this as a Discord reply (message_reference) to the specified message.
Design Decisions Evaluated
[[key:value]]directivesWhy A now: reply-to is a platform delivery detail, not semantic content. Putting it in ACP metadata would over-engineer for a single use case. When we need
[[ephemeral:true]],[[suppress_embeds:true]], etc., we can add them to the same directive format without any protocol change.Directive Format Spec
Rules:
[[key:value]]lines at output start = header block[[...]]line = content beginsreply_tovalue must be numeric snowflake (validated, invalid = ignored)Changes
src/adapter.rs:parse_output_directives()+OutputDirectivesstruct +send_message_with_reply()trait methodsrc/discord.rs:CreateMessage::reference_message()implementation +message_idinSenderContextsrc/slack.rs:message_idfield (Slack ts)src/cron.rs:message_id: None(cron has no trigger message)src/gateway.rs:message_idfrom gateway eventTesting
build_sender_contexttests updated withmessage_idparameterDiscord Discussion URL
Discord Discussion URL: https://discord.com/channels/1491295327620169908/1491365150664560881/1502642639852277851
Streaming Mode Placeholder Handling
When
reply_tois used and the adapter is in streaming mode (single-bot thread), a placeholder message (…) has already been sent before the directive is known. The resolution:delete_message, then content is sent as a reply\u{200b}) as fallbackThis is handled by
ChatAdapter::delete_message()— Discord overrides with real delete, default falls back to edit.