Problem
Agentic workflows that span many tool calls (write files, run commands, call GitHub APIs) give users no control point between "I described what I want" and "the agent already did it."
The only safeguard today is an instruction asking agents to present a plan and wait — a soft control that works most of the time but has no runtime enforcement, and cannot be toggled at runtime without modifying config files.
Two concrete gaps:
- No user-controlled mode switch. There is no way to type /plan in the TUI to enter a careful, plan-then-approve mode for a risky task, and then /plan off to return to fluid execution for a simple one.
- No cost-optimised planning. Planning requires deep reasoning (costly model, used once). Execution is mechanical (cheaper model, used N times). There is no way to route the planning turn to a different model than the execution turns.
Prior Art
Several established agentic systems have converged on the same plan-then-approve pattern:
| System | Plan format | User toggle | Dual-model? |
|---|---|---|---|
| Cursor | Structured file list with diff preview | Ask vs Agent mode; /multitask slash command; per-subagent model setting (May 2026) | Yes: subagent model configurable separately from parent |
| GitHub Copilot Workspace | JSON plan_action function call | Phase gate: Task → Plan → Code; natural-language modify | Yes: o1-preview for planning, gpt-4o for execution |
| OpenAI Operator | JSON schema | Structured approval gate in UI | Yes: reasoning model for plan, fast model for execution |
| Devin | JSON step tree | Checkpoint-based pauses; user steers mid-execution | Implicit (planning phase uses extended thinking budget) |
| Claude extended thinking | <thinking> block + text response | Application-layer convention (no built-in gate) | Single model, separate token budget for reasoning |
| LangGraph | Free-form or JSON | HumanNode blocks graph traversal | Framework-dependent |
Key pattern across all: plan generation is separated from execution, the plan is shown to the user, and execution is gated on approval.
Cursor model-routing (verified, May 2026 changelog):
"Added the ability to control Explore subagent behavior from settings: choose a specific model for Explore subagents to run on, inherit the same model as the parent agent, or disable Explore subagents altogether."
"Added support for general model names for subagent configuration."
"/multitask is now available in the editor for running async subagents."
Anthropic extended thinking API (confirmed from SDK):
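import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-sonnet-4-5",
    thinking={"type": "enabled", "budget_tokens": 1600},
    # ... (remaining arguments elided; max_tokens must exceed budget_tokens)
)
# Response contains separate `thinking` block and `text` block.
# The approval gate is the application's responsibility: no built-in pause.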
Proposed Solution
Three composable additions:
1. submit_plan built-in tool
Agent calls submit_plan(title, steps, rationale) before any side-effecting action. The runtime renders the plan in the TUI, pauses, and waits for explicit user approval.
{
  "title": "Implement configurable retry policy",
  "steps": [
    {
      "description": "Add RetryConfig struct to pkg/config/latest/types.go",
      "tool": "edit_file",
      "target": "pkg/config/latest/types.go",
      "expected_outcome": "RetryConfig embedded in ToolsetConfig"
    },
    {
      "description": "task build / task test / task lint",
      "tool": "shell",
      "target": "task build && task test && task lint"
    }
  ],
  "rationale": "Follows existing sessions config pattern; backwards compat, defaults to no retry."
}
Tool output on approval:
{ "approved": true, "plan_id": "uuid" }
On rejection with user note:
{ "approved": false, "reason": "Don't touch dispatcher.go yet, design first" }
TUI rendering (proposed):
╭─ Plan: "Implement configurable retry policy" ───────────────────────╮
│ 1. Edit pkg/config/latest/types.go → add RetryConfig struct         │
│ 2. Edit pkg/runtime/dispatcher.go  → read RetryConfig per toolset   │
│ 3. Update agent-schema.json                                         │
│ 4. task build / task test / task lint                               │
│                                                                     │
│ Rationale: follows sessions config pattern; backwards compat.       │
│                                                                     │
│ [Approve ↵] [Reject esc] [Modify: type changes ...]                 │
╰─────────────────────────────────────────────────────────────────────╯
2. /plan TUI command (session-level toggle)
/plan → enter plan mode for this session
/plan off → exit plan mode; return to normal execution
/plan status → show current mode
When plan mode is active, the runtime sets an in-memory flag that enforces plan_mode: strict for the session — no config file change needed. Users decide per-task whether they want a careful review cycle or fluid execution.
This is similar to Cursor's /multitask command in one respect: it is a slash command that changes session-level behaviour without touching config.
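The three command forms listed above could be handled along these lines. This is a sketch under assumptions: the `Session` type, its `PlanModeStrict` field, the `HandlePlanCommand` function, and the returned status strings are all hypothetical names, not existing TUI code.

```go
package main

import (
	"fmt"
	"strings"
)

// Session holds the in-memory flag described above (hypothetical type and field).
type Session struct {
	PlanModeStrict bool
}

// HandlePlanCommand applies "/plan", "/plan off", or "/plan status" to the
// session and returns a status line for the TUI. Nothing is written to disk.
func HandlePlanCommand(s *Session, input string) (string, error) {
	fields := strings.Fields(input)
	if len(fields) == 0 || fields[0] != "/plan" {
		return "", fmt.Errorf("not a /plan command: %q", input)
	}
	if len(fields) > 2 {
		return "", fmt.Errorf("too many arguments: %q", input)
	}
	arg := ""
	if len(fields) == 2 {
		arg = fields[1]
	}
	switch arg {
	case "":
		s.PlanModeStrict = true
	case "off":
		s.PlanModeStrict = false
	case "status":
		// Report only; no state change.
	default:
		return "", fmt.Errorf("unknown /plan argument: %q", arg)
	}
	if s.PlanModeStrict {
		return "plan mode: on (strict)", nil
	}
	// The session override is off; the configured plan_mode still applies.
	return "plan mode: off", nil
}

func main() {
	s := &Session{}
	for _, in := range []string{"/plan", "/plan status", "/plan off"} {
		msg, err := HandlePlanCommand(s, in)
		if err != nil {
			panic(err)
		}
		fmt.Println(in, "->", msg)
	}
}
```

Because the flag lives only on the session value, ending the session discards it, which matches the "no config file change needed" behaviour described above.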
3. plan_model agent config field (dual-model execution)
agents:
  coder:
    model: sonnet      # execution model: fast, cheap, used every step
    plan_model: opus   # planning model: deep reasoning, used once per task
    plan_mode: soft    # off | soft | strict
When plan_model is set, the runtime switches to it for the submit_plan turn only, then reverts to model for all execution turns.
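The routing rule in the paragraph above is small enough to sketch directly. The field names follow the `PlanMode string` and `PlanModel string` additions proposed for `AgentConfig`. The `ModelForTurn` function and the `planningTurn` flag are hypothetical, and how the runtime detects the planning turn is left open here.

```go
package main

import "fmt"

// AgentConfig shows only the fields relevant to routing; the real struct has more.
type AgentConfig struct {
	Model     string // execution model
	PlanModel string // optional planning model; empty means "use Model"
	PlanMode  string // off | soft | strict
}

// ModelForTurn returns PlanModel for the planning (submit_plan) turn when it
// is set, and Model for every other turn.
func ModelForTurn(cfg AgentConfig, planningTurn bool) string {
	if planningTurn && cfg.PlanModel != "" {
		return cfg.PlanModel
	}
	return cfg.Model
}

func main() {
	cfg := AgentConfig{Model: "sonnet", PlanModel: "opus", PlanMode: "soft"}
	fmt.Println(ModelForTurn(cfg, true))  // planning turn
	fmt.Println(ModelForTurn(cfg, false)) // execution turn
}
```

Falling back to `Model` when `PlanModel` is empty keeps existing configs working unchanged.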
Why this matters:
- Planning is a reasoning task. Opus/o1-class models do it significantly better.
- Execution turns are mechanical. Sonnet is sufficient and ~5x cheaper.
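- The Prior Art table lists the same split for GitHub Copilot Workspace (o1-preview for planning, gpt-4o for execution) and for OpenAI Operator (reasoning model for the plan, fast model for execution).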
- Cursor now exposes the same knob for subagents (May 2026).
- docker-agent already has this pattern: Toolset.Model for per-toolset routing and model: frontmatter on skills. plan_model on AgentConfig is a natural extension.
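The cost argument in the list above can be made concrete with a small calculation. This is a rough sketch under stated assumptions: every turn uses the same number of tokens, the execution model is `priceRatio` times cheaper than the planning model (the list's approximate figure is 5), and `RelativeCost` is a hypothetical helper. Real token counts per turn will differ.

```go
package main

import "fmt"

// RelativeCost compares one planning turn on the expensive model plus n
// execution turns on a model priceRatio times cheaper against running all
// n+1 turns on the expensive model. A result of 0.25 means a quarter of the cost.
// Assumes equal token usage per turn, which is a simplification.
func RelativeCost(n int, priceRatio float64) float64 {
	dualModel := 1 + float64(n)/priceRatio // 1 planning turn + n cheap turns
	singleModel := float64(n + 1)          // every turn on the expensive model
	return dualModel / singleModel
}

func main() {
	// Ten execution turns at the list's approximate 5x price ratio.
	fmt.Printf("%.2f\n", RelativeCost(10, 5)) // prints 0.27
}
```

Under these assumptions the saving grows with the number of execution turns and approaches 1/priceRatio for long tasks.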
plan_mode config values
| Value | Behaviour |
|---|---|
| off | submit_plan tool available but not enforced. Model calls it voluntarily. (default) |
| soft | Runtime warns in TUI if a side-effecting tool is called without an active approved plan_id. Does not block. |
| strict | Runtime blocks write_file, edit_file, shell, create_directory, and MCP mutating tools until a plan is approved. Enforced at pre_tool_use. |
/plan in the TUI activates strict mode for the session regardless of config.
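The three modes and the session override could be combined into a single pre_tool_use check, sketched below. All names here (`Decision`, `PreToolUse`, `sideEffecting`) are hypothetical. The tool list is taken from the strict row above, and MCP mutating tools are not modelled because the proposal does not say how they are identified.

```go
package main

import "fmt"

// Decision is the outcome of the pre_tool_use check (hypothetical type).
type Decision string

const (
	Allow Decision = "allow"
	Warn  Decision = "warn"
	Block Decision = "block"
)

// sideEffecting lists the built-in tools named in the strict row.
// MCP mutating tools would need their own classification.
var sideEffecting = map[string]bool{
	"write_file":       true,
	"edit_file":        true,
	"shell":            true,
	"create_directory": true,
}

// PreToolUse decides whether a tool call may proceed. sessionStrict is the
// /plan session flag, which overrides the configured mode. approvedPlanID is
// empty when no plan has been approved.
func PreToolUse(mode string, sessionStrict bool, approvedPlanID, tool string) Decision {
	if sessionStrict {
		mode = "strict"
	}
	if !sideEffecting[tool] || approvedPlanID != "" {
		return Allow
	}
	switch mode {
	case "strict":
		return Block
	case "soft":
		return Warn
	default: // "off" or unset
		return Allow
	}
}

func main() {
	fmt.Println(PreToolUse("soft", false, "", "shell"))       // warn
	fmt.Println(PreToolUse("off", true, "", "edit_file"))     // block
	fmt.Println(PreToolUse("strict", false, "uuid", "shell")) // allow
}
```

Treating an unknown or empty mode as off keeps the default permissive, in line with off being the default value in the table.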
Affected Components
| Component | Change |
|---|---|
| pkg/tools/builtin/plan/ | New submit_plan tool; session-scoped plan_id state |
| pkg/tui/ | /plan command registration; plan confirmation widget (Approve / Reject / Modify) |
| pkg/runtime/ | plan_mode enforcement at pre_tool_use; plan_model routing for planning turn |
| pkg/config/latest/types.go | PlanMode string + PlanModel string on AgentConfig |
| agent-schema.json | plan_mode and plan_model fields |
| examples/plan-mode.yaml | Working example |
Phasing
| Phase | Scope | Effort |
|---|---|---|
| 1 | submit_plan tool: renders plan as formatted message block; no enforcement; model calls voluntarily | effort:small |
| 2 | /plan TUI toggle; proper confirmation widget (Approve/Reject/Modify); plan_id in session state | effort:medium |
| 3 | plan_mode: strict enforcement at pre_tool_use; plan_model routing for planning turn | effort:large |
Phase 1 alone is already a meaningful improvement over instruction-only plan-first.
Alternatives Considered
- permissions: ask: list. Exists today; forces per-call confirmation (N prompts instead of 1 plan). High friction for multi-step tasks.
- Hook-based enforcement. user_prompt_submit sets a token; pre_tool_use checks it. Stateful shell scripts, fragile, no UX.
- Instruction-only. Current approach; works with capable models but degrades under long context and cannot be toggled.
- Skill-based plan template. A /plan-first skill providing a structured format. Still a soft control; useful as a complement, not a replacement.
submit_plan + /plan toggle + plan_model is the only combination that gives good UX (one approval per plan, toggleable per-task), optional enforcement (strict mode), and cost-optimised dual-model execution.