feat: plan mode — /plan toggle, submit_plan tool, and plan_model for dual-model execution

## Problem

Agentic workflows that span many tool calls (write files, run commands, call GitHub APIs) give users no control point between "I described what I want" and "the agent already did it."

The only safeguard today is an instruction asking agents to present a plan and wait — a soft control that works most of the time but has no runtime enforcement, and cannot be toggled at runtime without modifying config files.

Two concrete gaps:

1. **No user-controlled mode switch.** There is no way to type `/plan` in the TUI to enter a careful, plan-then-approve mode for a risky task — and then `/plan off` to return to fluid execution for a simple one.
2. **No cost-optimised planning.** Planning requires deep reasoning (costly model, used once). Execution is mechanical (cheaper model, used N times). There is no way to route the planning turn to a different model than the execution turns.

---

## Prior Art

Every mature agentic system has converged on this pattern:

| System | Plan format | User toggle | Dual-model? |
|---|---|---|---|
| **Cursor** | Structured file list with diff preview | Ask vs Agent mode; `/multitask` slash command; per-subagent model setting (May 2026) | Yes — subagent model configurable separately from parent |
| **GitHub Copilot Workspace** | JSON `plan_action` function call | Phase gate: Task → Plan → Code; natural-language modify | Yes — o1-preview for planning, gpt-4o for execution |
| **OpenAI Operator** | JSON schema | Structured approval gate in UI | Yes — reasoning model for plan, fast model for execution |
| **Devin** | JSON step tree | Checkpoint-based pauses; user steers mid-execution | Implicit (planning phase uses extended thinking budget) |
| **Claude extended thinking** | `<thinking>` block + text response | Application-layer convention (no built-in gate) | Single model, separate token budget for reasoning |
| **LangGraph** | Free-form or JSON | `HumanNode` blocks graph traversal | Framework-dependent |

**Key pattern across all:** plan generation is separated from execution, the plan is shown to the user, and execution is gated on approval.

**Cursor model-routing (verified, May 2026 changelog):**
> "Added the ability to control Explore subagent behavior from settings: choose a specific model for Explore subagents to run on, inherit the same model as the parent agent, or disable Explore subagents altogether."
> "Added support for general model names for subagent configuration."
> "`/multitask` is now available in the editor for running async subagents."

**Anthropic extended thinking API (confirmed from SDK):**
```python
response = client.messages.create(
    model="claude-sonnet-4-5",
    thinking={"type": "enabled", "budget_tokens": 1600},
    ...
)
# Response contains separate `thinking` block and `text` block.
# The approval gate is the application's responsibility — no built-in pause.
```

---

## Proposed Solution

Three composable additions:

### 1. `submit_plan` built-in tool

Agent calls `submit_plan(title, steps, rationale)` before any side-effecting action. The runtime renders the plan in the TUI, pauses, and waits for explicit user approval.

```json
{
  "title": "Implement configurable retry policy",
  "steps": [
    {
      "description": "Add RetryConfig struct to pkg/config/latest/types.go",
      "tool": "edit_file",
      "target": "pkg/config/latest/types.go",
      "expected_outcome": "RetryConfig embedded in ToolsetConfig"
    },
    {
      "description": "task build / task test / task lint",
      "tool": "shell",
      "target": "task build && task test && task lint"
    }
  ],
  "rationale": "Follows existing sessions config pattern; backwards compat — defaults to no retry."
}
```

Tool output on approval:
```json
{ "approved": true, "plan_id": "uuid" }
```

On rejection with user note:
```json
{ "approved": false, "reason": "Don't touch dispatcher.go yet, design first" }
```

**TUI rendering (proposed):**
```
╭─ Plan: "Implement configurable retry policy" ──────────────────────╮
│  1. Edit pkg/config/latest/types.go → add RetryConfig struct       │
│  2. Edit pkg/runtime/dispatcher.go  → read RetryConfig per toolset │
│  3. Update agent-schema.json                                        │
│  4. task build / task test / task lint                              │
│                                                                     │
│  Rationale: follows sessions config pattern; backwards compat.      │
│                                                                     │
│  [Approve ↵]   [Reject esc]   [Modify: type changes ...]           │
╰─────────────────────────────────────────────────────────────────────╯
```

### 2. `/plan` TUI command (session-level toggle)

```
/plan           → enter plan mode for this session
/plan off       → exit plan mode; return to normal execution
/plan status    → show current mode
```

When plan mode is active, the runtime sets an in-memory flag that enforces `plan_mode: strict` for the session — no config file change needed. Users decide per-task whether they want a careful review cycle or fluid execution.

This follows the same pattern as Cursor's `/multitask` command: a slash command that toggles a session-level behaviour without touching config.

### 3. `plan_model` agent config field (dual-model execution)

```yaml
agents:
  coder:
    model: sonnet       # execution model — fast, cheap, used every step
    plan_model: opus    # planning model — deep reasoning, used once per task
    plan_mode: soft     # off | soft | strict
```

When `plan_model` is set, the runtime switches to it for the `submit_plan` turn only, then reverts to `model` for all execution turns.

**Why this matters:**
- Planning is a reasoning task. Opus/o1-class models do it significantly better.
- Execution turns are mechanical. Sonnet is sufficient and ~5x cheaper.
- OpenAI's Operator uses exactly this pattern (o1-preview for planning, gpt-4o for execution).
- Cursor now exposes the same knob for subagents (May 2026).
- docker-agent already has this pattern: `Toolset.Model` for per-toolset routing and `model:` frontmatter on skills. `plan_model` on `AgentConfig` is a natural extension.

---

## `plan_mode` config values

| Value | Behaviour |
|---|---|
| `off` | `submit_plan` tool available but not enforced. Model calls it voluntarily. (default) |
| `soft` | Runtime warns in TUI if a side-effecting tool is called without an active approved `plan_id`. Does not block. |
| `strict` | Runtime blocks `write_file`, `edit_file`, `shell`, `create_directory`, and MCP mutating tools until a plan is approved. Enforced at `pre_tool_use`. |

`/plan` in the TUI activates `strict` mode for the session regardless of config.

---

## Affected Components

| Component | Change |
|---|---|
| `pkg/tools/builtin/plan/` | New `submit_plan` tool; session-scoped `plan_id` state |
| `pkg/tui/` | `/plan` command registration; plan confirmation widget (Approve / Reject / Modify) |
| `pkg/runtime/` | `plan_mode` enforcement at `pre_tool_use`; `plan_model` routing for planning turn |
| `pkg/config/latest/types.go` | `PlanMode string` + `PlanModel string` on `AgentConfig` |
| `agent-schema.json` | `plan_mode` and `plan_model` fields |
| `examples/plan-mode.yaml` | Working example |

---

## Phasing

| Phase | Scope | Effort |
|---|---|---|
| 1 | `submit_plan` tool: renders plan as formatted message block; no enforcement; model calls voluntarily | `effort:small` |
| 2 | `/plan` TUI toggle; proper confirmation widget (Approve/Reject/Modify); `plan_id` in session state | `effort:medium` |
| 3 | `plan_mode: strict` enforcement at `pre_tool_use`; `plan_model` routing for planning turn | `effort:large` |

Phase 1 alone is already a meaningful improvement over instruction-only plan-first.

---

## Alternatives Considered

- **`permissions: ask:` list** — exists today; forces per-call confirmation (N prompts instead of 1 plan). High friction for multi-step tasks.
- **Hook-based enforcement** — `user_prompt_submit` sets a token; `pre_tool_use` checks it. Stateful shell scripts, fragile, no UX.
- **Instruction-only** — current approach; works with capable models but degrades under long context and cannot be toggled.
- **Skill-based plan template** — a `/plan-first` skill providing a structured format. Still a soft control; useful as a complement, not a replacement.

`submit_plan` + `/plan` toggle + `plan_model` is the only combination that gives good UX (one approval per plan, toggleable per-task), optional enforcement (`strict` mode), and cost-optimised dual-model execution.


Value	Behaviour
`off`	`submit_plan` tool available but not enforced. Model calls it voluntarily. (default)
`soft`	Runtime warns in TUI if a side-effecting tool is called without an active approved `plan_id`. Does not block.
`strict`	Runtime blocks `write_file`, `edit_file`, `shell`, `create_directory`, and MCP mutating tools until a plan is approved. Enforced at `pre_tool_use`.

Component	Change
`pkg/tools/builtin/plan/`	New `submit_plan` tool; session-scoped `plan_id` state
`pkg/tui/`	`/plan` command registration; plan confirmation widget (Approve / Reject / Modify)
`pkg/runtime/`	`plan_mode` enforcement at `pre_tool_use`; `plan_model` routing for planning turn
`pkg/config/latest/types.go`	`PlanMode string` + `PlanModel string` on `AgentConfig`
`agent-schema.json`	`plan_mode` and `plan_model` fields
`examples/plan-mode.yaml`	Working example

Phase	Scope	Effort
1	`submit_plan` tool: renders plan as formatted message block; no enforcement; model calls voluntarily	`effort:small`
2	`/plan` TUI toggle; proper confirmation widget (Approve/Reject/Modify); `plan_id` in session state	`effort:medium`
3	`plan_mode: strict` enforcement at `pre_tool_use`; `plan_model` routing for planning turn	`effort:large`

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: plan mode — /plan toggle, submit_plan tool, and plan_model for dual-model execution #2788

Problem

Prior Art

Proposed Solution

1. `submit_plan` built-in tool

2. `/plan` TUI command (session-level toggle)

3. `plan_model` agent config field (dual-model execution)

`plan_mode` config values

Affected Components

Phasing

Alternatives Considered

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

System	Plan format	User toggle	Dual-model?
Cursor	Structured file list with diff preview	Ask vs Agent mode; `/multitask` slash command; per-subagent model setting (May 2026)	Yes — subagent model configurable separately from parent
GitHub Copilot Workspace	JSON `plan_action` function call	Phase gate: Task → Plan → Code; natural-language modify	Yes — o1-preview for planning, gpt-4o for execution
OpenAI Operator	JSON schema	Structured approval gate in UI	Yes — reasoning model for plan, fast model for execution
Devin	JSON step tree	Checkpoint-based pauses; user steers mid-execution	Implicit (planning phase uses extended thinking budget)
Claude extended thinking	`<thinking>` block + text response	Application-layer convention (no built-in gate)	Single model, separate token budget for reasoning
LangGraph	Free-form or JSON	`HumanNode` blocks graph traversal	Framework-dependent

feat: plan mode — /plan toggle, submit_plan tool, and plan_model for dual-model execution #2788

Description

Problem

Prior Art

Proposed Solution

1. submit_plan built-in tool

2. /plan TUI command (session-level toggle)

3. plan_model agent config field (dual-model execution)

plan_mode config values

Affected Components

Phasing

Alternatives Considered

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

1. `submit_plan` built-in tool

2. `/plan` TUI command (session-level toggle)

3. `plan_model` agent config field (dual-model execution)

`plan_mode` config values