Skip to content

feat: add session-scoped plan/build mode#3140

Open
trungutt wants to merge 2 commits into
docker:mainfrom
trungutt:plan-mode
Open

feat: add session-scoped plan/build mode#3140
trungutt wants to merge 2 commits into
docker:mainfrom
trungutt:plan-mode

Conversation

@trungutt

@trungutt trungutt commented Jun 16, 2026

Copy link
Copy Markdown
Contributor

Why

A planner sub-agent (e.g. the bundled coder agent's planner, pkg/config/builtin-agents/coder.yaml) puts planning under the agent's control. Plan mode puts it under the user's control: a session attribute the host (CLI, TUI, embedder) flips like a switch, and the model can't talk its way out of it.

That distinction matters because:

  • Works on any agent, third-party or hand-rolled — no YAML edit needed to add a planner sub-agent.
  • Hard tool removal, not prompt drift. coder's planner today still ships full filesystem (incl. write_file / edit_file); its read-only-ness is prompt-only. Plan mode strips every tool without Annotations.ReadOnlyHint from what the model sees.
  • Covers runtime-added MCP tools via the MCP-spec ReadOnlyHint annotation, with no YAML to update.
  • Mid-session toggle costs zero turns — one PATCH and the next turn observes it (vs. a transfer + synthesised user prompt for sub-agent handoff).
  • One primitive across hosts, instead of each host parsing agent topology or rolling its own deny-list.

Additive: planner sub-agents stay the right answer when planning belongs in the agent's design; plan mode is the user-driven counterpart.

What

New per-session Mode (build default, plan). In plan mode the runtime:

  1. Filters tools without Annotations.ReadOnlyHint from each turn's toolset (hard guarantee).
  2. Splices a per-turn <system-reminder> telling the model to plan, not act (so it doesn't waste turns trying tools that vanished).

Sub-sessions (transfer_task, run_skill, agent background-agent) inherit the parent's mode so the filter applies across the whole delegation tree. Harness-backed agents — which own their own toolset and can't be filtered from the runtime — refuse plan mode with code: "unsupported_mode" rather than degrade plan mode to advisory.

API:

POST   /api/sessions               { ..., "mode": "plan" }   # rejects unknown values with 400
GET    /api/sessions/:id                                      # + snapshot
PATCH  /api/sessions/:id/mode      { "mode": "plan" }

PATCH /mode updates the live in-memory session if a runtime is attached and persists to the store, so the next turn picks it up directly. Persistence is a new mode TEXT column (migration 22); pre-existing rows normalize to build on read. The field is guarded by s.mu via LoadMode / StoreMode accessors so concurrent runtime reads and HTTP writes can't race.

Filter piggybacks on the existing ExcludedTools filter site in runtime/loop.go; reminder rides the existing turn-start system-message splice.

Scope

In: runtime filter + reminder, sub-session mode propagation, harness refusal, API + DTOs, persistence + migration, user-facing API docs, unit + HTTP + race tests.

Out (left to hosts): TUI/leantui UX, CLI flags, "exit plan mode" tool — hosts can PATCH the endpoint directly.

Plan mode lets the runtime filter tools to read-only ones and inject a
per-turn system reminder, so the agent drafts a plan instead of taking
actions. Build mode is the default and preserves today's behaviour.

The mode is a per-session property, persisted alongside the rest of the
session, and exposed via:

  - POST   /api/sessions               { ..., mode }
  - GET    /api/sessions/:id           -> { ..., mode }
  - PATCH  /api/sessions/:id/mode      { mode }

Tool filtering is driven by the MCP-spec ReadOnlyHint annotation, so it
extends to user-added MCP tools without any per-tool config.
docker-agent

This comment was marked as low quality.

@aheritier aheritier added area/agent For work that has to do with the general agent loop/agentic features of the app area/api For features/issues/fixes related to the usage of the cagent API area/sessions For features/issues/fixes related to session lifecycle (resume, persistence, export) kind/feat PR adds a new feature (maps to feat: commit prefix) labels Jun 16, 2026
@rumpl

rumpl commented Jun 16, 2026

Copy link
Copy Markdown
Member

Question, why can't "plan mode" be implemented with a second agent that the user can select?

@trungutt

Copy link
Copy Markdown
Contributor Author

Question, why can't "plan mode" be implemented with a second agent that the user can select?

I updated the PR description

@rumpl

rumpl commented Jun 16, 2026

Copy link
Copy Markdown
Member

A planner sub-agent (e.g. the bundled coder agent's planner, pkg/config/builtin-agents/coder.yaml) puts planning under the agent's control.

This is not true, you can always ctrl+number to switch to an agent

@trungutt

Copy link
Copy Markdown
Contributor Author

@rumpl It could, but it scales poorly: every agent (built-in, community, user-authored) would need a hand-maintained "planner" twin kept in lockstep with its prompt, toolsets, hooks and sub-agents. Plan mode is effectively "every agent gets a free read-only twin", derived from each tool's MCP ReadOnlyHint — so user-added MCP servers and third-party agents work without anyone authoring a paired variant.

It also keeps you on the same agent under a different constraint, vs a sibling-agent switch (which has handoff cost: transfer event, possibly synthesised user prompt, separate conversation thread).

@trungutt trungutt requested a review from docker-agent June 16, 2026 13:29
@rumpl

rumpl commented Jun 16, 2026

Copy link
Copy Markdown
Member

Okay but... Not any agent can become a planner agent magically just by removing write tools. Its system prompt needs to be changed I would guess, also you are talking about ReadOnlyHint, but our builtin tools don't have it, do they?

@docker-agent

Copy link
Copy Markdown

PR Review Failed — The review agent encountered an error and could not complete the review. View logs.

@trungutt

trungutt commented Jun 16, 2026

Copy link
Copy Markdown
Contributor Author

Two points:

System prompt. It is changed per turn — planModeReminderMessages in pkg/runtime/plan_mode.go splices a <system-reminder> telling the model it is in plan mode, must not edit/shell/execute, and should draft a plan. Rides the same turn-start splice as turn_start hook output, so it picks up the cache_control marker too. That's the soft layer; the tool filter is the hard layer.

ReadOnlyHint on built-ins. They do have it — 33 occurrences across every read-side tool:

  • pkg/tools/builtin/filesystem/filesystem.go:425,449,462,475,487 — read_file, read_multiple_files, list_directory, directory_tree, search_files_content
  • pkg/tools/builtin/shell/shell.go:592,602 — list_background_jobs, view_background_job
  • pkg/tools/builtin/todo/todo.go:306–341 — all four todo tools
  • pkg/tools/builtin/memory/memory.go:152,175 — get_memories, search_memories
  • pkg/tools/builtin/tasks/tasks.go:575,607,617 — get_task, list_tasks, next_task
  • plus think.go:54, api.go:187, handoff.go:31, transfertask.go:34, mcpcatalog.go:423,435, modelpicker.go:74,84, rag.go:164, userprompt.go:65, skills.go:366,382, deferred.go:151,163, agent.go:480,490

Write-side tools (write_file, edit_file, shell, run_background_job, add_memory/update_memory/delete_memory, run_skill, change_model, lsp_rename/_code_actions/_format, run_background_agent) deliberately don't carry it — exactly what plan mode wants to strip. MCP tools get ReadOnlyHint from the spec itself, so user-added MCP servers and third-party tools are covered with no per-tool config.

On the sibling-agent alternative more broadly — beyond the points already in the description, it's painful operationally: every agent (built-in, community, user-authored) would need a hand-maintained planner twin kept in lockstep with its prompt, toolsets, hooks, and sub-agents. Two YAML surfaces per agent to keep in sync, every parent-agent change silently rots its planner sibling, and every third-party agent author has to do the same work. Plan mode collapses that to one knob across the fleet — including agents we don't author.

@trungutt trungutt marked this pull request as ready for review June 16, 2026 16:05
@trungutt trungutt requested a review from a team as a code owner June 16, 2026 16:05
@docker-agent

Copy link
Copy Markdown

PR Review Failed — The review agent encountered an error and could not complete the review. View logs.

@aheritier

Copy link
Copy Markdown
Contributor

It should cover most of #2788

@dgageot

dgageot commented Jun 19, 2026

Copy link
Copy Markdown
Member

Review — feat: add session-scoped plan/build mode

Overall this is clean and consistent with existing patterns — it builds, has good baseline test coverage, and the migration is backwards compatible. One blocking issue and a couple of concerns below.

🔴 Blocking: plan mode is bypassable via delegated sub-agents

Delegation tools (transfer_task, handoff) are marked read-only, so they survive plan-mode filtering. But delegated child sessions don't appear to inherit the parent's mode — they default back to build and get the full mutating toolset. So a plan-mode agent can't call write_file/shell directly, but it can delegate to a sub-agent that can, which undermines the core safety guarantee of plan mode.

Suggested fix: propagate the parent mode when creating sub-sessions (e.g. session.WithMode(parent.Mode)) in pkg/runtime/agent_delegation.go / pkg/runtime/loop.go, and add a regression test asserting that a plan-mode parent produces plan-mode children with mutating tools filtered out.

If sub-agents are intended to run in build mode, please document that explicitly and soften the reminder text so it doesn't claim state-changing tools are unavailable.

🟡 Harness agents get the reminder but not the hard filtering

In pkg/runtime/harness.go, plan mode injects the reminder but doesn't seem to apply the same tool filtering as the streaming loop. If the harness manages its own toolset, plan mode becomes prompt-only there, making "Only read-only tools have been made available" misleading. Either apply the filtering or adjust the text and document the limitation.

🟡 Possible data race on sess.Mode

UpdateSessionMode writes the mode under the manager mutex, but runtime reads for filtering/reminders don't appear to use the same lock. Consider guarding Mode (or exposing locked accessors) and running the relevant tests with -race.

⚪ Minor

  • Inconsistent invalid-mode handling: session creation normalizes invalid/empty mode → build, while the update endpoint rejects with 400. Prefer making these consistent (rejecting both).
  • Mid-turn mode changes apply on the next turn — matches the comments, not a bug, but worth documenting for API/UI callers.

Test coverage

Good for core plumbing (filtering, reminder, persistence, migration, HTTP, in-memory). Would like to see before merge: a delegation-inheritance test, and a harness-mode behavior test (or explicit documentation that plan mode is advisory there). An optional concurrency/race test around mode updates would be nice.

Recommendation

Request changes — the implementation is largely solid, but the delegated sub-agent bypass breaks the central guarantee of plan mode and should be fixed (or the guarantee redefined/documented) before merging. Looks close after that.

Five concerns from dgageot's review on the original commit, kept in a
single follow-up so the PR remains focused on shipping plan mode:

1. Sub-agent delegation bypass (blocking). Sub-sessions created via
   newSubSession (transfer_task, run_skill, run_background_agent)
   previously defaulted back to build mode. The delegation tools
   themselves are read-only and survive plan-mode filtering, so a
   plan-mode parent could delegate to a build-mode child with the
   full mutating toolset. Propagate parent.LoadMode() to the child
   so plan mode's hard filter applies across the whole delegation
   tree.

2. Harness gap. The harness path injected the plan-mode reminder but
   couldn't apply filterToolsForSession — the harness owns its
   toolset. The reminder text claimed 'tools have been filtered',
   making plan mode misleading there. Refuse plan mode for
   harness-backed agents with a new ErrorCodeUnsupportedMode so the
   guarantee stays a guarantee.

3. Data race on sess.Mode. UpdateSessionMode wrote Mode from the HTTP
   goroutine while the runtime read it (filterToolsForSession,
   planModeReminderMessages) without synchronisation. Add
   LoadMode/StoreMode accessors on *Session backed by s.mu
   (modelled on the existing Usage/SetUsage pair) and route every
   concurrent read/write through them. A race-detector test in
   TestSession_ModeAccessors pins the guarantee.

4. Inconsistent invalid-mode handling. POST /api/sessions silently
   coerced unknown modes to build while PATCH /api/sessions/:id/mode
   rejected them with 400. Add the same IsValid() gate to
   createSession so both endpoints behave the same way.

5. Missing API docs. Add a row for PATCH /api/sessions/:id/mode in
   the Sessions endpoint table at docs/features/api-server/index.md
   and a new 'Plan mode' section covering build/plan semantics,
   create- vs update-time entry points, the next-turn application
   rule, sub-session inheritance, and the harness-unsupported
   behaviour.

Verb naming on the new accessors (Load/Store rather than Mode/SetMode)
avoids a method-vs-field collision and matches sync/atomic.Value's
convention.
@aheritier aheritier added the area/runtime Runtime engine, agent loop execution, tool dispatch, loop detection label Jun 22, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/agent For work that has to do with the general agent loop/agentic features of the app area/api For features/issues/fixes related to the usage of the cagent API area/runtime Runtime engine, agent loop execution, tool dispatch, loop detection area/sessions For features/issues/fixes related to session lifecycle (resume, persistence, export) kind/feat PR adds a new feature (maps to feat: commit prefix)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants