feat: add session-scoped plan/build mode by trungutt · Pull Request #3140 · docker/docker-agent

trungutt · 2026-06-16T09:20:23Z

Why

A planner sub-agent (e.g. the bundled coder agent's planner, pkg/config/builtin-agents/coder.yaml) puts planning under the agent's control. Plan mode puts it under the user's control: a session attribute the host (CLI, TUI, embedder) flips like a switch, and the model can't talk its way out of it.

That distinction matters because:

Works on any agent, third-party or hand-rolled — no YAML edit needed to add a planner sub-agent.
Hard tool removal, not prompt drift. coder's planner today still ships full filesystem (incl. write_file / edit_file); its read-only-ness is prompt-only. Plan mode strips every tool without Annotations.ReadOnlyHint from what the model sees.
Covers runtime-added MCP tools via the MCP-spec ReadOnlyHint annotation, with no YAML to update.
Mid-session toggle costs zero turns — one PATCH and the next turn observes it (vs. a transfer + synthesised user prompt for sub-agent handoff).
One primitive across hosts, instead of each host parsing agent topology or rolling its own deny-list.

Additive: planner sub-agents stay the right answer when planning belongs in the agent's design; plan mode is the user-driven counterpart.

What

New per-session Mode (build default, plan). In plan mode the runtime:

Filters tools without Annotations.ReadOnlyHint from each turn's toolset (hard guarantee).
Splices a per-turn <system-reminder> telling the model to plan, not act (so it doesn't waste turns trying tools that vanished).

Sub-sessions (transfer_task, run_skill, agent background-agent) inherit the parent's mode so the filter applies across the whole delegation tree. Harness-backed agents — which own their own toolset and can't be filtered from the runtime — refuse plan mode with code: "unsupported_mode" rather than degrade plan mode to advisory.

API:

POST   /api/sessions               { ..., "mode": "plan" }   # rejects unknown values with 400
GET    /api/sessions/:id                                      # + snapshot
PATCH  /api/sessions/:id/mode      { "mode": "plan" }

PATCH /mode updates the live in-memory session if a runtime is attached and persists to the store, so the next turn picks it up directly. Persistence is a new mode TEXT column (migration 22); pre-existing rows normalize to build on read. The field is guarded by s.mu via LoadMode / StoreMode accessors so concurrent runtime reads and HTTP writes can't race.

Filter piggybacks on the existing ExcludedTools filter site in runtime/loop.go; reminder rides the existing turn-start system-message splice.

Scope

In: runtime filter + reminder, sub-session mode propagation, harness refusal, API + DTOs, persistence + migration, user-facing API docs, unit + HTTP + race tests.

Out (left to hosts): TUI/leantui UX, CLI flags, "exit plan mode" tool — hosts can PATCH the endpoint directly.

Plan mode lets the runtime filter tools to read-only ones and inject a per-turn system reminder, so the agent drafts a plan instead of taking actions. Build mode is the default and preserves today's behaviour. The mode is a per-session property, persisted alongside the rest of the session, and exposed via: - POST /api/sessions { ..., mode } - GET /api/sessions/:id -> { ..., mode } - PATCH /api/sessions/:id/mode { mode } Tool filtering is driven by the MCP-spec ReadOnlyHint annotation, so it extends to user-added MCP tools without any per-tool config.

rumpl · 2026-06-16T10:40:06Z

Question, why can't "plan mode" be implemented with a second agent that the user can select?

trungutt · 2026-06-16T11:11:58Z

Question, why can't "plan mode" be implemented with a second agent that the user can select?

I updated the PR description

rumpl · 2026-06-16T12:04:27Z

A planner sub-agent (e.g. the bundled coder agent's planner, pkg/config/builtin-agents/coder.yaml) puts planning under the agent's control.

This is not true, you can always ctrl+number to switch to an agent

trungutt · 2026-06-16T13:08:57Z

@rumpl It could, but it scales poorly: every agent (built-in, community, user-authored) would need a hand-maintained "planner" twin kept in lockstep with its prompt, toolsets, hooks and sub-agents. Plan mode is effectively "every agent gets a free read-only twin", derived from each tool's MCP ReadOnlyHint — so user-added MCP servers and third-party agents work without anyone authoring a paired variant.

It also keeps you on the same agent under a different constraint, vs a sibling-agent switch (which has handoff cost: transfer event, possibly synthesised user prompt, separate conversation thread).

rumpl · 2026-06-16T13:32:44Z

Okay but... Not any agent can become a planner agent magically just by removing write tools. Its system prompt needs to be changed I would guess, also you are talking about ReadOnlyHint, but our builtin tools don't have it, do they?

docker-agent · 2026-06-16T14:00:52Z

❌ PR Review Failed — The review agent encountered an error and could not complete the review. View logs.

trungutt · 2026-06-16T16:02:52Z

Two points:

System prompt. It is changed per turn — planModeReminderMessages in pkg/runtime/plan_mode.go splices a <system-reminder> telling the model it is in plan mode, must not edit/shell/execute, and should draft a plan. Rides the same turn-start splice as turn_start hook output, so it picks up the cache_control marker too. That's the soft layer; the tool filter is the hard layer.

ReadOnlyHint on built-ins. They do have it — 33 occurrences across every read-side tool:

pkg/tools/builtin/filesystem/filesystem.go:425,449,462,475,487 — read_file, read_multiple_files, list_directory, directory_tree, search_files_content
pkg/tools/builtin/shell/shell.go:592,602 — list_background_jobs, view_background_job
pkg/tools/builtin/todo/todo.go:306–341 — all four todo tools
pkg/tools/builtin/memory/memory.go:152,175 — get_memories, search_memories
pkg/tools/builtin/tasks/tasks.go:575,607,617 — get_task, list_tasks, next_task
plus think.go:54, api.go:187, handoff.go:31, transfertask.go:34, mcpcatalog.go:423,435, modelpicker.go:74,84, rag.go:164, userprompt.go:65, skills.go:366,382, deferred.go:151,163, agent.go:480,490

Write-side tools (write_file, edit_file, shell, run_background_job, add_memory/update_memory/delete_memory, run_skill, change_model, lsp_rename/_code_actions/_format, run_background_agent) deliberately don't carry it — exactly what plan mode wants to strip. MCP tools get ReadOnlyHint from the spec itself, so user-added MCP servers and third-party tools are covered with no per-tool config.

On the sibling-agent alternative more broadly — beyond the points already in the description, it's painful operationally: every agent (built-in, community, user-authored) would need a hand-maintained planner twin kept in lockstep with its prompt, toolsets, hooks, and sub-agents. Two YAML surfaces per agent to keep in sync, every parent-agent change silently rots its planner sibling, and every third-party agent author has to do the same work. Plan mode collapses that to one knob across the fleet — including agents we don't author.

docker-agent · 2026-06-16T16:36:33Z

❌ PR Review Failed — The review agent encountered an error and could not complete the review. View logs.

aheritier · 2026-06-17T06:24:57Z

It should cover most of #2788

dgageot · 2026-06-19T16:05:15Z

Review — `feat: add session-scoped plan/build mode`

Overall this is clean and consistent with existing patterns — it builds, has good baseline test coverage, and the migration is backwards compatible. One blocking issue and a couple of concerns below.

🔴 Blocking: plan mode is bypassable via delegated sub-agents

Delegation tools (transfer_task, handoff) are marked read-only, so they survive plan-mode filtering. But delegated child sessions don't appear to inherit the parent's mode — they default back to build and get the full mutating toolset. So a plan-mode agent can't call write_file/shell directly, but it can delegate to a sub-agent that can, which undermines the core safety guarantee of plan mode.

Suggested fix: propagate the parent mode when creating sub-sessions (e.g. session.WithMode(parent.Mode)) in pkg/runtime/agent_delegation.go / pkg/runtime/loop.go, and add a regression test asserting that a plan-mode parent produces plan-mode children with mutating tools filtered out.

If sub-agents are intended to run in build mode, please document that explicitly and soften the reminder text so it doesn't claim state-changing tools are unavailable.

🟡 Harness agents get the reminder but not the hard filtering

In pkg/runtime/harness.go, plan mode injects the reminder but doesn't seem to apply the same tool filtering as the streaming loop. If the harness manages its own toolset, plan mode becomes prompt-only there, making "Only read-only tools have been made available" misleading. Either apply the filtering or adjust the text and document the limitation.

🟡 Possible data race on `sess.Mode`

UpdateSessionMode writes the mode under the manager mutex, but runtime reads for filtering/reminders don't appear to use the same lock. Consider guarding Mode (or exposing locked accessors) and running the relevant tests with -race.

⚪ Minor

Inconsistent invalid-mode handling: session creation normalizes invalid/empty mode → build, while the update endpoint rejects with 400. Prefer making these consistent (rejecting both).
Mid-turn mode changes apply on the next turn — matches the comments, not a bug, but worth documenting for API/UI callers.

Test coverage

Good for core plumbing (filtering, reminder, persistence, migration, HTTP, in-memory). Would like to see before merge: a delegation-inheritance test, and a harness-mode behavior test (or explicit documentation that plan mode is advisory there). An optional concurrency/race test around mode updates would be nice.

Recommendation

Request changes — the implementation is largely solid, but the delegated sub-agent bypass breaks the central guarantee of plan mode and should be fixed (or the guarantee redefined/documented) before merging. Looks close after that.

Five concerns from dgageot's review on the original commit, kept in a single follow-up so the PR remains focused on shipping plan mode: 1. Sub-agent delegation bypass (blocking). Sub-sessions created via newSubSession (transfer_task, run_skill, run_background_agent) previously defaulted back to build mode. The delegation tools themselves are read-only and survive plan-mode filtering, so a plan-mode parent could delegate to a build-mode child with the full mutating toolset. Propagate parent.LoadMode() to the child so plan mode's hard filter applies across the whole delegation tree. 2. Harness gap. The harness path injected the plan-mode reminder but couldn't apply filterToolsForSession — the harness owns its toolset. The reminder text claimed 'tools have been filtered', making plan mode misleading there. Refuse plan mode for harness-backed agents with a new ErrorCodeUnsupportedMode so the guarantee stays a guarantee. 3. Data race on sess.Mode. UpdateSessionMode wrote Mode from the HTTP goroutine while the runtime read it (filterToolsForSession, planModeReminderMessages) without synchronisation. Add LoadMode/StoreMode accessors on *Session backed by s.mu (modelled on the existing Usage/SetUsage pair) and route every concurrent read/write through them. A race-detector test in TestSession_ModeAccessors pins the guarantee. 4. Inconsistent invalid-mode handling. POST /api/sessions silently coerced unknown modes to build while PATCH /api/sessions/:id/mode rejected them with 400. Add the same IsValid() gate to createSession so both endpoints behave the same way. 5. Missing API docs. Add a row for PATCH /api/sessions/:id/mode in the Sessions endpoint table at docs/features/api-server/index.md and a new 'Plan mode' section covering build/plan semantics, create- vs update-time entry points, the next-turn application rule, sub-session inheritance, and the harness-unsupported behaviour. Verb naming on the new accessors (Load/Store rather than Mode/SetMode) avoids a method-vs-field collision and matches sync/atomic.Value's convention.

This comment was marked as low quality.

Sign in to view

trungutt requested a review from docker-agent June 16, 2026 13:29

trungutt marked this pull request as ready for review June 16, 2026 16:05

trungutt requested a review from a team as a code owner June 16, 2026 16:05

This was referenced Jun 19, 2026

feat: let agents declare a plan_persona instruction #3168

Closed

feat: let agents declare a plan_persona instruction trungutt/docker-agent#3

Draft

aheritier added the area/runtime Runtime engine, agent loop execution, tool dispatch, loop detection label Jun 22, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add session-scoped plan/build mode#3140

feat: add session-scoped plan/build mode#3140
trungutt wants to merge 2 commits into
docker:mainfrom
trungutt:plan-mode

trungutt commented Jun 16, 2026 •

edited

Loading

Uh oh!

This comment was marked as low quality.

Uh oh!

rumpl commented Jun 16, 2026

Uh oh!

trungutt commented Jun 16, 2026

Uh oh!

rumpl commented Jun 16, 2026

Uh oh!

trungutt commented Jun 16, 2026

Uh oh!

rumpl commented Jun 16, 2026

Uh oh!

docker-agent commented Jun 16, 2026

Uh oh!

trungutt commented Jun 16, 2026 •

edited

Loading

Uh oh!

docker-agent commented Jun 16, 2026

Uh oh!

aheritier commented Jun 17, 2026

Uh oh!

dgageot commented Jun 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

Conversation

trungutt commented Jun 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Why

What

Scope

Uh oh!

This comment was marked as low quality.

Uh oh!

rumpl commented Jun 16, 2026

Uh oh!

trungutt commented Jun 16, 2026

Uh oh!

rumpl commented Jun 16, 2026

Uh oh!

trungutt commented Jun 16, 2026

Uh oh!

rumpl commented Jun 16, 2026

Uh oh!

docker-agent commented Jun 16, 2026

Uh oh!

trungutt commented Jun 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

docker-agent commented Jun 16, 2026

Uh oh!

aheritier commented Jun 17, 2026

Uh oh!

dgageot commented Jun 19, 2026

Review — feat: add session-scoped plan/build mode

🔴 Blocking: plan mode is bypassable via delegated sub-agents

🟡 Harness agents get the reminder but not the hard filtering

🟡 Possible data race on sess.Mode

⚪ Minor

Test coverage

Recommendation

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

trungutt commented Jun 16, 2026 •

edited

Loading

trungutt commented Jun 16, 2026 •

edited

Loading

Review — `feat: add session-scoped plan/build mode`

🟡 Possible data race on `sess.Mode`