feat(agent): deliver resolved tools through the runner by mmabrouk · Pull Request #4765 · Agenta-AI/agenta

mmabrouk · 2026-06-19T15:40:44Z

This PR is part of a stack. Review bottom-up.

Each PR's diff is only its own delta. Merge from the bottom. This PR's base is #4764 (merge that first).

- #4758: Pi-backed agent workflow service, template, tracing
- #4759: runnable tools as agent configuration
- #4760: drive harnesses over ACP via rivet
- #4761: move the runtime into the SDK behind backend/harness ports
  - #4762: docker cleanups for the sidecar (side branch off #4761)
- #4763: typed SDK agent tool contracts
- #4764: resolve typed tools through the service
- #4765: deliver resolved tools through the runner <- you are here
- #4766: remove Vercel adapter dead aliases
- #4767: propagate messages session ids to runner traces
- #4768: load-session via a no-op session store port
- #4769: advertise Vercel messages protocol headers
- #4770: agent-workflows ground truth + comment hygiene

Context

The base branch feat/sdk-local-tools-service resolves a tool in the SDK and composes the resolved spec into the run request. The runner still treated every tool as one shape: POST the call back to Agenta's /tools/call. This PR is slice #9 of docs/design/agent-workflows/pr-stack.md (tool runtime). It makes the TypeScript runner execute a resolved tool by its kind, so a tool can run locally instead of always routing back to the server.

What this changes

A resolved tool now carries an executor kind. The runner branches on it:

callback (the default, and the only behavior before this PR): POST back through Agenta's /tools/call, so the Composio key and connection auth stay server-side.
code: run the tool's snippet in a subprocess with a scoped secret env. No round trip to the server.
client: browser-fulfilled across a turn boundary, so the in-sandbox paths skip it.

Before, each delivery path (in-process Pi, Pi-under-rivet, the MCP bridge) carried its own copy of the "POST the call back" logic, and the Daytona file relay lived inside extensions/agenta.ts. After, one tools/dispatch.ts owns the branch-on-kind decision and the relay; each call site keeps only its own result wrapping. The client.ts transport is renamed to callback.ts to name what it is now (one executor among three, not the whole tool client).
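
The branch-on-kind decision described above can be sketched roughly as follows. This is a simplified illustration, not the actual dispatch.ts: the names ToolSpecSketch, Route, and routeFor are hypothetical, and the function only reports which path would be taken instead of executing anything.

```typescript
// Hypothetical, simplified shapes; the real ones live in protocol.ts and tools/dispatch.ts.
type ToolKind = "callback" | "code" | "client";

interface ToolSpecSketch {
  name: string;
  kind?: ToolKind; // undefined is treated as "callback"
}

// Assumed labels for the three outcomes the PR description names.
type Route = "run-code-locally" | "callback-via-relay" | "callback-direct-post";

/** Decide how a resolved tool would be executed, without executing it. */
function routeFor(spec: ToolSpecSketch, relayDir?: string): Route {
  const kind: ToolKind = spec.kind ?? "callback";
  switch (kind) {
    case "code":
      // Runs in a subprocess inside the sandbox; no round trip to the server.
      return "run-code-locally";
    case "client":
      // Browser-fulfilled tools cannot be executed by the in-sandbox paths.
      throw new Error(`client tool "${spec.name}" cannot be executed in the runner`);
    case "callback":
      // A relay directory selects the file relay; otherwise POST directly.
      return relayDir ? "callback-via-relay" : "callback-direct-post";
  }
}
```

Keeping the decision in one function like this is what lets each call site hold only its own result wrapping.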

protocol.ts grows the spec from one axis to three orthogonal ones: kind (executor), needsApproval (human gate), and render (generative-UI hint). It also adds interaction_request events and an McpServerConfig so the wire can carry those later.
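
As a rough picture of the three axes, the spec might look something like the sketch below. The field types, the RenderHintSketch shape, the default for needsApproval, and the normalizeAxes helper are all assumptions for illustration; only the field names kind, needsApproval, render, and callRef come from this PR.

```typescript
// Field shapes are assumptions for illustration, not the actual protocol.ts.
type ToolKind = "callback" | "code" | "client";

interface RenderHintSketch {
  component: string; // hypothetical: which UI component should render the result
}

interface ResolvedToolSpecSketch {
  name: string;
  kind?: ToolKind;           // executor axis; undefined means "callback"
  needsApproval?: boolean;   // human-gate axis
  render?: RenderHintSketch; // generative-UI axis
  callRef?: string;          // optional in this PR; the string type is an assumption
}

/** Fill in defaults so each axis can be read independently of the others. */
function normalizeAxes(spec: ResolvedToolSpecSketch) {
  return {
    kind: spec.kind ?? "callback",
    needsApproval: spec.needsApproval ?? false, // assumed default
    render: spec.render ?? null,
  };
}

// The axes are orthogonal: a locally executed tool can still require approval
// and carry a render hint.
const example = normalizeAxes({
  name: "summarize",
  kind: "code",
  needsApproval: true,
  render: { component: "SummaryCard" },
});
```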

On the Python side, the 491-line ui_messages.py egress splits into an agents/adapters/vercel/ package (messages, routing, sse, stream). The old module keeps thin re-exports for back-compat. The /messages route now selects its wire format by endpoint (vercel), not by the Accept header, because a Vercel UI message stream and a plain SSE stream share the text/event-stream media type.

Key architectural decision to review

The most important file is services/agent/src/tools/code.ts. A code tool runs author-supplied code in the same sandbox where the harness runs, so its env is the security boundary. The child process does NOT inherit the sidecar's process.env. It gets a minimal startup allowlist (PATH, HOME, locale, temp, Windows essentials) plus only the tool's own scoped secrets. This matters because the in-process Pi path writes provider keys like OPENAI_API_KEY into process.env before a run, and AGENTA_* / COMPOSIO_* / DAYTONA_* config lives there too. An allowlist that leaks any of those would hand a snippet the platform's keys. Scrutinize BASE_ENV_ALLOWLIST and buildChildEnv for anything secret-bearing, and confirm the timeout/abort path always SIGKILLs the child.
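
To make the env boundary concrete, here is a minimal sketch of an allowlist-based child env. It is not the code in code.ts: the allowlist entries are illustrative guesses at "PATH, HOME, locale, temp, Windows essentials", and the names BASE_ENV_ALLOWLIST_SKETCH and buildChildEnvSketch are hypothetical. The parent env is passed in as a parameter so the function stays pure.

```typescript
type Env = Record<string, string | undefined>;

// Illustrative entries only; the actual BASE_ENV_ALLOWLIST is defined in code.ts.
const BASE_ENV_ALLOWLIST_SKETCH = [
  "PATH", "HOME", "LANG", "LC_ALL", "TMPDIR", "TEMP", "TMP", "SystemRoot",
];

/** Build a child env from the allowlist (copied only when set) plus scoped secrets. */
function buildChildEnvSketch(
  parentEnv: Env,
  scopedSecrets: Record<string, string> = {},
): Record<string, string> {
  const childEnv: Record<string, string> = {};
  for (const key of BASE_ENV_ALLOWLIST_SKETCH) {
    const value = parentEnv[key];
    if (value !== undefined) childEnv[key] = value;
  }
  // Scoped secrets are applied last. Whether they may override an allowlisted
  // name such as PATH is a design choice this sketch does not settle.
  return { ...childEnv, ...scopedSecrets };
}

// Example: provider and platform keys present in the parent never reach the child.
const child = buildChildEnvSketch(
  { PATH: "/usr/bin", OPENAI_API_KEY: "sk-parent", AGENTA_API_KEY: "ag-parent" },
  { MY_TOOL_TOKEN: "scoped" },
);
```

The result has to be passed explicitly as the `env` option when spawning the child, because Node's child_process.spawn uses process.env when no `env` is given.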

The second decision is the Responder seam in services/agent/src/responder.ts. The rivet permission gate was a hardcoded auto-approve. This lifts it behind an interface so a cross-turn HITL responder can slot in later without touching the harness adapter. PolicyResponder reproduces the old behavior exactly, including the AGENTA_RIVET_DENY_PERMISSIONS precedence. Check that the default stays auto-allow and that decisionToReply maps onto the ACP replies the harness actually offers.
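
One way to picture the seam is shown below. This is a sketch under assumptions: the ResponderSketch interface, the Decision type, and the synchronous decide method are hypothetical simplifications (a cross-turn HITL responder would presumably be asynchronous), the "auto" policy name is inferred from the description, and the env is passed in as a parameter instead of being read from process.env.

```typescript
type PermissionPolicy = "auto" | "deny"; // "auto" is inferred from the PR text
type Decision = "allow" | "deny";        // hypothetical decision type

/** Hypothetical seam: anything that can answer a permission request. */
interface ResponderSketch {
  decide(request: { toolName: string }): Decision;
}

/** Per-run deny or the env flag flips to deny; otherwise auto-allow. */
function policyFromRequestSketch(
  permissionPolicy: string | undefined,
  env: Record<string, string | undefined>,
): PermissionPolicy {
  if (permissionPolicy === "deny" || env.AGENTA_RIVET_DENY_PERMISSIONS === "true") {
    return "deny";
  }
  return "auto";
}

/** Answers every request from a fixed policy, mirroring the old hardcoded gate. */
class PolicyResponderSketch implements ResponderSketch {
  constructor(private readonly policy: PermissionPolicy) {}

  decide(_request: { toolName: string }): Decision {
    return this.policy === "deny" ? "deny" : "allow";
  }
}

// Headless default: no per-run policy and no env flag means auto-allow.
const responder = new PolicyResponderSketch(policyFromRequestSketch(undefined, {}));
const decision = responder.decide({ toolName: "write_file" });
```

Because the harness adapter would depend only on the interface, a later responder that waits for a human answer can replace PolicyResponderSketch without changes to the adapter.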

How to review this PR

Read in this order:

services/agent/src/protocol.ts — the three-axis ResolvedToolSpec, RenderHint, and the new event variants. This is the contract everything downstream branches on.
services/agent/src/tools/dispatch.ts — runResolvedTool, the single branch-on-kind. Confirm client throws and callback chooses relay vs direct POST by relayDir.
services/agent/src/tools/code.ts — the sandbox env boundary (see above).
The three call sites that now delegate: engines/pi.ts buildCustomTools, extensions/agenta.ts registerTools, tools/mcp-server.ts. Check each still wraps results its own way and skips client tools.

Skip the docs/design/agent-workflows/sdk-local-tools/ tree (design notes and a review log, not shipped behavior) and the services/agent/README.md one-word rename.

Likely regression: a tool with no kind set must still behave exactly like the old callback tool. Verify the undefined -> callback fallback in runResolvedTool and in every call site's branch.

Tests / notes

New TypeScript tests cover the code-tool executor, dispatch routing, the MCP server, the responder, and continuation. The MCP bridge tool id moved from Date.now() to randomUUID() so two calls in the same millisecond no longer collide.

vercel · 2026-06-19T15:40:51Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
agenta-documentation	Ready	Preview, Comment	Jun 19, 2026 3:40pm

coderabbitai · 2026-06-19T15:40:54Z

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

mmabrouk · 2026-06-19T15:58:37Z

Reviewer guide: interesting code

services/agent/src/tools/dispatch.ts:104 — runResolvedTool is the single branch-on-kind dispatch; code runs locally, client throws, callback picks relay vs direct POST.
services/agent/src/tools/code.ts:82 — BASE_ENV_ALLOWLIST plus buildChildEnv is the security boundary; the snippet sees only this allowlist plus its scoped secrets, never the sidecar's process.env.
services/agent/src/protocol.ts:64 — ResolvedToolSpec gains three orthogonal axes (kind, needsApproval, render); callRef is now optional and undefined kind means callback.
services/agent/src/responder.ts:48 — PolicyResponder lifts the rivet auto-approve behind a seam so a cross-turn HITL responder can slot in without touching the harness.
services/agent/src/tools/mcp-server.ts:85 — the MCP bridge tool id moves from Date.now() to randomUUID() so parallel same-millisecond calls no longer collide.
sdks/python/agenta/sdk/agents/adapters/vercel/__init__.py:1 — the 491-line ui_messages.py egress splits into this package; the /messages route selects wire format by endpoint, not the Accept header.

mmabrouk · 2026-06-19T15:58:49Z

+];
+
+/** Build the child env from a minimal allowlist (copied only when set) plus scoped secrets. */
+function buildChildEnv(


This is the boundary that keeps a code tool from seeing platform secrets: the child gets only this allowlist plus its own scoped env, not the sidecar's process.env (where the in-process Pi path writes provider keys). Confirm nothing secret-bearing creeps into BASE_ENV_ALLOWLIST.

mmabrouk · 2026-06-19T15:58:50Z

+    return runCodeTool(spec.runtime, spec.code ?? "", spec.env, params, opts.signal);
+  }
+  if (spec.kind === "client") {
+    throw new Error(


Single source of truth for branch-on-kind. A spec with no kind falls through to the callback path here, which preserves the old behavior exactly; please confirm every call site relies on that same default.

mmabrouk · 2026-06-19T15:58:51Z

+      // Agenta's /tools/call. A unique id per call so two parallel calls in the same
+      // millisecond don't collide (Date.now() would).
+      const text = await runResolvedTool(spec, params?.arguments, {
+        toolCallId: randomUUID(),


Real fix: the old id was tool-${Date.now()}, so two parallel calls in the same millisecond shared a relay filename / call id. randomUUID() removes the collision.
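
A small illustration of the collision, assuming ids built the way the comment describes. The timestampId helper is hypothetical and takes the clock value as a parameter so the same-millisecond case can be reproduced deterministically.

```typescript
import { randomUUID } from "node:crypto";

/** Old-style id derived from a millisecond timestamp (hypothetical helper). */
function timestampId(nowMs: number): string {
  return `tool-${nowMs}`;
}

// Two parallel calls that observe the same millisecond get the same id.
const now = Date.now();
const first = timestampId(now);
const second = timestampId(now);
const timestampIdsCollide = first === second;

// randomUUID() does not depend on the clock, so each call gets its own id.
const uuidA = randomUUID();
const uuidB = randomUUID();
const uuidsCollide = uuidA === uuidB;
```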

mmabrouk · 2026-06-19T15:58:51Z

+export function policyFromRequest(permissionPolicy?: string): PermissionPolicy {
+  if (permissionPolicy === "deny" || process.env.AGENTA_RIVET_DENY_PERMISSIONS === "true") {
+    return "deny";
+  }


policyFromRequest keeps the prior precedence: explicit per-run deny or AGENTA_RIVET_DENY_PERMISSIONS flips to deny, otherwise auto-allow. The default must stay auto so headless /invoke runs are unchanged.

mmabrouk · 2026-06-19T16:29:49Z

Superseded. Replacing the path-based stack with PRs sliced by functional area showing final code only, so reviewers don't comment on intermediate scaffolding that a later PR rewrites. See the new set.

feat(agent): deliver resolved tools through the runner

0774337

dosubot Bot added the labels size:XXL (This PR changes 1000+ lines, ignoring generated files), Backend, Feature Request (New feature or request), and SDK on Jun 19, 2026

mmabrouk mentioned this pull request Jun 19, 2026

feat(agent): runnable tools as agent configuration #4759

Closed

mmabrouk closed this Jun 19, 2026

mmabrouk wants to merge 1 commit into feat/sdk-local-tools-service from feat/sdk-local-tools-runner-docs
