feat(agent): drive harnesses over ACP via the rivet sandbox-agent#4760
feat(agent): drive harnesses over ACP via the rivet sandbox-agent#4760mmabrouk wants to merge 1 commit into
Conversation
Re-platform the agent workflow service to drive coding harnesses (Pi, Claude Code) over the Agent Client Protocol through a rivet sandbox-agent daemon, behind the unchanged Harness port and /invoke contract. The harness (pi/claude) and sandbox (local/daytona) are editable playground config; tracing nests under the /invoke span; tools are delivered Pi-native via a bundled extension; and the model provider key resolves from the project vault.
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Plus Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Reviewer guide: interesting codeA map to the decisions worth scrutinizing, in suggested reading order.
|
| endpoint: request.trace?.endpoint, | ||
| authorization: request.trace?.authorization, | ||
| captureContent: request.trace?.captureContent, | ||
| emitSpans: !isPi || isDaytona, |
There was a problem hiding this comment.
This is the split-tracing decision. Local Pi self-instruments through the Agenta extension under the propagated traceparent, so the runner only accumulates output here. Every other case (any non-Pi harness, or Pi on Daytona where the in-sandbox process cannot reach Agenta's OTLP) rebuilds the span tree from the ACP event stream. Worth confirming the reconstructed tree lines up with what Pi emits locally so a run does not look different depending on which axis it took.
| if (!t) return; | ||
| // Pi streams pure deltas; Claude streams deltas plus a cumulative snapshot. | ||
| // Replace when a chunk is a superset of what we have, append otherwise. | ||
| if (t.startsWith(accumulated)) accumulated = t; |
There was a problem hiding this comment.
Subtle cross-harness handling in one branch: Pi streams pure deltas while Claude streams deltas plus a cumulative snapshot, so the accumulator replaces when a chunk is a superset of what it has and appends otherwise. A harness whose streaming style fits neither assumption would corrupt the accumulated text, so this is the line to stress when adding a third harness.
| } | ||
|
|
||
|
|
||
| async def _resolve_harness_secrets() -> Dict[str, str]: |
There was a problem hiding this comment.
Provider keys are resolved here over HTTP from the project vault rather than from the SDK's per-request secret context, which does not propagate to this custom route. Check the degrade path: an empty vault must return {} and let the harness fall back to its own OAuth login, not fail the run, and keys must reach Daytona only via runtime env_vars (never baked into the image).
| function runAgent(request: AgentRunRequest): Promise<AgentRunResult> { | ||
| if (BACKEND === "rivet") return runRivet(request); | ||
| if (BACKEND === "pi") return runPi(request); | ||
| return request.harness || request.sandbox ? runRivet(request) : runPi(request); |
There was a problem hiding this comment.
The auto backend routes purely by envelope shape: a request carrying harness or sandbox goes to rivet, everything else to the legacy in-process Pi path. This is what lets one sidecar serve both runtimes, and it is the line that guarantees the Pi default cannot regress for callers that send neither field.
There was a problem hiding this comment.
Actionable comments posted: 10
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
services/oss/src/agent.py (1)
413-430:⚠️ Potential issue | 🟠 Major | ⚡ Quick winAvoid unconditional vault secret fetch on every invoke path.
Line 429 always calls
_resolve_harness_secrets(), including legacy Pi runs. That adds avoidable external I/O on the hot path and can regress latency/availability when the vault endpoint is slow.Suggested fix
harness_id = params.get("harness") sandbox_id = params.get("sandbox") harness = _build_harness(harness=harness_id, sandbox=sandbox_id) + harness_secrets: Dict[str, str] = {} + if isinstance(harness, RivetHarness): + harness_secrets = await _resolve_harness_secrets() await harness.setup() try: result = await harness.invoke( HarnessRequest( @@ custom_tools=custom_tools, tool_callback=tool_callback, trace=_trace_context(), - secrets=await _resolve_harness_secrets(), + secrets=harness_secrets, ) )
🧹 Nitpick comments (3)
docs/design/agent-workflows/wp-8-rivet-acp-runtime/research.md (1)
91-91: 💤 Low valueMinor: rephrase to reduce overuse of "exactly".
The static analysis tool flags the word "exactly" as overused. Consider a synonym like "matches" or "mirrors" for variety: "This matches how
DaytonaRunneralready injects..." This is a style note only and does not affect correctness.Source: Linters/SAST tools
docs/design/agent-workflows/wp-8-rivet-acp-runtime/status.md (1)
71-71: 💤 Low valueMinor: capitalize GitHub for brand consistency.
The static analysis tool flags the GitHub reference. Consider capitalizing to match official branding: "a real Composio
GitHubgithub_whoamitool..." (or keep the backtick to mark it as a code identifier). This is a style note only.Source: Linters/SAST tools
services/agent/src/toolBridge.ts (1)
66-68: ⚡ Quick winAdd a size guard before putting serialized tool specs into env.
AGENTA_TOOL_SPECScan grow enough to hit OS env-size limits, causing hard-to-debug process launch failures. A small preflight check with a clear error path will make this much safer.[recommendation]
Proposed patch
const env: EnvVariable[] = [ - { name: "AGENTA_TOOL_SPECS", value: JSON.stringify(specs) }, + { name: "AGENTA_TOOL_SPECS", value: JSON.stringify(specs) }, { name: "AGENTA_TOOL_CALLBACK_ENDPOINT", value: callback.endpoint }, ]; + const specsEnv = env[0].value; + // Keep a conservative cap to avoid E2BIG/argv+env spawn failures. + if (Buffer.byteLength(specsEnv, "utf8") > 64 * 1024) { + process.stderr.write("[tool-bridge] tool payload too large for env transport\n"); + return []; + }
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 8f922ea4-dca7-4f22-833d-c7bf66ea68df
⛔ Files ignored due to path filters (1)
services/agent/pnpm-lock.yamlis excluded by!**/pnpm-lock.yaml
📒 Files selected for processing (31)
.gitignoredocs/design/agent-workflows/README.mddocs/design/agent-workflows/wp-8-rivet-acp-runtime/README.mddocs/design/agent-workflows/wp-8-rivet-acp-runtime/architecture.mddocs/design/agent-workflows/wp-8-rivet-acp-runtime/context.mddocs/design/agent-workflows/wp-8-rivet-acp-runtime/isolation-and-fork.mddocs/design/agent-workflows/wp-8-rivet-acp-runtime/plan.mddocs/design/agent-workflows/wp-8-rivet-acp-runtime/poc/build_rivet_snapshot.pydocs/design/agent-workflows/wp-8-rivet-acp-runtime/poc/commit_agent_config.pydocs/design/agent-workflows/wp-8-rivet-acp-runtime/poc/debug-events.tsdocs/design/agent-workflows/wp-8-rivet-acp-runtime/poc/dump-full.tsdocs/design/agent-workflows/wp-8-rivet-acp-runtime/poc/package.jsondocs/design/agent-workflows/wp-8-rivet-acp-runtime/poc/spike.tsdocs/design/agent-workflows/wp-8-rivet-acp-runtime/research.mddocs/design/agent-workflows/wp-8-rivet-acp-runtime/status.mdhosting/docker-compose/ee/docker-compose.dev.ymlservices/agent/docker/Dockerfile.devservices/agent/package.jsonservices/agent/scripts/build-extension.mjsservices/agent/src/agenta-otel.tsservices/agent/src/cli.tsservices/agent/src/piExtension.tsservices/agent/src/runPi.tsservices/agent/src/runRivet.tsservices/agent/src/server.tsservices/agent/src/toolBridge.tsservices/agent/src/toolBridgeServer.tsservices/oss/src/agent.pyservices/oss/src/agent_pi/ports.pyservices/oss/src/agent_pi/rivet_harness.pyservices/oss/src/agent_pi/schemas.py
|
|
||
| const cwd = mkdtempSync(join(tmpdir(), "wp8-dbg-")); | ||
| writeFileSync(join(cwd, "AGENTS.md"), "You are concise.\n", "utf-8"); | ||
| const env: Record<string,string> = { PATH: `${binDir}:/home/mahmoud/.local/bin:${process.env.PATH ?? ""}`, PI_ACP_PI_COMMAND: join(binDir,"pi"), PI_CODING_AGENT_DIR: join(process.env.HOME??"",".pi/agent"), SANDBOX_AGENT_BIN: BIN, HOME: process.env.HOME??"" }; |
There was a problem hiding this comment.
Remove hardcoded personal developer path before commit.
Line 15 contains /home/mahmoud/.local/bin, a personal home directory that should not be baked into committed code. Replace with an environment-relative or system-standard path, or remove the hardcoded segment entirely and let PATH resolution handle it.
🗑️ Proposed fix to remove personal path
-const env: Record<string,string> = { PATH: `${binDir}:/home/mahmoud/.local/bin:${process.env.PATH ?? ""}`, PI_ACP_PI_COMMAND: join(binDir,"pi"), PI_CODING_AGENT_DIR: join(process.env.HOME??"",".pi/agent"), SANDBOX_AGENT_BIN: BIN, HOME: process.env.HOME??"" };
+const env: Record<string,string> = { PATH: `${binDir}:${process.env.PATH ?? ""}`, PI_ACP_PI_COMMAND: join(binDir,"pi"), PI_CODING_AGENT_DIR: join(process.env.HOME??"",".pi/agent"), SANDBOX_AGENT_BIN: BIN, HOME: process.env.HOME??"" };📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const env: Record<string,string> = { PATH: `${binDir}:/home/mahmoud/.local/bin:${process.env.PATH ?? ""}`, PI_ACP_PI_COMMAND: join(binDir,"pi"), PI_CODING_AGENT_DIR: join(process.env.HOME??"",".pi/agent"), SANDBOX_AGENT_BIN: BIN, HOME: process.env.HOME??"" }; | |
| const env: Record<string,string> = { PATH: `${binDir}:${process.env.PATH ?? ""}`, PI_ACP_PI_COMMAND: join(binDir,"pi"), PI_CODING_AGENT_DIR: join(process.env.HOME??"",".pi/agent"), SANDBOX_AGENT_BIN: BIN, HOME: process.env.HOME??"" }; |
Source: Coding guidelines
| const BIN = join(here, "node_modules/.pnpm/@sandbox-agent+cli-linux-x64@0.4.2/node_modules/@sandbox-agent/cli-linux-x64/bin/sandbox-agent"); | ||
| const cwd = mkdtempSync(join(tmpdir(), "wp8-dump-")); | ||
| writeFileSync(join(cwd, "AGENTS.md"), "You are concise.\n", "utf-8"); | ||
| const env: Record<string,string> = { PATH: `${binDir}:/home/mahmoud/.local/bin:${process.env.PATH ?? ""}`, PI_ACP_PI_COMMAND: join(binDir,"pi"), PI_CODING_AGENT_DIR: join(process.env.HOME??"",".pi/agent"), SANDBOX_AGENT_BIN: BIN, HOME: process.env.HOME??"" }; |
There was a problem hiding this comment.
Remove hardcoded personal developer path before commit.
Line 13 contains /home/mahmoud/.local/bin, identical to the issue in debug-events.ts. This personal home directory must not be committed.
🗑️ Proposed fix to remove personal path
-const env: Record<string,string> = { PATH: `${binDir}:/home/mahmoud/.local/bin:${process.env.PATH ?? ""}`, PI_ACP_PI_COMMAND: join(binDir,"pi"), PI_CODING_AGENT_DIR: join(process.env.HOME??"",".pi/agent"), SANDBOX_AGENT_BIN: BIN, HOME: process.env.HOME??"" };
+const env: Record<string,string> = { PATH: `${binDir}:${process.env.PATH ?? ""}`, PI_ACP_PI_COMMAND: join(binDir,"pi"), PI_CODING_AGENT_DIR: join(process.env.HOME??"",".pi/agent"), SANDBOX_AGENT_BIN: BIN, HOME: process.env.HOME??"" };📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const env: Record<string,string> = { PATH: `${binDir}:/home/mahmoud/.local/bin:${process.env.PATH ?? ""}`, PI_ACP_PI_COMMAND: join(binDir,"pi"), PI_CODING_AGENT_DIR: join(process.env.HOME??"",".pi/agent"), SANDBOX_AGENT_BIN: BIN, HOME: process.env.HOME??"" }; | |
| const env: Record<string,string> = { PATH: `${binDir}:${process.env.PATH ?? ""}`, PI_ACP_PI_COMMAND: join(binDir,"pi"), PI_CODING_AGENT_DIR: join(process.env.HOME??"",".pi/agent"), SANDBOX_AGENT_BIN: BIN, HOME: process.env.HOME??"" }; |
Source: Coding guidelines
| AGENTA_HOST: ${AGENTA_HOST:-http://144.76.237.122:8280} | ||
| AGENTA_API_KEY: ${AGENTA_API_KEY:-} |
There was a problem hiding this comment.
Remove the hardcoded public AGENTA_HOST default to prevent accidental data egress.
Line 446 defaults to http://144.76.237.122:8280, which can send trace payloads to an external endpoint when env is unset. That is a privacy/security footgun for local dev stacks.
Suggested fix
- AGENTA_HOST: ${AGENTA_HOST:-http://144.76.237.122:8280}
+ AGENTA_HOST: ${AGENTA_HOST:-}| callRef: string; | ||
| } | ||
|
|
||
| const TOOL_CALL_TIMEOUT_MS = Number(process.env.AGENTA_AGENT_TOOL_CALL_TIMEOUT_MS ?? 30000); |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
find . -name "piExtension.ts" -type fRepository: Agenta-AI/agenta
Length of output: 96
🏁 Script executed:
cat -n ./services/agent/src/piExtension.ts | head -80Repository: Agenta-AI/agenta
Length of output: 3861
🏁 Script executed:
cat -n ./services/agent/src/piExtension.ts | sed -n '38p;58p;59p;60p;61p'Repository: Agenta-AI/agenta
Length of output: 408
🌐 Web query:
AbortSignal.timeout() behavior with NaN milliseconds
💡 Result:
The behavior of passing NaN to AbortSignal.timeout is governed by its WebIDL definition, which uses the [EnforceRange] extended attribute on the milliseconds argument [1][2][3]. When a method argument is marked with [EnforceRange], passing a value that cannot be converted to an integer—such as NaN—results in a TypeError being thrown immediately. Consequently, AbortSignal.timeout(NaN) does not return a signal or create a timer; it throws a TypeError [1][2]. In general, the milliseconds argument for AbortSignal.timeout must be a non-negative number within the range of 0 to Number.MAX_SAFE_INTEGER [1][2]. Values outside this range or invalid types (that cannot be coerced into valid integers via EnforceRange) will cause the function to throw.
Citations:
- 1: https://developer.mozilla.org/en-US/docs/Web/API/AbortSignal/timeout_static
- 2: https://github.com/mdn/content/blob/main/files/en-us/web/api/abortsignal/timeout_static/index.md
- 3: Add AbortSignal.timeout(ms) whatwg/dom#951
Add timeout validation to prevent runtime TypeError on invalid env values.
At Line 38, a non-numeric AGENTA_AGENT_TOOL_CALL_TIMEOUT_MS yields NaN; at Line 58 this fails because AbortSignal.timeout(NaN) throws a TypeError due to the [EnforceRange] WebIDL constraint.
Proposed fix
-const TOOL_CALL_TIMEOUT_MS = Number(process.env.AGENTA_AGENT_TOOL_CALL_TIMEOUT_MS ?? 30000);
+const parsedTimeout = Number(process.env.AGENTA_AGENT_TOOL_CALL_TIMEOUT_MS);
+const TOOL_CALL_TIMEOUT_MS =
+ Number.isFinite(parsedTimeout) && parsedTimeout > 0 ? parsedTimeout : 30000;| /** Continue a prior run by replaying its history. The rivet backend resumes by id. */ | ||
| sessionId?: string; |
There was a problem hiding this comment.
sessionId contract is currently not honored by the rivet runtime path.
Line 91 says resume-by-id is supported, but services/agent/src/runRivet.ts:531-698 never reads request.sessionId and always starts with a fresh in-memory persist driver. This makes the advertised continuation behavior incorrect for callers.
Please either wire request.sessionId into session restore logic or remove/soften this contract until restore is implemented.
| // Adapters (pi-acp, claude-agent-acp) and the pi CLI live in our node_modules/.bin; | ||
| // claude CLI is on the inherited PATH. Prepend ours, keep the inherited PATH. | ||
| const extra = process.env.AGENTA_RIVET_ADAPTER_PATH; | ||
| env.PATH = [ADAPTER_BIN_DIR, extra, process.env.PATH].filter(Boolean).join(":"); |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# Check if the file exists and read the context around line 193
head -200 services/agent/src/runRivet.ts | tail -30Repository: Agenta-AI/agenta
Length of output: 1474
🏁 Script executed:
# Also check the imports at the top of the file
head -30 services/agent/src/runRivet.tsRepository: Agenta-AI/agenta
Length of output: 1448
🏁 Script executed:
# Search for other PATH-related assignments in the codebase
rg "env\.PATH\s*=" --type ts --type js -B 2 -A 2Repository: Agenta-AI/agenta
Length of output: 513
🌐 Web query:
node:path delimiter function cross-platform Windows Unix PATH separator
💡 Result:
In Node.js, the path module provides two distinct properties to handle platform-specific separators, which are essential for writing cross-platform code [1][2]. 1. path.delimiter: This property provides the platform-specific delimiter used to separate multiple paths in environment variables like PATH [1][3]. - On Windows, it is a semicolon (;) [1][2]. - On POSIX (Unix-like systems), it is a colon (:) [1][2]. 2. path.sep: This property provides the platform-specific path segment separator used within file paths (e.g., between directory names) [1][2]. - On Windows, it is a backslash () [1][3]. - On POSIX, it is a forward slash (/) [1][2]. While Windows accepts both forward slashes (/) and backslashes () as valid path separators in many contexts, Node.js path methods exclusively use the platform-specific separator (backslash on Windows) when generating or joining paths [1][2]. For code intended to run across different operating systems, using these properties ensures correct behavior for the host environment [1][4].
Citations:
- 1: https://nodejs.org/api/path.html
- 2: https://nodejs.org/docs/v22.14.0/api/path.html
- 3: https://nodejs.org/docs/v23.4.0/api/path.html
- 4: https://github.com/ehmicky/cross-platform-node-guide/blob/main/docs/3_filesystem/file_paths.md
🏁 Script executed:
# Get the imports section more carefully
rg "import.*from.*node:path" services/agent/src/runRivet.ts -A 5 -B 5Repository: Agenta-AI/agenta
Length of output: 440
🏁 Script executed:
# If not found in that way, check the first 50 lines more carefully
head -50 services/agent/src/runRivet.ts | cat -nRepository: Agenta-AI/agenta
Length of output: 2475
🏁 Script executed:
# Check for other potential hardcoded PATH delimiters (colon or semicolon in PATH context)
rg "join\s*\(\s*['\"][:;]['\"]" --type ts --type jsRepository: Agenta-AI/agenta
Length of output: 472
Use OS-aware PATH delimiter instead of hardcoded colon.
At line 193, the PATH environment variable is joined with a hardcoded ":", which breaks on Windows (which uses ";" as the PATH delimiter). This breaks subprocess execution on Windows despite win32 CLI packages being explicitly supported.
Proposed fix
-import { dirname, join } from "node:path";
+import { delimiter, dirname, join } from "node:path";
...
- env.PATH = [ADAPTER_BIN_DIR, extra, process.env.PATH].filter(Boolean).join(":");
+ env.PATH = [ADAPTER_BIN_DIR, extra, process.env.PATH].filter(Boolean).join(delimiter);📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| env.PATH = [ADAPTER_BIN_DIR, extra, process.env.PATH].filter(Boolean).join(":"); | |
| import { delimiter, dirname, join } from "node:path"; | |
| ... | |
| env.PATH = [ADAPTER_BIN_DIR, extra, process.env.PATH].filter(Boolean).join(delimiter); |
| export async function runRivet(request: AgentRunRequest): Promise<AgentRunResult> { | ||
| const harness = request.harness || process.env.AGENTA_AGENT_HARNESS || "pi"; | ||
| const sandboxId = request.sandbox || process.env.AGENTA_AGENT_SANDBOX || "local"; |
There was a problem hiding this comment.
sessionId continuation is not wired, so resume semantics are currently broken.
request.sessionId is never consumed, a fresh in-memory persist driver is created per invoke, and the response always returns a newly created session.id. This breaks the new continuation contract across harness boundaries and can fragment conversation/session lineage.
Also applies to: 586-587, 627-631, 684-685
| process.on("uncaughtException", (err) => { | ||
| process.stderr.write(`[pi-wrapper] uncaughtException: ${err.stack ?? err.message}\n`); | ||
| }); |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# Check if the file exists and view the code around lines 85-87
if [ -f "services/agent/src/server.ts" ]; then
echo "=== File exists ==="
wc -l "services/agent/src/server.ts"
echo ""
echo "=== Code at lines 80-95 ==="
sed -n '80,95p' "services/agent/src/server.ts"
else
echo "File not found at services/agent/src/server.ts"
# Try to find it
echo "Searching for server.ts files..."
find . -name "server.ts" -type f 2>/dev/null | head -20
fiRepository: Agenta-AI/agenta
Length of output: 587
🏁 Script executed:
# Also search for other uncaughtException handlers in the codebase
echo "=== Searching for uncaughtException handlers ==="
rg "uncaughtException" -n --context 2Repository: Agenta-AI/agenta
Length of output: 447
🌐 Web query:
Node.js uncaughtException handler best practices should process exit
💡 Result:
The official Node.js documentation and industry best practices strongly advise that the 'uncaughtException' event should be used only as a last resort to perform synchronous cleanup of resources before shutting down the process [1][2]. Key best practices include: 1. Never attempt to resume normal operation: An uncaught exception places the application in an undefined state [1][3]. Continuing execution after an uncaught exception is widely recognized as an anti-pattern that can lead to data corruption, memory leaks, or unpredictable behavior because the application's internal state (such as database connection pools or in-flight requests) may be broken [4]. 2. Use the handler for cleanup only: If you implement an 'uncaughtException' listener, its primary purpose should be to perform synchronous cleanup of allocated resources, such as closing file descriptors or database handles [1][3]. 3. Terminate the process: After cleanup, you must terminate the process [4]. By default, Node.js prints the stack trace and exits with code 1 [1][2]. If you add a listener, you override this behavior and the process will exit with 0 unless you explicitly manage the exit [1][3]. To maintain correct signaling to external monitoring tools, you should explicitly call process.exit(1) after your cleanup logic [4]. 4. Rely on external process managers: Since the process should always be terminated after an uncaught exception, you should use an external process manager (e.g., PM2, Forever, or container orchestration like Kubernetes) to monitor the application and automatically restart it in a clean state [1][5][4]. 5. Do not use as a generic error handler: 'uncaughtException' is not a mechanism for normal error handling. Expected errors should be managed via try/catch blocks, error-first callbacks, or promise rejections [4]. In summary, the correct pattern is to log the error with sufficient context, perform necessary synchronous cleanup, and then call process.exit(1) to trigger a restart by your process manager [4].
Citations:
- 1: https://nodejs.org/api/process.html
- 2: https://nodejs.org/dist/latest/docs/api/process.html
- 3: https://nodejs.org/api/process.md
- 4: https://krun.pro/nodejs-uncaught-exception-handling/
- 5: https://github.com/goldbergyoni/nodebestpractices/blob/c0b71ccf/sections/errorhandling/shuttingtheprocess.md
Process must terminate after uncaughtException to prevent undefined state.
The current handler at lines 85-87 only logs and continues. Per Node.js best practices, uncaught exceptions place the process in an undefined state, and continuing execution is an anti-pattern that risks data corruption, memory leaks, and cascading failures. The process must explicitly terminate and rely on external process managers for restart.
Safer pattern
process.on("uncaughtException", (err) => {
process.stderr.write(`[pi-wrapper] uncaughtException: ${err.stack ?? err.message}\n`);
+ process.exitCode = 1;
+ setImmediate(() => process.exit(1));
});📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| process.on("uncaughtException", (err) => { | |
| process.stderr.write(`[pi-wrapper] uncaughtException: ${err.stack ?? err.message}\n`); | |
| }); | |
| process.on("uncaughtException", (err) => { | |
| process.stderr.write(`[pi-wrapper] uncaughtException: ${err.stack ?? err.message}\n`); | |
| process.exitCode = 1; | |
| setImmediate(() => process.exit(1)); | |
| }); |
| callRef: string; | ||
| } | ||
|
|
||
| const SPECS: ToolSpec[] = JSON.parse(process.env.AGENTA_TOOL_SPECS ?? "[]"); |
There was a problem hiding this comment.
Avoid process crash on malformed AGENTA_TOOL_SPECS.
At Line 26, JSON.parse at module scope can terminate the bridge process before it serves requests. Parse defensively and fallback to [].
Proposed fix
-const SPECS: ToolSpec[] = JSON.parse(process.env.AGENTA_TOOL_SPECS ?? "[]");
+let SPECS: ToolSpec[] = [];
+try {
+ const parsed = JSON.parse(process.env.AGENTA_TOOL_SPECS ?? "[]");
+ SPECS = Array.isArray(parsed) ? parsed : [];
+} catch {
+ SPECS = [];
+}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const SPECS: ToolSpec[] = JSON.parse(process.env.AGENTA_TOOL_SPECS ?? "[]"); | |
| let SPECS: ToolSpec[] = []; | |
| try { | |
| const parsed = JSON.parse(process.env.AGENTA_TOOL_SPECS ?? "[]"); | |
| SPECS = Array.isArray(parsed) ? parsed : []; | |
| } catch { | |
| SPECS = []; | |
| } |
| const ENDPOINT = process.env.AGENTA_TOOL_CALLBACK_ENDPOINT ?? ""; | ||
| const AUTH = process.env.AGENTA_TOOL_CALLBACK_AUTH; | ||
| const SPEC_BY_NAME = new Map(SPECS.map((s) => [s.name, s])); | ||
| const TOOL_CALL_TIMEOUT_MS = Number(process.env.AGENTA_AGENT_TOOL_CALL_TIMEOUT_MS ?? 30000); |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
fd -t f "toolBridgeServer.ts" --exec wc -l {}Repository: Agenta-AI/agenta
Length of output: 105
🏁 Script executed:
head -80 services/agent/src/toolBridgeServer.ts | cat -nRepository: Agenta-AI/agenta
Length of output: 3790
🏁 Script executed:
# Verify AbortSignal.timeout behavior with NaN
python3 << 'PY'
# Check what AbortSignal.timeout does with NaN
# According to ECMAScript spec, AbortSignal.timeout(delay) expects:
# - delay to be a number >= 0
# - If NaN is passed, it would typically be coerced or throw
# The concern is valid - NaN is not a valid timeout value
print("AbortSignal.timeout() spec requirements:")
print("- Expects a non-negative number")
print("- NaN would violate this contract")
print("- Current code can produce NaN if env var is set to non-numeric string")
print("\nExample issue scenarios:")
print("1. AGENTA_AGENT_TOOL_CALL_TIMEOUT_MS='invalid' -> Number('invalid') = NaN -> AbortSignal.timeout(NaN) fails")
print("2. AGENTA_AGENT_TOOL_CALL_TIMEOUT_MS='-500' -> Number('-500') = -500 -> AbortSignal.timeout(-500) likely fails (negative)")
PYRepository: Agenta-AI/agenta
Length of output: 494
Guard timeout env parsing to prevent NaN at runtime.
Line 30 can produce NaN if AGENTA_AGENT_TOOL_CALL_TIMEOUT_MS is set to a non-numeric value. This would cause AbortSignal.timeout() at line 60 to fail. Validate that the parsed timeout is a positive finite number before use.
Proposed fix
-const TOOL_CALL_TIMEOUT_MS = Number(process.env.AGENTA_AGENT_TOOL_CALL_TIMEOUT_MS ?? 30000);
+const parsedTimeout = Number(process.env.AGENTA_AGENT_TOOL_CALL_TIMEOUT_MS);
+const TOOL_CALL_TIMEOUT_MS =
+ Number.isFinite(parsedTimeout) && parsedTimeout > 0 ? parsedTimeout : 30000;📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const TOOL_CALL_TIMEOUT_MS = Number(process.env.AGENTA_AGENT_TOOL_CALL_TIMEOUT_MS ?? 30000); | |
| const parsedTimeout = Number(process.env.AGENTA_AGENT_TOOL_CALL_TIMEOUT_MS); | |
| const TOOL_CALL_TIMEOUT_MS = | |
| Number.isFinite(parsedTimeout) && parsedTimeout > 0 ? parsedTimeout : 30000; |
|
Superseded. Replacing the path-based stack with PRs sliced by functional area showing final code only, so reviewers don't comment on intermediate scaffolding that a later PR rewrites. See the new set. |
This PR is part of a stack. Review bottom-up.
Each PR's diff is only its own delta. Merge from the bottom. This PR's base is #4759 (merge that first).
Context
This adds a second way to run a coding harness behind the same
Harnessport: drive it over the Agent Client Protocol (ACP) through a rivetsandbox-agentdaemon, instead of the in-process Pi SDK calls. The harness (pi/claude) and the sandbox (local/daytona) become two independent config axes the caller sets per run, so switching engine or environment needs no redeploy and no new code path. Base branch:feat/agent-tools-wp7. This is slice #3 indocs/design/agent-workflows/pr-stack.md(runner streaming boundary, the second runtime path).What this changes
A new
AGENTA_AGENT_RUNTIME=rivetselects the rivet path; the default stayspiso the legacy in-process route never regresses._build_harness()gained two args. Before, it returned a Pi adapter from env alone. Now, when the runtime isrivet, it returns aRivetHarnesscarrying the chosenharnessandsandbox, taken from the request config first and theAGENTA_AGENT_HARNESS/AGENTA_AGENT_SANDBOXenv defaults second.harness(enum pi/claude) andsandbox(enum local/daytona) fields, so a playground run can switch either axis.harness,sandbox,sessionId, andsecrets; everything else (agentsMd, model, messages, tools, customTools, toolCallback, trace) stays identical, so the Python side stays thin.auto: a request that carriesharness/sandboxgoes torunRivet, everything else torunPi, so one sidecar serves both.runRivet.tsis the new driver. Per invoke it starts a cold sandbox, creates an ACP session for the chosen harness, writesAGENTS.mdinto the cwd, sends the turn, accumulates the streamed text, and tears the sandbox down.agenta-otel.ts createRivetOtel), so the run nests under the caller's/invokespan, the same shape the in-process path produces./invokeworkflow span. The harness span tree exports in a separate OTLP batch, so Agenta's per-batch roll-up cannot bridge the totals; the service stampsgen_ai.usage.*directly on the workflow span instead.Key architectural decision to review
The most important one is the split tracing strategy in
runRivet.tsandagenta-otel.ts, keyed onemitSpans: !isPi || isDaytona(runRivet.ts:646). Local Pi self-instruments: the Agenta extension propagates the traceparent and emits real turn/chat/tool spans from inside Pi, so the runner only accumulates output. Every other case (any non-Pi harness, or Pi on Daytona where the in-sandbox process cannot reach Agenta's OTLP) builds the span tree in the runner from the ACPsession/updatenotifications. The tradeoff: the runner's reconstructed tree is an approximation of what the harness actually did (one chat span per turn, tool spans pertool_call), not the harness's own instrumentation, but it makes tracing uniform across harnesses that do not self-instrument. Confirm the two paths produce a comparable tree and that nothing double-counts when Pi runs locally.Second, look at secret resolution and auth fallback.
_resolve_harness_secrets(agent.py:154) fetches the project vault'sprovider_keysecrets over HTTP and maps each to its standard env var, because the SDK's per-request secret context does not reach this custom route. When a provider key is present the harness authenticates with it; when absent it falls back to its own login (local Codex OAuth, or an uploadedauth.jsonon Daytona). Check that an empty vault degrades to OAuth rather than failing, and that keys never get baked into a Daytona image.How to review this PR
services/oss/src/agent.pyfirst. Read_build_harnessto see the runtime/harness/sandbox selection, then_resolve_harness_secretsand_record_usage. This is where the two axes and the usage roll-up are decided.services/oss/src/agent_pi/rivet_harness.py. It is a thin adapter: same port, two transports (HTTP sidecar, local subprocess), envelope-building only.services/agent/src/runRivet.tsfor the real driver. ReadrunRivetend to end (the cold-per-invoke lifecycle), thenbuildSandboxProvider(local vs Daytona),onPermissionRequest(headless auto-approve), andapplyModel(model rejection handling).services/agent/src/agenta-otel.tshandleUpdatefor the event-to-span mapping. Note the Pi-vs-Claude chunk handling: Pi streams pure deltas, Claude streams deltas plus a cumulative snapshot, so the accumulator replaces when a chunk is a superset and appends otherwise (agenta-otel.ts:759).docs/POC files andpnpm-lock.yaml; they are reference material and the dependency lock.The regression most likely to break is the default
piruntime. The whole point of theruntime/auto-backend gating is that a deployment withoutAGENTA_AGENT_RUNTIME=rivetand a request withoutharness/sandboxstill takes the exact in-process Pi path. Verify that selection logic inagent.py:66andserver.ts:29.Tests / notes
This path was host-verified driving pi and claude over ACP via
sandbox-agent, with the event-stream trace nesting under/invoke. Watch two operational facts on Daytona: the run uses a pre-baked snapshot that carries the rivet daemon plus thepiCLI (the rivet-fullimage lackspi), and the runner installs a cookie-jarfetchso the Daytona preview proxy does not 401 later ACP requests. The server now swallows backgroundunhandledRejection/uncaughtExceptionfrom the rivet SDK so one failing run cannot take down every in-flight request on the shared sidecar.