Skip to content

fix(codex): writable_roots TOML override + fallback to worktree-relative decision path#188

Merged
jacsamell merged 1 commit into
mainfrom
fix/codex-writable-roots-and-fallback-prompt
May 19, 2026
Merged

fix(codex): writable_roots TOML override + fallback to worktree-relative decision path#188
jacsamell merged 1 commit into
mainfrom
fix/codex-writable-roots-and-fallback-prompt

Conversation

@jacsamell
Copy link
Copy Markdown
Contributor

@jacsamell jacsamell commented May 19, 2026

Two-layer fix for codex judges silently failing to write decision files.

Belt: -c 'sandbox_workspace_write.writable_roots=["<main_repo>"]' to let codex's file_change tool write to the canonical path. --add-dir (used in PR #187) doesn't grant write to file_change in codex 0.128.0 — verified empirically.

Suspenders: decision-file prompt now lists a fallback relative path (.prompts/decisions/... resolved against the judge's worktree CWD). find_decision_file already scans worktrees and copies to canonical. If writable_roots ever breaks in a future codex version, judges still write somewhere cube picks up.

Net: codex judges should now reliably persist decisions. Tested live: a manual codex exec with the TOML override successfully wrote a test file to the main repo's .prompts/decisions/.

Summary

This PR addresses codex judge failures to persist decision files by implementing a two-layer fix:

Layer 1 — TOML Override: Replaces the ineffective --add-dir approach with a -c sandbox_workspace_write.writable_roots=["<project_root>"] TOML configuration override in CodexAdapter._run_once(). This grants the codex file_change tool direct write access to the main repository's canonical decision path (.prompts/decisions/).

Layer 2 — Fallback Path: Updates decision-file prompt instructions in prompts.py to advertise both the primary canonical absolute path and a fallback worktree-relative path. Since find_decision_file already scans worktrees and copies files to the canonical location, judges will still persist decisions even if writable_roots fails in future codex versions.

Testing: Test assertions in test_adapters.py updated to validate the new -c writable_roots configuration pattern instead of --add-dir. Manual testing confirmed successful file writes to the canonical path with the TOML override.

Net Effect: Decision files are reliably persisted to either the canonical location (direct write) or a worktree-relative fallback that cube automatically discovers, with no duplicate writes.

Review Change Stack

…ive decision path

Two-layer fix for codex judges failing to write decision files:

1. Belt: extend sandbox writable_roots to the main repo via -c TOML override so codex's file_change tool can write to <main>/.prompts/decisions/ directly. --add-dir DOESN'T work for file_change in codex 0.128.0 (only for shell commands); writable_roots does.

2. Suspenders: update the decision-file prompt to advertise a fallback path '.prompts/decisions/...' relative to CWD (the worktree). find_decision_file already scans worktrees and copies to canonical. So if writable_roots ever stops working in a future codex version, judges still land their decisions somewhere cube picks up.

Either landing place works; primary avoids the copy step.
@jacsamell jacsamell merged commit 67897ff into main May 19, 2026
@jacsamell jacsamell deleted the fix/codex-writable-roots-and-fallback-prompt branch May 19, 2026 23:36
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 19, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 33e31136-47a6-46c1-8d18-2936c63e62c6

📥 Commits

Reviewing files that changed from the base of the PR and between 9ff337b and 6f4a72f.

📒 Files selected for processing (3)
  • python/cube/automation/prompts.py
  • python/cube/core/adapters/codex.py
  • tests/cli/test_adapters.py

Walkthrough

The PR switches Codex sandbox write-permission configuration from a --add-dir flag to a -c sandbox_workspace_write.writable_roots=[] TOML override. Decision-file instructions now document canonical and fallback paths, and test assertions validate the new command structure.

Changes

Sandbox Write-Permission Configuration

Layer / File(s) Summary
Decision-file creation instructions with fallback path guidance
python/cube/automation/prompts.py
Decision-file template specifies a canonical absolute path and explicit fallback relative path for restricted sandboxes, noting that Cube automatically populates the canonical location during worktree operations.
Codex adapter sandbox write-permission configuration
python/cube/core/adapters/codex.py
When worktree differs from project root, CodexAdapter._run_once now extends the codex exec command with -c sandbox_workspace_write.writable_roots=["<project_root>"] to replace the prior --add-dir approach.
Test assertions for sandbox configuration override
tests/cli/test_adapters.py
test_codex_adapter_run_command and test_codex_adapter_resume_command are updated to assert the -c configuration override with writable_roots instead of --add-dir when worktree differs from project root.

Estimated Code Review Effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly Related PRs

  • aetheronhq/agent-cube#171: Modifies CodexAdapter._run_once and resume fallback behaviour in the same integration layer, sharing the same adapter command construction concerns.

Poem

🐰 A sandbox permission dance,
From --add-dir to config override chance,
Paths canonical, fallbacks in place,
Tests validate the whole command space!


Comment @coderabbitai help to get the list of available commands and usage tips.

jacsamell added a commit that referenced this pull request May 23, 2026
Wire @openai/codex-sdk and @cursor/sdk through the bridge. cube users
can now opt into the SDK adapters with `cli: codex-sdk` or
`cli: cursor-sdk` per judge. Default unchanged (cli: codex / cursor
still use the legacy CLI adapters).

## Codex provider (sdk-bridge/src/providers/codex.ts)
- @openai/codex-sdk@0.133.0 pinned
- Codex.startThread + thread.runStreamed drives the run
- Translates ThreadEvent stream → Claude-shape JSON lines so cube's
  existing parse_stream_message handles them unchanged:
    thread.started     → system:init (carries thread_id as session_id)
    item.completed     → assistant (text or thinking blocks)
    item.* (file_change|command_execution|mcp_tool_call|web_search)
                       → tool_use
    turn.completed     → result:success
    turn.failed        → error + result:error_during_execution
- Model alias table mirrored from codex.py (gpt-5.5, gpt-5.5-codex,
  gpt-5.5-high/medium/low/extra-high)
- writable_roots (PR #188's TOML hack) → SDK's `additionalDirectories`.
  Python adapter passes PROJECT_ROOT via --additional-dir when worktree
  differs from main repo, so judges still land decision files there.
- Stale-rollout fallback: codex SDK throws plain Error with stderr
  text when resume fails. isStaleRolloutError() substring-matches the
  same phrasings codex.py uses today; on hit, emit a synthetic
  system:stale_session note and fall back to a fresh thread (instead
  of the operator-blocking hard crash).
- env: { ...process.env } stripped of undefineds — codex SDK's env
  option REPLACES process env entirely if set (same footgun as Claude).

## Cursor provider (sdk-bridge/src/providers/cursor.ts)
- @cursor/sdk@1.0.13 pinned
- IMPORTANT: cursor SDK runs IN-PROCESS, not as a subprocess wrapper
  like Claude/Codex. The agent runtime lives inside our Node process.
  Per-judge isolation is preserved because we spawn one Node process
  per judge anyway (concurrency stays in Python asyncio.gather).
- Agent.create({ local: { cwd, sandboxOptions: { enabled: true } } })
  for fresh runs; Agent.resume(id, ...) for resume.
- Translates Cursor's SDKMessage (system/assistant/thinking/tool_call/
  status) → Claude-shape JSON lines.
- Cursor has real typed exceptions (AuthenticationError, RateLimitError,
  NetworkError, UnknownAgentError, AgentBusyError) all with isRetryable
  flag. categoriseError() maps them to cube-friendly category tags.
- Model alias: cursor-composer/cursor-composer-2.5/cursor → composer-2.5
  (Cursor SDK requires explicit model id for local mode).
- Auth via CURSOR_API_KEY env var (SDK reads it directly).

## CLI surface
- New --additional-dir flag (repeatable string array). Codex provider
  consumes; Claude/Cursor providers ignore (declared in their RunOptions
  for shared type compat).

## Python adapter changes (sdk_bridge.py)
- Added _additional_dirs(worktree) — returns [PROJECT_ROOT] for the
  codex provider when worktree != project root, [] otherwise.
- check_installed() now also verifies sdk-bridge/node_modules exists,
  since dist/cli.js externalises the vendor SDKs (see build change
  below). Surfaces missing deps at adapter-lookup time instead of as
  a cryptic "Cannot find module" inside the spawned Node process.

## Build change — externalised vendor SDKs
- bun build now invoked with --external @anthropic-ai/claude-agent-sdk
  --external @openai/codex-sdk --external @cursor/sdk --external yargs.
- Cursor SDK alone bundles to ~10MB (in-process agent runtime). Inlining
  all three would commit ~10MB+ binary blob churning on every rebuild.
- dist/cli.js now 13KB (was 1.14MB in Phase 1). Vendor SDKs resolve
  from sdk-bridge/node_modules at runtime — requires `bun install`
  once per checkout. Documented in get_install_instructions().

## Tests
- 2 new tests in test_sdk_bridge_adapter.py — codex receives
  --additional-dir <PROJECT_ROOT> when worktree differs; claude does
  not. 12/12 sdk-bridge tests pass. 117/117 in tests/cli/ pass.
- bun run typecheck: clean.
- Real smoke test: `node dist/cli.js --provider claude --prompt-file
  <real-prompt>` reaches Anthropic's API end-to-end (returned 400
  on empty prompt input — expected). `--provider codex` spawned the
  codex binary and surfaced its empty-stdin complaint as a translated
  error event.

Phase 3 dogfood is the next step — needs real-world runs against PRs
to validate error categorisation parity vs the legacy adapters.

Refs: docs/plans/2026-05-cli-to-sdk-migration.md
      docs/plans/impl/phase-2-codex-and-cursor-providers.md

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
jacsamell added a commit that referenced this pull request May 24, 2026
THE root cause of judge_4 (codex-sdk) failing on every cube prv 1388
across multiple days of Phase 3 dogfood. Codex's sandbox was telling
the truth: `additionalDirectories` maps to `--add-dir <path>` which
grants READ access only. WRITES to those paths get "Operation not
permitted" — codex literally cannot write to the main repo from a
worktree CWD.

PR #188 fixed this for the legacy codex.py CLI adapter by adding
`-c sandbox_workspace_write.writable_roots=[<project_root>]` via
`--config` overrides. The SDK provider never got the port.

The @openai/codex-sdk's Codex({...}) constructor accepts a `config`
field documented as: "Additional `--config key=value` overrides ...
Provide a JSON object and the SDK will flatten it into dotted paths
and serialize values as TOML literals."

Passing { sandbox_workspace_write: { writable_roots: [...] } } in
the config object reaches the codex CLI as
`-c sandbox_workspace_write.writable_roots=[...]` — same effective
shape as the legacy adapter.

Keeping `additionalDirectories` on startThread too — read access
doesn't hurt and may be the correct SDK mechanism in future versions.

Evidence: ~/.cube/logs/judge-judge_4-pr-1388-peer-review-1779582290.json
Codex's exact words: "I cannot write the exact canonical path from this
session. The sandbox blocks it with `Operation not permitted` because
`/Users/jacob/dev/aetheron-connect-v2` is outside the writable roots."

Loop disclosure: panel review was blocked by Claude weekly quota
(cube identity hit limit; resets May 25 23:30 ACST). Spec was
written, cube auto fired, all 5 judges failed with auth errors before
even reaching the writer. Shipping the 3-line fix directly because
the writer phase couldn't progress and this is a P0 dogfood blocker.
PR #220's file_change tracking + PR #218's auto-approve gate guard
provide safety net coverage in lieu of synchronous panel review;
re-review via cube auto on resume once quota resets.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant