chore(agent): docker cleanups for the sandbox-agent sidecar #4762
mmabrouk wants to merge 1 commit into
- Add a production, credential-free sidecar Dockerfile (services/agent/docker/Dockerfile): bakes Pi (MIT), never bakes Claude Code or any credential, runs src/server.ts without a watcher. Verified to build and serve /health.
- Add services/agent/docker/README.md documenting the image licensing posture (bake Pi, never bake or distribute Claude Code, install Claude from Anthropic at runtime) and the API-key vs self-host OAuth auth paths.
- Record the recipe-not-image posture on the Daytona snapshot builder docstring and the docker-compose comment (we ship the build recipe, not a Claude-containing image).
Reviewer guide: interesting code
Start with the README. It is the rule the other files implement.
# the daemon's `install-agent claude` fails TLS verification. git lets npm/installers
# fetch git deps.
RUN apt-get update \
    && apt-get install -y --no-install-recommends ca-certificates git \
Load-bearing: ca-certificates is here because the daemon fetches Claude from Anthropic over TLS at runtime. node:*-slim omits the trust store, so without this install-agent claude fails verification. This is the mechanism that keeps Anthropic as the distributor and the image Claude-free.
# Install deps as a cached layer (manifest + lockfile only). The full dependency set is
# installed (not --prod): the runtime uses `tsx` and the extension build uses `esbuild`,
# both devDependencies.
COPY package.json pnpm-lock.yaml ./
Full dependency set is installed (not --prod) on purpose: the runtime uses tsx and the extension build uses esbuild, both devDependencies. Worth confirming this stays intentional if anyone later tries to slim the image.
its own for internal use; self-hosters build their own. We never hand anyone a
Claude-containing image, so this is compliant even though the `-full` base bundles
Claude (Anthropic's Commercial Terms forbid us *distributing* Claude Code, not
building/using it).
This is the subtle compliance case. The -full base bundles Claude, so the built snapshot contains Claude. It stays compliant only because we ship this script and each operator builds the snapshot in their own account; we distribute nothing. The cleaner-provenance follow-up (daemon-only base + install from Anthropic at build) is parked pending a live Daytona check that the daemon-only tag ships the ACP adapters.
Actionable comments posted: 1
📒 Files selected for processing (4)
- docs/design/agent-workflows/scratch/wp-8-rivet-acp-runtime/poc/build_rivet_snapshot.py
- hosting/docker-compose/ee/docker-compose.dev.yml
- services/agent/docker/Dockerfile
- services/agent/docker/README.md
FROM node:24-slim

WORKDIR /app

# CA certificates: the sandbox-agent daemon (Rust) downloads harness CLIs (e.g. Claude
# Code) over HTTPS using the system trust store, which node:*-slim omits — without this
# the daemon's `install-agent claude` fails TLS verification. git lets npm/installers
# fetch git deps.
RUN apt-get update \
    && apt-get install -y --no-install-recommends ca-certificates git \
    && rm -rf /var/lib/apt/lists/*

RUN corepack enable

# Install deps as a cached layer (manifest + lockfile only). The full dependency set is
# installed (not --prod): the runtime uses `tsx` and the extension build uses `esbuild`,
# both devDependencies.
COPY package.json pnpm-lock.yaml ./
RUN pnpm install --frozen-lockfile

# Bake the source (no bind mount in production).
COPY tsconfig.json ./
COPY scripts ./scripts
COPY src ./src
COPY config ./config
COPY skills ./skills

# Bundle the Agenta Pi extension (tracing + tools) into dist/. runSandboxAgent installs
# this baked copy into Pi's agent dir on every run. Rebuild the image after editing
# src/extensions/agenta.ts or the tracer.
RUN pnpm run build:extension

ENV NODE_ENV=production \
    PORT=8765

EXPOSE 8765

# Call the local tsx binary directly to avoid pnpm/corepack HOME writes when the
# container runs as a non-root host uid.
CMD ["node_modules/.bin/tsx", "src/server.ts"]
Run the final container as a non-root user.
The image still launches the HTTP server as root. For a network-facing agent runner, that widens the blast radius of any compromise.
🔧 Suggested fix
RUN pnpm run build:extension
ENV NODE_ENV=production \
PORT=8765
EXPOSE 8765
+USER node
CMD ["node_modules/.bin/tsx", "src/server.ts"]
Source: Linters/SAST tools
Superseded. Replacing the path-based stack with PRs sliced by functional area showing final code only, so reviewers don't comment on intermediate scaffolding that a later PR rewrites. See the new set.
This PR is part of a stack. Review bottom-up.
Each PR's diff is only its own delta. Merge from the bottom. This PR's base is #4761 (merge that first).
Context
The agent runner sidecar (the sandbox-agent server runtime in services/agent/src/server.ts) had a dev Dockerfile but no production image and no written rules for what an image may contain. This PR adds the production image and the licensing posture that governs every agent image. It branches off feat/agent-harness-port as the infra-chore side slice in docs/design/agent-workflows/pr-stack.md.
What this changes
This PR adds three things and clarifies one comment:
- Dockerfile for the sidecar. It bakes the TypeScript runner source, runs pnpm run build:extension, and serves on :8765 with tsx. No tsx watch and no bind mount, unlike the dev image.
- README.md that states the rule for agent images: we ship build recipes, never Claude-containing images, and never a baked credential.
- […] -full base, not baked by us.
No binary, snapshot, or Claude artifact is committed. The repo ships the builder script, so each operator builds their own image in their own account.
Key architectural decision to review
The decision to scrutinize is the licensing posture in services/agent/docker/README.md and the Dockerfile header. Claude Code is proprietary under Anthropic's Commercial Terms, which grant a usage license but no right to redistribute. Pi is MIT, so we bake it freely.
The tradeoff: we never produce an image that contains Claude Code. The production Dockerfile installs nothing from Anthropic at build time. Instead the sandbox-agent daemon installs Claude at runtime from Anthropic over HTTPS, which is why ca-certificates is installed (Dockerfile:25). That keeps Anthropic as the distributor. The Daytona snapshot is the one place this gets subtle: today the builder bases on rivet's -full image, which already bundles Claude. That stays compliant only because we ship the builder script, not the built snapshot, so each operator builds the Claude-containing image in their own account and we distribute nothing. The README records a cleaner-provenance follow-up: base on a daemon-only rivet image and install Claude from Anthropic at build, pending a live Daytona check that the daemon-only tag ships the ACP adapters.
Verify the rule holds end to end: nothing in this PR commits a Claude binary or a prebuilt snapshot, and no image baked here contains Claude or a credential.
How to review this PR
- services/agent/docker/README.md first. It is the contract the other files implement.
- services/agent/docker/Dockerfile. Confirm it bakes only Pi (via npm deps) and source, installs no Anthropic package, and bakes no key or login. Check that ca-certificates exists because the daemon fetches Claude over TLS at runtime.
- build_rivet_snapshot.py docstring. These are doc-only and restate the same posture.
Tests / notes
Doc and Dockerfile only, no code paths change. The production image is not built in CI by this PR. The cleaner-provenance snapshot follow-up is parked because it needs a live Daytona build to confirm the daemon-only rivet tag ships the ACP adapters.
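The Dockerfile checks in the review steps above (no Anthropic package installed at build time, no baked key or login) could be roughed out as a small lint. This is a sketch only and not part of the PR: `findImagePolicyViolations`, the `Violation` type, and both pattern lists are hypothetical, and `@anthropic-ai/claude-code` is assumed to be the npm package name an accidental build-time install would use. It skips comment lines, so the Dockerfile comment that mentions `install-agent claude` is not flagged.

```typescript
// Hypothetical helper, not part of this PR: a rough lint for the image rules.
type Violation = { line: string; reason: string };

// Assumed patterns; adjust them to whatever the README actually forbids.
const FORBIDDEN_IN_RUN: Array<[RegExp, string]> = [
  [/@anthropic-ai\/claude-code/, "installs Claude Code at build time"],
  [/install-agent\s+claude/, "runs the daemon's Claude install at build time"],
];
const CREDENTIAL_NAME = /\b[A-Z0-9_]*(API_KEY|TOKEN|SECRET|PASSWORD)[A-Z0-9_]*\b/;

function findImagePolicyViolations(dockerfile: string): Violation[] {
  // Drop comment lines first, then join backslash continuations so each
  // remaining line is one complete instruction.
  const instructions = dockerfile
    .split("\n")
    .filter((l) => !l.trim().startsWith("#"))
    .join("\n")
    .replace(/\\\n/g, " ")
    .split("\n")
    .map((l) => l.trim())
    .filter((l) => l.length > 0);

  const violations: Violation[] = [];
  for (const line of instructions) {
    const keyword = (line.split(/\s+/, 1)[0] ?? "").toUpperCase();
    if (keyword === "RUN") {
      for (const [pattern, reason] of FORBIDDEN_IN_RUN) {
        if (pattern.test(line)) violations.push({ line, reason });
      }
    }
    if ((keyword === "ENV" || keyword === "ARG") && CREDENTIAL_NAME.test(line)) {
      violations.push({ line, reason: "bakes a credential-like variable" });
    }
  }
  return violations;
}
```

A text scan like this only catches obvious cases. It does not inspect the built image or the base image's contents, so it complements rather than replaces the manual review described above.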