docs: add Vertex AI sandbox usage for Claude Code and OpenCode

maxamillion · maxamillion · commit 19e7bed23efd · 2026-05-28T12:29:30.000-05:00
Cover the full end-to-end setup for running Claude Code and OpenCode
inside an OpenShell sandbox via inference.local with a Vertex AI backend:

- google-vertex-ai.mdx: add 'Use from a Sandbox' section with tabbed
  examples for Claude Code (--bare flag, no /v1 suffix) and OpenCode
  (/v1 suffix required). Add providers_v2_enabled prerequisite and
  --no-verify note for global region. Document policy proposals table
  covering metadata.google.internal (always blocked), downloads.claude.ai,
  and storage.googleapis.com.

- inference-routing.mdx: expand 'Use the Local Endpoint' section with
  tabbed examples for Claude Code, OpenCode, Python OpenAI SDK, and
  Python Anthropic SDK. Add notes explaining the /v1 path suffix
  difference between clients.

- supported-agents.mdx: update Claude Code and OpenCode rows to mention
  inference.local support and correct base URL requirements.
diff --git a/docs/about/supported-agents.mdx b/docs/about/supported-agents.mdx
@@ -10,8 +10,8 @@ The following table summarizes the agents that run in OpenShell sandboxes. Most
 
 | Agent | Source | Default Policy | Notes |
 |---|---|---|---|
-| [Claude Code](https://docs.anthropic.com/en/docs/claude-code) | [`base`](https://github.com/NVIDIA/OpenShell-Community/tree/main/sandboxes/base) | Full coverage | Works out of the box. Requires `ANTHROPIC_API_KEY`. |
-| [OpenCode](https://opencode.ai/) | [`base`](https://github.com/NVIDIA/OpenShell-Community/tree/main/sandboxes/base) | Partial coverage | Pre-installed. Add `opencode.ai` endpoint and OpenCode binary paths to the policy for full functionality. |
+| [Claude Code](https://docs.anthropic.com/en/docs/claude-code) | [`base`](https://github.com/NVIDIA/OpenShell-Community/tree/main/sandboxes/base) | Full coverage | Works out of the box. Requires `ANTHROPIC_API_KEY` for direct Anthropic access, or use `inference.local` with a configured provider (e.g. Vertex AI). |
+| [OpenCode](https://opencode.ai/) | [`base`](https://github.com/NVIDIA/OpenShell-Community/tree/main/sandboxes/base) | Partial coverage | Pre-installed. Use `ANTHROPIC_BASE_URL="https://inference.local/v1"` with a configured provider. Add `opencode.ai` endpoint and OpenCode binary paths to the policy for full functionality. |
 | [Codex](https://developers.openai.com/codex) | [`base`](https://github.com/NVIDIA/OpenShell-Community/tree/main/sandboxes/base) | No coverage | Pre-installed. Requires a custom policy with OpenAI endpoints and Codex binary paths. Requires `OPENAI_API_KEY`. |
 | [GitHub Copilot CLI](https://docs.github.com/en/copilot/github-copilot-in-the-cli) | [`base`](https://github.com/NVIDIA/OpenShell-Community/tree/main/sandboxes/base) | Full coverage | Pre-installed. Works out of the box. Requires `GITHUB_TOKEN` or `COPILOT_GITHUB_TOKEN`. |
 | [OpenClaw](https://openclaw.ai/) | [NemoClaw](https://github.com/NVIDIA/NemoClaw) | Blueprint-managed | Run OpenClaw more securely inside NVIDIA OpenShell with managed inference using NemoClaw. |
diff --git a/docs/providers/google-vertex-ai.mdx b/docs/providers/google-vertex-ai.mdx
@@ -105,16 +105,93 @@ OpenShell exposes Anthropic Vertex routes for inference only. It does not advert
 
 ## Configure Inference Routing
 
-After creating the provider, point `inference.local` at it:
+Before configuring inference routing, enable provider endpoint injection so the Vertex AI network endpoints are automatically included in sandbox policies:
+
+```shell
+openshell settings set --global --key providers_v2_enabled --value true --yes
+```
+
+Then point `inference.local` at the provider:
 
 ```shell
 openshell inference set \
   --provider vertex-prod \
-  --model claude-3-5-sonnet@20241022
+  --model claude-sonnet-4-6
+```
+
+If endpoint verification fails, add `--no-verify`. This is common with the `global` region, where the validation probe may not match the actual `rawPredict` path:
+
+```shell
+openshell inference set \
+  --provider vertex-prod \
+  --model claude-sonnet-4-6 \
+  --no-verify
 ```
 
 Sandboxes on that gateway reach the model at `https://inference.local`. For full details on inference routing, refer to [Inference Routing](/sandboxes/inference-routing).
 
+## Use from a Sandbox
+
+Agents inside sandboxes should reach Vertex AI through `inference.local`, not by connecting to Vertex AI directly. The gateway manages GCP credential refresh and request translation; the agent only needs to point its SDK at the local endpoint.
+
+The complete setup from scratch:
+
+```shell
+# 1. Enable provider endpoint injection
+openshell settings set --global --key providers_v2_enabled --value true --yes
+
+# 2. Create the provider
+openshell provider create \
+  --name vertex-local \
+  --type google-vertex-ai \
+  --from-gcloud-adc \
+  --config VERTEX_AI_PROJECT_ID=my-gcp-project \
+  --config VERTEX_AI_REGION=us-central1
+
+# 3. Configure inference routing
+openshell inference set --provider vertex-local --model claude-sonnet-4-6 --no-verify
+
+# 4. Create a sandbox with the provider attached
+openshell sandbox create --name my-sandbox --provider vertex-local
+```
+
+Then inside the sandbox, launch the agent as shown below.
+
+<Tabs>
+<Tab title="Claude Code">
+
+```shell
+ANTHROPIC_BASE_URL="https://inference.local" ANTHROPIC_API_KEY=unused claude --bare
+```
+
+`--bare` skips the OAuth login flow and uses `ANTHROPIC_API_KEY` directly for authentication. The key value never reaches Vertex AI: `inference.local` strips it and injects the real GCP access token before forwarding.
+
+<Warning>
+Do not set `CLAUDE_CODE_USE_VERTEX=1` inside the sandbox. That flag makes Claude Code connect directly to Vertex AI and attempt GCP credential discovery (ADC file, metadata service), which fails because the sandbox does not expose GCP credentials. Use `inference.local` instead.
+</Warning>
+
+</Tab>
+<Tab title="OpenCode">
+
+```shell
+ANTHROPIC_BASE_URL="https://inference.local/v1" ANTHROPIC_API_KEY=unused opencode
+```
+
+OpenCode requires `/v1` in the base URL. Without it, OpenCode sends `POST /messages` instead of `POST /v1/messages`, which does not match the inference pattern and is denied.
+
+</Tab>
+</Tabs>
+
+### Policy Proposals
+
+After running an agent, the TUI (`openshell term`) may show policy proposals for denied endpoints. Common ones for Vertex AI sandboxes:
+
+| Endpoint | Action | Reason |
+|---|---|---|
+| `metadata.google.internal:80` | **Reject** | Resolves to `169.254.169.254` (GCE metadata service). Always blocked regardless of policy — the proxy blocks the resolved IP unconditionally to prevent credential exfiltration. |
+| `downloads.claude.ai:443` | Approve if desired | Claude Code update checking and asset loading. Not required for inference. |
+| `storage.googleapis.com:443` | Approve if desired | Google Cloud Storage. Used by some Claude Code features. Not required for inference. |
+
 ## From Existing Environment
 
 If one of these token env vars is already set in your shell, create the provider with `--from-existing`:
diff --git a/docs/sandboxes/inference-routing.mdx b/docs/sandboxes/inference-routing.mdx
@@ -195,7 +195,34 @@ openshell inference update --timeout 120
 
 ## Use the Local Endpoint from a Sandbox
 
-After inference is configured, code inside any sandbox can call `https://inference.local` directly:
+After inference is configured, code inside any sandbox can call `https://inference.local` directly. The client-supplied `model` and `api_key` values are not sent upstream — the privacy router injects the real credentials from the configured provider and rewrites the model before forwarding. Some SDKs require a non-empty API key even though `inference.local` does not use the sandbox-provided value; pass any placeholder such as `unused`.
+
+<Tabs>
+<Tab title="Claude Code">
+
+```shell
+ANTHROPIC_BASE_URL="https://inference.local" ANTHROPIC_API_KEY=unused claude --bare
+```
+
+`--bare` skips the OAuth login flow and uses `ANTHROPIC_API_KEY` directly. The key is stripped by the proxy and never reaches the upstream provider.
+
+<Note>
+Claude Code appends `/v1/messages` to `ANTHROPIC_BASE_URL`, so omit the `/v1` suffix from the base URL.
+</Note>
+
+</Tab>
+<Tab title="OpenCode">
+
+```shell
+ANTHROPIC_BASE_URL="https://inference.local/v1" ANTHROPIC_API_KEY=unused opencode
+```
+
+<Note>
+OpenCode appends `/messages` directly to `ANTHROPIC_BASE_URL`. Include the `/v1` suffix so the full path becomes `/v1/messages`, which matches the inference pattern.
+</Note>
+
+</Tab>
+<Tab title="Python (OpenAI SDK)">
 
 ```python
 from openai import OpenAI
@@ -208,9 +235,28 @@ response = client.chat.completions.create(
 )
 ```
 
-The client-supplied `model` and `api_key` values are not sent upstream. The privacy router injects the real credentials from the configured provider and rewrites the model before forwarding. Some SDKs require a non-empty API key even though `inference.local` does not use the sandbox-provided value. In those cases, pass any placeholder such as `test` or `unused`.
+</Tab>
+<Tab title="Python (Anthropic SDK)">
+
+```python
+import anthropic
+
+client = anthropic.Anthropic(
+    base_url="https://inference.local",
+    api_key="unused",
+)
+
+message = client.messages.create(
+    model="anything",
+    max_tokens=1024,
+    messages=[{"role": "user", "content": "Hello"}],
+)
+```
+
+</Tab>
+</Tabs>
 
-Use this endpoint when inference should stay local to the host for privacy and security reasons. External providers that should be reached directly belong in `network_policies` instead.
+Use `inference.local` when inference should stay private and credentials should not be exposed inside the sandbox. External providers that sandboxes should reach directly belong in `network_policies` instead.
 
 When the upstream runs on the same machine as the gateway, bind it to `0.0.0.0` and point the provider at `host.openshell.internal` or the host's LAN IP. `127.0.0.1` and `localhost` usually fail because the request originates from the gateway or sandbox runtime, not from your shell.
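
The `/v1` suffix difference between Claude Code and OpenCode described in the notes above can be checked with a small sketch. It assumes the only Anthropic-style route that `inference.local` accepts is `/v1/messages`, as those notes imply. The helper names (`request_path`, `matches_inference_pattern`) and the client labels are hypothetical and are not part of OpenShell or either agent.

```python
from urllib.parse import urlsplit

# Assumption: inference.local routes Anthropic-style requests at exactly
# this path, as the notes in this change imply.
INFERENCE_PATH = "/v1/messages"

# Path each client appends to ANTHROPIC_BASE_URL, per the notes in this change.
# The dictionary keys are illustrative labels, not real identifiers.
CLIENT_SUFFIX = {
    "claude-code": "/v1/messages",
    "opencode": "/messages",
}


def request_path(client: str, base_url: str) -> str:
    """Return the URL path a client would request for a given base URL."""
    full_url = base_url.rstrip("/") + CLIENT_SUFFIX[client]
    return urlsplit(full_url).path


def matches_inference_pattern(client: str, base_url: str) -> bool:
    """Return True when the resulting path is the assumed inference route."""
    return request_path(client, base_url) == INFERENCE_PATH


if __name__ == "__main__":
    cases = [
        ("claude-code", "https://inference.local"),
        ("claude-code", "https://inference.local/v1"),
        ("opencode", "https://inference.local"),
        ("opencode", "https://inference.local/v1"),
    ]
    for client, base_url in cases:
        ok = matches_inference_pattern(client, base_url)
        print(f"{client:12} {base_url:30} {request_path(client, base_url):18} {ok}")
```

Under these assumptions, only `claude-code` with `https://inference.local` and `opencode` with `https://inference.local/v1` produce a matching path, which is why the two tabs use different base URLs.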