Commit 19e7bed

docs: add Vertex AI sandbox usage for Claude Code and OpenCode
Cover the full end-to-end setup for running Claude Code and OpenCode inside an OpenShell sandbox via inference.local with a Vertex AI backend:

- google-vertex-ai.mdx: add 'Use from a Sandbox' section with tabbed examples for Claude Code (--bare flag, no /v1 suffix) and OpenCode (/v1 suffix required). Add providers_v2_enabled prerequisite and --no-verify note for global region. Document policy proposals table covering metadata.google.internal (always blocked), downloads.claude.ai, and storage.googleapis.com.
- inference-routing.mdx: expand 'Use the Local Endpoint' section with tabbed examples for Claude Code, OpenCode, Python OpenAI SDK, and Python Anthropic SDK. Add notes explaining the /v1 path suffix difference between clients.
- supported-agents.mdx: update Claude Code and OpenCode rows to mention inference.local support and correct base URL requirements.
1 parent fe3b147 commit 19e7bed

3 files changed

Lines changed: 130 additions & 7 deletions

docs/about/supported-agents.mdx

Lines changed: 2 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -10,8 +10,8 @@ The following table summarizes the agents that run in OpenShell sandboxes. Most
1010

1111
| Agent | Source | Default Policy | Notes |
1212
|---|---|---|---|
13-
| [Claude Code](https://docs.anthropic.com/en/docs/claude-code) | [`base`](https://github.com/NVIDIA/OpenShell-Community/tree/main/sandboxes/base) | Full coverage | Works out of the box. Requires `ANTHROPIC_API_KEY`. |
14-
| [OpenCode](https://opencode.ai/) | [`base`](https://github.com/NVIDIA/OpenShell-Community/tree/main/sandboxes/base) | Partial coverage | Pre-installed. Add `opencode.ai` endpoint and OpenCode binary paths to the policy for full functionality. |
13+
| [Claude Code](https://docs.anthropic.com/en/docs/claude-code) | [`base`](https://github.com/NVIDIA/OpenShell-Community/tree/main/sandboxes/base) | Full coverage | Works out of the box. Requires `ANTHROPIC_API_KEY` for direct Anthropic access, or use `inference.local` with a configured provider (e.g. Vertex AI). |
14+
| [OpenCode](https://opencode.ai/) | [`base`](https://github.com/NVIDIA/OpenShell-Community/tree/main/sandboxes/base) | Partial coverage | Pre-installed. Use `ANTHROPIC_BASE_URL="https://inference.local/v1"` with a configured provider. Add `opencode.ai` endpoint and OpenCode binary paths to the policy for full functionality. |
1515
| [Codex](https://developers.openai.com/codex) | [`base`](https://github.com/NVIDIA/OpenShell-Community/tree/main/sandboxes/base) | No coverage | Pre-installed. Requires a custom policy with OpenAI endpoints and Codex binary paths. Requires `OPENAI_API_KEY`. |
1616
| [GitHub Copilot CLI](https://docs.github.com/en/copilot/github-copilot-in-the-cli) | [`base`](https://github.com/NVIDIA/OpenShell-Community/tree/main/sandboxes/base) | Full coverage | Pre-installed. Works out of the box. Requires `GITHUB_TOKEN` or `COPILOT_GITHUB_TOKEN`. |
1717
| [OpenClaw](https://openclaw.ai/) | [NemoClaw](https://github.com/NVIDIA/NemoClaw) | Blueprint-managed | Run OpenClaw more securely inside NVIDIA OpenShell with managed inference using NemoClaw. |

docs/providers/google-vertex-ai.mdx

Lines changed: 79 additions & 2 deletions
@@ -105,16 +105,93 @@ OpenShell exposes Anthropic Vertex routes for inference only. It does not advert
105105

106106
## Configure Inference Routing
107107

108-
After creating the provider, point `inference.local` at it:
108+
Before configuring inference routing, enable provider endpoint injection so the Vertex AI network endpoints are automatically included in sandbox policies:
109+
110+
```shell
111+
openshell settings set --global --key providers_v2_enabled --value true --yes
112+
```
113+
114+
Then point `inference.local` at the provider:
109115

110116
```shell
111117
openshell inference set \
112118
--provider vertex-prod \
113-
--model claude-3-5-sonnet@20241022
119+
--model claude-sonnet-4-6
120+
```
121+
122+
Use `--no-verify` if the endpoint verification fails. This is common with the `global` region, where the validation probe may not match the actual rawPredict path:
123+
124+
```shell
125+
openshell inference set \
126+
--provider vertex-prod \
127+
--model claude-sonnet-4-6 \
128+
--no-verify
114129
```
115130

116131
Sandboxes on that gateway reach the model at `https://inference.local`. For full details on inference routing, refer to [Inference Routing](/sandboxes/inference-routing).
117132

133+
## Use from a Sandbox
134+
135+
Agents inside sandboxes should reach Vertex AI through `inference.local`, not by connecting to Vertex AI directly. The gateway manages GCP credential refresh and request translation; the agent only needs to point its SDK at the local endpoint.
136+
137+
The complete setup from scratch:
138+
139+
```shell
140+
# 1. Enable provider endpoint injection
141+
openshell settings set --global --key providers_v2_enabled --value true --yes
142+
143+
# 2. Create the provider
144+
openshell provider create \
145+
--name vertex-local \
146+
--type google-vertex-ai \
147+
--from-gcloud-adc \
148+
--config VERTEX_AI_PROJECT_ID=my-gcp-project \
149+
--config VERTEX_AI_REGION=us-central1
150+
151+
# 3. Configure inference routing
152+
openshell inference set --provider vertex-local --model claude-sonnet-4-6 --no-verify
153+
154+
# 4. Create a sandbox with the provider attached
155+
openshell sandbox create --name my-sandbox --provider vertex-local
156+
```
157+
158+
Then inside the sandbox, launch the agent as shown below.
159+
160+
<Tabs>
161+
<Tab title="Claude Code">
162+
163+
```shell
164+
ANTHROPIC_BASE_URL="https://inference.local" ANTHROPIC_API_KEY=unused claude --bare
165+
```
166+
167+
`--bare` skips the OAuth login flow and uses `ANTHROPIC_API_KEY` directly for authentication. The key value does not reach Vertex AI — `inference.local` strips it and injects the real GCP access token before forwarding.
168+
169+
<Warning>
170+
Do not set `CLAUDE_CODE_USE_VERTEX=1` inside the sandbox. That flag makes Claude Code connect directly to Vertex AI and attempt GCP credential discovery (ADC file, metadata service), which fails because the sandbox does not expose GCP credentials. Use `inference.local` instead.
171+
</Warning>
172+
173+
</Tab>
174+
<Tab title="OpenCode">
175+
176+
```shell
177+
ANTHROPIC_BASE_URL="https://inference.local/v1" ANTHROPIC_API_KEY=unused opencode
178+
```
179+
180+
OpenCode requires `/v1` in the base URL. Without it, OpenCode sends `POST /messages` instead of `POST /v1/messages`, which does not match the inference pattern and is denied.
181+
182+
</Tab>
183+
</Tabs>
184+
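The `/v1` difference between the two clients comes down to how each one builds the request path from `ANTHROPIC_BASE_URL`. The sketch below is illustrative only: the suffixes are taken from the notes in this commit rather than from either client's source, and `request_url` and `matches_inference_pattern` are hypothetical helpers that assume the inference pattern is an exact match on `/v1/messages`.

```python
from urllib.parse import urlsplit

# Path each client appends to ANTHROPIC_BASE_URL, as described in this commit.
CLAUDE_CODE_SUFFIX = "/v1/messages"
OPENCODE_SUFFIX = "/messages"

# Assumption: the inference pattern is an exact match on this path.
INFERENCE_PATH = "/v1/messages"


def request_url(base_url: str, client_suffix: str) -> str:
    """Join a base URL and a client-appended path without doubling slashes."""
    return base_url.rstrip("/") + "/" + client_suffix.lstrip("/")


def matches_inference_pattern(url: str) -> bool:
    """Return True when the URL path equals the assumed inference path."""
    return urlsplit(url).path == INFERENCE_PATH


if __name__ == "__main__":
    cases = [
        ("Claude Code", "https://inference.local", CLAUDE_CODE_SUFFIX),
        ("OpenCode", "https://inference.local/v1", OPENCODE_SUFFIX),
        ("OpenCode, no /v1", "https://inference.local", OPENCODE_SUFFIX),
    ]
    for name, base, suffix in cases:
        url = request_url(base, suffix)
        print(f"{name}: {url} -> routed={matches_inference_pattern(url)}")
```

Under these assumptions, the first two cases resolve to the same path and the third does not, which is why only OpenCode needs the `/v1` suffix in its base URL.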
185+
### Policy Proposals
186+
187+
After running an agent, the TUI (`openshell term`) may show policy proposals for denied endpoints. Common ones for Vertex AI sandboxes:
188+
189+
| Endpoint | Action | Reason |
190+
|---|---|---|
191+
| `metadata.google.internal:80` | **Reject** | Resolves to `169.254.169.254` (GCE metadata service). Always blocked regardless of policy — the proxy blocks the resolved IP unconditionally to prevent credential exfiltration. |
192+
| `downloads.claude.ai:443` | Approve if desired | Claude Code update checking and asset loading. Not required for inference. |
193+
| `storage.googleapis.com:443` | Approve if desired | Google Cloud Storage. Used by some Claude Code features. Not required for inference. |
194+
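The unconditional block on `metadata.google.internal` is easier to reason about as a check on the resolved address rather than on the hostname. The sketch below is not OpenShell's implementation; `should_block` is a hypothetical helper that shows why a hostname allowlist alone cannot permit this endpoint when the decision is made on the IP.

```python
import ipaddress

# Address the table above gives for the GCE metadata service.
METADATA_IP = ipaddress.ip_address("169.254.169.254")


def should_block(resolved_ip: str) -> bool:
    """Hypothetical check: deny the metadata address and any link-local IP.

    169.254.0.0/16 is the IPv4 link-local range, which includes the
    metadata service address.
    """
    addr = ipaddress.ip_address(resolved_ip)
    return addr == METADATA_IP or addr.is_link_local


if __name__ == "__main__":
    for ip in ("169.254.169.254", "169.254.0.1", "8.8.8.8"):
        print(ip, "blocked" if should_block(ip) else "allowed")
```

Because the check runs after DNS resolution, approving the hostname in a policy proposal would not change the outcome in a design like this one.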
118195
## From Existing Environment
119196

120197
If one of these token env vars is already set in your shell, create the provider with `--from-existing`:

docs/sandboxes/inference-routing.mdx

Lines changed: 49 additions & 3 deletions
@@ -195,7 +195,34 @@ openshell inference update --timeout 120
195195

196196
## Use the Local Endpoint from a Sandbox
197197

198-
After inference is configured, code inside any sandbox can call `https://inference.local` directly:
198+
After inference is configured, code inside any sandbox can call `https://inference.local` directly. The client-supplied `model` and `api_key` values are not sent upstream — the privacy router injects the real credentials from the configured provider and rewrites the model before forwarding. Some SDKs require a non-empty API key even though `inference.local` does not use the sandbox-provided value; pass any placeholder such as `unused`.
199+
200+
<Tabs>
201+
<Tab title="Claude Code">
202+
203+
```shell
204+
ANTHROPIC_BASE_URL="https://inference.local" ANTHROPIC_API_KEY=unused claude --bare
205+
```
206+
207+
`--bare` skips the OAuth login flow and uses `ANTHROPIC_API_KEY` directly. The key is stripped by the proxy and never reaches the upstream provider.
208+
209+
<Note>
210+
Claude Code appends `/v1/messages` to `ANTHROPIC_BASE_URL`, so omit the `/v1` suffix from the base URL.
211+
</Note>
212+
213+
</Tab>
214+
<Tab title="OpenCode">
215+
216+
```shell
217+
ANTHROPIC_BASE_URL="https://inference.local/v1" ANTHROPIC_API_KEY=unused opencode
218+
```
219+
220+
<Note>
221+
OpenCode appends `/messages` directly to `ANTHROPIC_BASE_URL`. Include the `/v1` suffix so the full path becomes `/v1/messages`, which matches the inference pattern.
222+
</Note>
223+
224+
</Tab>
225+
<Tab title="Python (OpenAI SDK)">
199226

200227
```python
201228
from openai import OpenAI
@@ -208,9 +235,28 @@ response = client.chat.completions.create(
208235
)
209236
```
210237

211-
The client-supplied `model` and `api_key` values are not sent upstream. The privacy router injects the real credentials from the configured provider and rewrites the model before forwarding. Some SDKs require a non-empty API key even though `inference.local` does not use the sandbox-provided value. In those cases, pass any placeholder such as `test` or `unused`.
238+
</Tab>
239+
<Tab title="Python (Anthropic SDK)">
240+
241+
```python
242+
import anthropic
243+
244+
client = anthropic.Anthropic(
245+
base_url="https://inference.local",
246+
api_key="unused",
247+
)
248+
249+
message = client.messages.create(
250+
model="anything",
251+
max_tokens=1024,
252+
messages=[{"role": "user", "content": "Hello"}],
253+
)
254+
```
255+
256+
</Tab>
257+
</Tabs>
212258

213-
Use this endpoint when inference should stay local to the host for privacy and security reasons. External providers that should be reached directly belong in `network_policies` instead.
259+
Use `inference.local` when inference should stay private and credentials should not be exposed inside the sandbox. External providers reached directly belong in `network_policies` instead.
214260

215261
When the upstream runs on the same machine as the gateway, bind it to `0.0.0.0` and point the provider at `host.openshell.internal` or the host's LAN IP. `127.0.0.1` and `localhost` usually fail because the request originates from the gateway or sandbox runtime, not from your shell.
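To make the bind-address advice concrete, here is a minimal sketch of a stub upstream that listens on all interfaces. It stands in for a real inference server and is purely illustrative: `StubHandler` and `make_stub_upstream` are hypothetical names, and the stub only answers `GET` with a fixed body.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class StubHandler(BaseHTTPRequestHandler):
    """Placeholder handler; a real upstream would serve inference requests."""

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_stub_upstream(host: str = "0.0.0.0", port: int = 0) -> HTTPServer:
    """Bind on all interfaces so callers outside the loopback can connect.

    Port 0 asks the OS for any free port; pass a fixed port in real use.
    """
    return HTTPServer((host, port), StubHandler)


if __name__ == "__main__":
    server = make_stub_upstream(port=8000)
    print("listening on", server.server_address)
    server.serve_forever()
```

A server bound to `127.0.0.1` instead would accept connections only from the host's own loopback interface, which is why the provider should point at `host.openshell.internal` or the host's LAN IP.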
216262
