diff --git a/docs/_data/nav.yml b/docs/_data/nav.yml index 4390e1b21..3c5b094ee 100644 --- a/docs/_data/nav.yml +++ b/docs/_data/nav.yml @@ -94,6 +94,8 @@ url: /features/acp/ - title: API Server url: /features/api-server/ + - title: Chat Server + url: /features/chat-server/ - title: Evaluation url: /features/evaluation/ - title: RAG diff --git a/docs/configuration/models/index.md b/docs/configuration/models/index.md index a86b915e7..77669060a 100644 --- a/docs/configuration/models/index.md +++ b/docs/configuration/models/index.md @@ -40,7 +40,7 @@ models: | Property | Type | Required | Description | | --------------------- | ---------- | -------- | ------------------------------------------------------------------------------------- | | `provider` | string | ✓ | Provider: `openai`, `anthropic`, `google`, `amazon-bedrock`, `dmr`, `mistral`, `xai`, `nebius`, `minimax`, `requesty`, `azure`, `ollama`, `github-copilot`, or any [named provider]({{ '/providers/custom/' | relative_url }}). | -| `model` | string | ✓ | Model name (e.g., `gpt-4o`, `claude-sonnet-4-0`, `gemini-2.5-flash`) | +| `model` | string | ✓ | Model name (e.g., `gpt-4o`, `claude-sonnet-4-5`, `gemini-2.5-flash`) | | `temperature` | float | ✗ | Sampling randomness. Range is provider-dependent — typically `0.0–2.0` (Anthropic caps at `1.0`). `0.0` is deterministic. 
| | `max_tokens` | int | ✗ | Maximum response length in tokens | | `top_p` | float | ✗ | Nucleus sampling threshold (`0.0–1.0`) | @@ -232,7 +232,7 @@ models: # Anthropic claude: provider: anthropic - model: claude-sonnet-4-0 + model: claude-sonnet-4-5 max_tokens: 64000 # Google Gemini diff --git a/docs/configuration/structured-output/index.md b/docs/configuration/structured-output/index.md index f3208bd5d..24fc500c8 100644 --- a/docs/configuration/structured-output/index.md +++ b/docs/configuration/structured-output/index.md @@ -192,7 +192,7 @@ agents: ```yaml agents: classifier: - model: anthropic/claude-sonnet-4-0 + model: anthropic/claude-sonnet-4-5 description: Classify support tickets instruction: | Classify the support ticket into the appropriate category diff --git a/docs/configuration/tools/index.md b/docs/configuration/tools/index.md index 261ce26b2..c1990bfea 100644 --- a/docs/configuration/tools/index.md +++ b/docs/configuration/tools/index.md @@ -395,7 +395,7 @@ toolsets: ```yaml agents: root: - model: anthropic/claude-sonnet-4-0 + model: anthropic/claude-sonnet-4-5 description: Full-featured developer assistant instruction: You are an expert developer. toolsets: diff --git a/docs/features/api-server/index.md b/docs/features/api-server/index.md index be3e54fdb..aead43afb 100644 --- a/docs/features/api-server/index.md +++ b/docs/features/api-server/index.md @@ -190,6 +190,6 @@ Toggle auto-approve with `POST /api/sessions/:id/tools/toggle` for automated wor
ℹ️ See also
-For interactive use, see the Terminal UI. For agent-to-agent communication, see A2A Protocol and ACP. For MCP integration, see MCP Mode.
+For interactive use, see the Terminal UI. For agent-to-agent communication, see A2A Protocol and ACP. For MCP integration, see MCP Mode. For an OpenAI-compatible chat-completions API, see the Chat Server.
diff --git a/docs/features/chat-server/index.md b/docs/features/chat-server/index.md new file mode 100644 index 000000000..6cb1fa06d --- /dev/null +++ b/docs/features/chat-server/index.md @@ -0,0 +1,230 @@ +--- +title: "Chat Server" +description: "Expose your agents through an OpenAI-compatible Chat Completions API so any tool that already speaks OpenAI can drive a docker-agent agent." +permalink: /features/chat-server/ +--- + +# Chat Server + +_Expose your agents through an OpenAI-compatible Chat Completions API so any tool that already speaks OpenAI can drive a docker-agent agent._ + +## Overview + +The `docker agent serve chat` command starts an HTTP server that exposes one or +more agents through an **OpenAI-compatible Chat Completions API** at +`/v1/chat/completions` and `/v1/models`. Any client that already speaks the +OpenAI protocol — for example +[Open WebUI](https://github.com/open-webui/open-webui), `curl`, the OpenAI +Python SDK, or LangChain — can drive a docker-agent agent without any custom +integration. + +```bash +# Single agent — exposed as the model `root` +$ docker agent serve chat agent.yaml + +# Multi-agent config — every agent in the team becomes a model +$ docker agent serve chat ./team.yaml + +# Pick a specific agent from a multi-agent config +$ docker agent serve chat ./team.yaml --agent reviewer + +# Run an agent straight from the registry +$ docker agent serve chat agentcatalog/pirate --listen 127.0.0.1:9090 + +# Require a Bearer token, sourced from an env var +$ docker agent serve chat agent.yaml --api-key-env CHAT_BEARER_TOKEN +``` + +
+
💡 When to use chat server vs. API server +
+

Use the chat server when you want to plug docker-agent into existing OpenAI-compatible tooling (chat UIs, IDE integrations, OpenAI SDK clients). Use the API server when you want full control over sessions, agent execution, tool-call confirmations, and streamed runtime events.

+ +
+ +## Endpoints + +The OpenAI-compatible endpoints live under the `/v1` prefix to match the +OpenAI API surface. The OpenAPI specification is served at the top level so it +can be discovered without authentication. + +| Method | Path | Description | +| ------ | ---------------------- | ---------------------------------------------------------------------- | +| `GET` | `/v1/models` | List the agents that this server exposes as models | +| `POST` | `/v1/chat/completions` | Send messages and receive a completion (regular or streaming) | +| `GET` | `/openapi.json` | OpenAPI specification for the chat server | + +The model identifier in `POST /v1/chat/completions` is the **agent name**. +For a single-agent config that's typically `root`; for a multi-agent config, +each named agent becomes its own selectable model. + +## Quick Start + +```bash +# 1. Start the server +$ docker agent serve chat agent.yaml +Listening on 127.0.0.1:8083 +OpenAI-compatible chat completions endpoint: http://127.0.0.1:8083/v1/chat/completions + +# 2. List exposed agents (models) +$ curl http://127.0.0.1:8083/v1/models +{"object":"list","data":[{"id":"root","object":"model","owned_by":"docker-agent"}]} + +# 3. 
Send a chat request +$ curl http://127.0.0.1:8083/v1/chat/completions \ + -H 'Content-Type: application/json' \ + -d '{ + "model": "root", + "messages": [{"role": "user", "content": "Hello!"}] + }' +``` + +### Streaming + +Set `"stream": true` in the request body to receive a Server-Sent Events +(SSE) stream of OpenAI-format `chat.completion.chunk` deltas: + +```bash +$ curl -N http://127.0.0.1:8083/v1/chat/completions \ + -H 'Content-Type: application/json' \ + -d '{ + "model": "root", + "stream": true, + "messages": [{"role": "user", "content": "Stream a poem"}] + }' +``` + +### Drive it from the OpenAI Python SDK + +Because the wire format is OpenAI-compatible, point any OpenAI client at the +chat server's `base_url` and use the agent name as the model: + +```python +from openai import OpenAI + +client = OpenAI( + base_url="http://127.0.0.1:8083/v1", + api_key="not-needed-when-no-api-key-flag", # required by the SDK, ignored if no auth +) + +resp = client.chat.completions.create( + model="root", + messages=[{"role": "user", "content": "Hello!"}], +) +print(resp.choices[0].message.content) +``` + +## Server-side Conversation Caching + +By default the server is **stateless**: every request must contain the full +message history, exactly like OpenAI's API. 
Enable server-side caching by +setting `--conversations-max` to a positive value, then send a stable +`X-Conversation-Id` header on each request: + +```bash +$ docker agent serve chat agent.yaml --conversations-max 100 --conversation-ttl 30m +``` + +```bash +$ curl http://127.0.0.1:8083/v1/chat/completions \ + -H 'Content-Type: application/json' \ + -H 'X-Conversation-Id: my-thread-1' \ + -d '{ + "model": "root", + "messages": [{"role": "user", "content": "Remember my name is Alice"}] + }' + +$ curl http://127.0.0.1:8083/v1/chat/completions \ + -H 'Content-Type: application/json' \ + -H 'X-Conversation-Id: my-thread-1' \ + -d '{ + "model": "root", + "messages": [{"role": "user", "content": "What is my name?"}] + }' +``` + +Cached conversations are evicted after `--conversation-ttl` of inactivity, or +when the cache hits `--conversations-max` items (oldest entries are evicted +first). + +## Authentication + +The chat server has **no authentication by default**. To require a Bearer +token, pass `--api-key` (literal value) or `--api-key-env` (name of an +environment variable that holds the value): + +```bash +$ docker agent serve chat agent.yaml --api-key-env CHAT_BEARER_TOKEN +``` + +Clients must then send an `Authorization: Bearer <key>` header on every +request to `/v1/*`. Both `/v1/models` and `/v1/chat/completions` are +protected once a key is set. + +
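The bearer check is simple to reason about. A minimal sketch of the semantics described above (an illustration, not the server's actual code):

```python
from typing import Optional


def authorized(headers: dict, api_key: Optional[str]) -> bool:
    # No key configured: the server accepts every request.
    if not api_key:
        return True
    # Key configured: the Authorization header must match 'Bearer <key>' exactly.
    return headers.get("Authorization") == f"Bearer {api_key}"


# Open server: everything passes.
assert authorized({}, None)
# Protected server: only the exact bearer token passes.
assert authorized({"Authorization": "Bearer s3cret"}, "s3cret")
assert not authorized({}, "s3cret")
assert not authorized({"Authorization": "Bearer wrong"}, "s3cret")
```

Note that once a key is set, there is no per-endpoint carve-out: `/v1/models` is gated the same way as `/v1/chat/completions`.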
+
⚠️ Public exposure +
+

The default listen address is `127.0.0.1:8083`. If you bind to a non-loopback address, always set `--api-key` or `--api-key-env` — there is no other authentication layer.

+ +
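For example, a sketch of a non-loopback deployment that pairs a public bind address with a required key (the address and token value are illustrative):

```bash
$ export CHAT_BEARER_TOKEN=some-long-random-secret
$ docker agent serve chat agent.yaml \
    --listen 0.0.0.0:8083 \
    --api-key-env CHAT_BEARER_TOKEN
```

Without the key, anyone who can reach the address can drive the agent.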
+ +## CORS + +CORS is **disabled by default**. To allow a browser-based client to call the +server, set `--cors-origin` to the exact origin (scheme + host + port) that +should be allowed: + +```bash +$ docker agent serve chat agent.yaml --cors-origin https://my-ui.example.com +``` + +## CLI Flags + +```bash +docker agent serve chat | [flags] +``` + +| Flag | Default | Description | +| ----------------------------- | ------------------ | ----------------------------------------------------------------------------------------------------------------- | +| `-a, --agent ` | (all agents) | Name of the agent to expose. If omitted, every agent in the config is exposed as a separate model. | +| `-l, --listen ` | `127.0.0.1:8083` | Address to listen on. | +| `--cors-origin ` | (none) | Allowed CORS origin (e.g. `https://example.com`). Empty disables CORS. | +| `--api-key ` | (none) | Required Bearer token clients must present (`Authorization: Bearer `). Empty disables auth. | +| `--api-key-env ` | (none) | Read the API key from this environment variable instead of the command line. | +| `--max-request-size ` | `1048576` (1 MiB) | Maximum request body size. | +| `--request-timeout ` | `5m` | Per-request timeout (covers model + tool calls + streaming). | +| `--conversations-max ` | `0` | Cache up to N conversations server-side, keyed by `X-Conversation-Id`. `0` disables — clients must resend history. | +| `--conversation-ttl ` | `30m` | Idle TTL after which a cached conversation is evicted. | +| `--max-idle-runtimes ` | `4` | Maximum number of idle runtimes pooled per agent. `0` disables pooling. | + +All [runtime configuration flags]({{ '/features/cli/#runtime-configuration-flags' | relative_url }}) +(`--working-dir`, `--env-from-file`, `--models-gateway`, `--hook-*`, …) are +also accepted. + +## Open WebUI Integration + +Open WebUI can talk to any OpenAI-compatible endpoint. To plug docker-agent +in: + +1. 
Start the chat server, optionally with auth: + + ```bash + $ docker agent serve chat agent.yaml \ + --listen 127.0.0.1:8083 \ + --cors-origin http://localhost:3000 \ + --api-key-env OPEN_WEBUI_TOKEN + ``` + +2. In Open WebUI, add an OpenAI-compatible connection: + + - **API Base URL:** `http://127.0.0.1:8083/v1` + - **API Key:** the value of `OPEN_WEBUI_TOKEN` + +3. Each agent in your config appears as a selectable model. + +
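If the connection fails, you can sanity-check the base URL and key outside Open WebUI with the same endpoint and header the UI will send (assuming `OPEN_WEBUI_TOKEN` is set in your shell):

```bash
$ curl http://127.0.0.1:8083/v1/models \
    -H "Authorization: Bearer $OPEN_WEBUI_TOKEN"
```

A JSON model list here means Open WebUI's connection settings, not the server, are the thing to debug.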
+
ℹ️ See also +
+

For the docker-agent–native HTTP API (sessions, tool-call confirmation, runtime events), see the API Server. For full CLI flag documentation, see the CLI Reference.

+ +
diff --git a/docs/features/cli/index.md b/docs/features/cli/index.md index 3417edd29..d8b451114 100644 --- a/docs/features/cli/index.md +++ b/docs/features/cli/index.md @@ -65,8 +65,8 @@ $ docker agent run [config] [message...] [flags] $ docker agent run agent.yaml $ docker agent run agent.yaml "Fix the bug in auth.go" $ docker agent run agent.yaml -a developer --yolo -$ docker agent run agent.yaml --model anthropic/claude-sonnet-4-0 -$ docker agent run agent.yaml --model "dev=openai/gpt-4o,reviewer=anthropic/claude-sonnet-4-0" +$ docker agent run agent.yaml --model anthropic/claude-sonnet-4-5 +$ docker agent run agent.yaml --model "dev=openai/gpt-4o,reviewer=anthropic/claude-sonnet-4-5" $ docker agent run agent.yaml --session -1 # resume last session $ docker agent run agent.yaml --prompt-file ./context.md # include file as context @@ -265,6 +265,8 @@ $ curl http://127.0.0.1:8083/v1/chat/completions \ -d '{"model": "root", "messages": [{"role": "user", "content": "hello"}]}' ``` +See [Chat Server]({{ '/features/chat-server/' | relative_url }}) for the full feature reference. + ### `docker agent share push` / `docker agent share pull` Share agents via OCI registries. 
@@ -344,7 +346,7 @@ $ docker agent alias add other ociReference # Add an alias with runtime options $ docker agent alias add yolo-coder agentcatalog/coder --yolo $ docker agent alias add fast-coder agentcatalog/coder --model openai/gpt-4o-mini -$ docker agent alias add turbo agentcatalog/coder --yolo --model anthropic/claude-sonnet-4-0 +$ docker agent alias add turbo agentcatalog/coder --yolo --model anthropic/claude-sonnet-4-5 # Use an alias $ docker agent run pirate @@ -364,7 +366,7 @@ $ docker agent alias ls Registered aliases (3): fast-coder → agentcatalog/coder [model=openai/gpt-4o-mini] - turbo → agentcatalog/coder [yolo, model=anthropic/claude-sonnet-4-0] + turbo → agentcatalog/coder [yolo, model=anthropic/claude-sonnet-4-5] yolo-coder → agentcatalog/coder [yolo] Run an alias with: docker agent run diff --git a/docs/features/mcp-mode/index.md b/docs/features/mcp-mode/index.md index 9440df9c3..5b62742cf 100644 --- a/docs/features/mcp-mode/index.md +++ b/docs/features/mcp-mode/index.md @@ -108,14 +108,14 @@ When you expose a multi-agent configuration via MCP, each agent becomes a separa ```yaml agents: root: - model: anthropic/claude-sonnet-4-0 + model: anthropic/claude-sonnet-4-5 description: Main coordinator sub_agents: [designer, engineer] designer: model: openai/gpt-5-mini description: UI/UX design specialist engineer: - model: anthropic/claude-sonnet-4-0 + model: anthropic/claude-sonnet-4-5 description: Software engineer ``` diff --git a/docs/features/remote-mcp/index.md b/docs/features/remote-mcp/index.md index 82eb8be24..9177c4a21 100644 --- a/docs/features/remote-mcp/index.md +++ b/docs/features/remote-mcp/index.md @@ -188,7 +188,7 @@ Combine multiple remote MCP servers in a single agent: ```yaml agents: root: - model: anthropic/claude-sonnet-4-0 + model: anthropic/claude-sonnet-4-5 instruction: | You help manage projects and deployments. 
toolsets: diff --git a/docs/guides/go-sdk/index.md b/docs/guides/go-sdk/index.md index 72afb3e12..108aad4fe 100644 --- a/docs/guides/go-sdk/index.md +++ b/docs/guides/go-sdk/index.md @@ -294,7 +294,7 @@ openaiClient, _ := openai.NewClient(ctx, &latest.ModelConfig{ // Anthropic anthropicClient, _ := anthropic.NewClient(ctx, &latest.ModelConfig{ Provider: "anthropic", - Model: "claude-sonnet-4-0", + Model: "claude-sonnet-4-5", }, env) // Google Gemini diff --git a/docs/guides/tips/index.md b/docs/guides/tips/index.md index 89681a744..0bc50d379 100644 --- a/docs/guides/tips/index.md +++ b/docs/guides/tips/index.md @@ -130,7 +130,7 @@ Always set `max_iterations` for agents with powerful tools to prevent infinite l ```yaml agents: developer: - model: anthropic/claude-sonnet-4-0 + model: anthropic/claude-sonnet-4-5 description: Development assistant instruction: You are a developer. max_iterations: 30 # Reasonable limit for development tasks @@ -150,7 +150,7 @@ Configure fallback models for resilience against provider outages or rate limits ```yaml agents: root: - model: anthropic/claude-sonnet-4-0 + model: anthropic/claude-sonnet-4-5 description: Reliable assistant instruction: You are a helpful assistant. fallback: @@ -210,7 +210,7 @@ For defense in depth, use both permissions and [sandbox mode]({{ '/configuration ```yaml agents: secure_dev: - model: anthropic/claude-sonnet-4-0 + model: anthropic/claude-sonnet-4-5 description: Secure development assistant instruction: You are a secure coding assistant. toolsets: @@ -300,7 +300,7 @@ The root agent uses descriptions to decide which sub-agent to delegate to: ```yaml agents: root: - model: anthropic/claude-sonnet-4-0 + model: anthropic/claude-sonnet-4-5 description: Technical lead instruction: Delegate to specialists based on the task. 
sub_agents: [frontend, backend, devops] @@ -371,7 +371,7 @@ Set your preferred default model in `~/.config/cagent/config.yaml`: ```yaml settings: - default_model: anthropic/claude-sonnet-4-0 + default_model: anthropic/claude-sonnet-4-5 ``` This model is used when you run `docker agent run` without a config file. @@ -411,7 +411,7 @@ With a simple reviewer agent: # reviewer.yaml agents: root: - model: anthropic/claude-sonnet-4-0 + model: anthropic/claude-sonnet-4-5 description: PR reviewer instruction: | Review pull requests for code quality, bugs, and security issues. diff --git a/docs/index.md b/docs/index.md index 2d96855dd..6c47f39eb 100644 --- a/docs/index.md +++ b/docs/index.md @@ -60,7 +60,7 @@ Create a file called `agent.yaml`: ```yaml agents: root: - model: anthropic/claude-sonnet-4-0 + model: anthropic/claude-sonnet-4-5 description: A helpful coding assistant instruction: | You are an expert developer. Help users write clean, diff --git a/docs/providers/bedrock/index.md b/docs/providers/bedrock/index.md index ba7e7c8ca..a2703b469 100644 --- a/docs/providers/bedrock/index.md +++ b/docs/providers/bedrock/index.md @@ -64,7 +64,7 @@ models: models: bedrock: provider: amazon-bedrock - model: anthropic.claude-3-sonnet-20240229-v1:0 + model: global.anthropic.claude-sonnet-4-5-20250929-v1:0 provider_opts: role_arn: "arn:aws:iam::123456789012:role/BedrockAccessRole" external_id: "my-external-id" diff --git a/docs/providers/custom/index.md b/docs/providers/custom/index.md index 44350e3ca..14829e96a 100644 --- a/docs/providers/custom/index.md +++ b/docs/providers/custom/index.md @@ -181,7 +181,7 @@ providers: agents: root: - model: router/anthropic/claude-sonnet-4-0 + model: router/anthropic/claude-sonnet-4-5 ``` ### Azure OpenAI diff --git a/docs/providers/openai/index.md b/docs/providers/openai/index.md index c4c655885..965893e1a 100644 --- a/docs/providers/openai/index.md +++ b/docs/providers/openai/index.md @@ -45,7 +45,7 @@ models: | `gpt-4o` | Multimodal, 
balanced performance | | `gpt-4o-mini` | Cheapest, fast for simple tasks | -Find more model names at [modelname.ai](https://modelname.ai/). +Find more model names at [modelnames.ai](https://modelnames.ai/) or in the [official OpenAI docs](https://platform.openai.com/docs/models). ## Thinking Budget diff --git a/docs/tools/lsp/index.md b/docs/tools/lsp/index.md index 88ac3c6b6..61882e18e 100644 --- a/docs/tools/lsp/index.md +++ b/docs/tools/lsp/index.md @@ -45,7 +45,7 @@ The LSP toolset provides these tools to the agent: ```yaml agents: developer: - model: anthropic/claude-sonnet-4-0 + model: anthropic/claude-sonnet-4-5 description: Code developer with LSP support instruction: You are a software developer. toolsets: @@ -136,7 +136,7 @@ You can configure multiple LSP servers for different file types: ```yaml agents: polyglot: - model: anthropic/claude-sonnet-4-0 + model: anthropic/claude-sonnet-4-5 description: Multi-language developer instruction: You are a full-stack developer. toolsets: diff --git a/docs/tools/transfer-task/index.md b/docs/tools/transfer-task/index.md index c48eda6f4..806d09cdd 100644 --- a/docs/tools/transfer-task/index.md +++ b/docs/tools/transfer-task/index.md @@ -27,7 +27,7 @@ agents: sub_agents: [developer, researcher] developer: - model: anthropic/claude-sonnet-4-0 + model: anthropic/claude-sonnet-4-5 description: Expert software developer instruction: Write clean, production-ready code. toolsets: