Commit 00b255f

viniciusdsmello and claude authored and committed
feat(closes OPEN-10634): Claude Agent SDK Python integration
Add tracing integration for Anthropic's Claude Agent SDK (claude-agent-sdk on PyPI).

- One-line setup: trace_claude_agent_sdk() monkey-patches claude_agent_sdk.query and ClaudeSDKClient so every call is auto-traced. traced_query() available for per-call scoping.
- Trace shape: root AGENT step "Claude Agent SDK query" per query() call, with nested CHAT_COMPLETION per assistant turn and TOOL per tool call. Subagent dispatches via the Agent tool become nested AGENT steps; the subagent's chats and tools nest underneath via parent_tool_use_id.
- Hooks (PreToolUse / PostToolUse / PostToolUseFailure) compose with any user-provided hooks; we never replace.
- Captures: cost, tokens, duration, session_id, model, system_prompt, agent_config (resolved tools / mcp_servers / skills / plugins), agents_defined (subagent definitions), permission_denials, model_usage, raw assistant messages, raw ResultMessage. MCP server env / headers / authorization are redacted by default.

Tests: 16 unit tests + 1 live integration test (gated on ANTHROPIC_API_KEY). Live test runs successfully against the real SDK and Openlayer ingest.

Example: examples/tracing/claude_agent_sdk/claude_agent_sdk_tracing.ipynb covers three scenarios in one notebook: basic query with built-in tools, MCP + subagent dispatch, and multi-stage orchestration wrapping multiple query() calls in tracer.create_step().

Closes OPEN-10634. Parallel TypeScript work in openlayer-ai/openlayer-ts under OPEN-10635.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
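
The trace shape in the commit message, where a subagent's chats and tools nest under the dispatching tool call via parent_tool_use_id, can be pictured with a small sketch. This is an illustration only, not the integration's code: the `Event` and `Step` records, their field names other than `parent_tool_use_id`, and `build_tree` are hypothetical, and the sketch assumes events arrive with each parent before its children.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Event:
    """Hypothetical flat record for one assistant turn or tool call."""

    name: str
    tool_use_id: Optional[str] = None  # set on tool calls
    parent_tool_use_id: Optional[str] = None  # set on events emitted inside a subagent


@dataclass
class Step:
    """Hypothetical trace node."""

    name: str
    children: list["Step"] = field(default_factory=list)


def build_tree(events: list[Event], root_name: str = "Claude Agent SDK query") -> Step:
    """Attach each event under the step whose tool_use_id matches its parent id."""
    root = Step(root_name)
    by_tool_use_id: dict[str, Step] = {}
    for event in events:
        step = Step(event.name)
        # Events without a known parent hang directly off the root step.
        parent = by_tool_use_id.get(event.parent_tool_use_id or "", root)
        parent.children.append(step)
        if event.tool_use_id:
            by_tool_use_id[event.tool_use_id] = step
    return root


events = [
    Event("chat"),
    Event("Agent: code-reviewer", tool_use_id="toolu_1"),
    Event("chat", parent_tool_use_id="toolu_1"),
    Event("Read", tool_use_id="toolu_2", parent_tool_use_id="toolu_1"),
]
tree = build_tree(events)
```

Here the subagent dispatch becomes a child of the root, and the subagent's own chat and `Read` call become its children, which mirrors the nesting the commit message describes.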
1 parent 52a313b commit 00b255f
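
The commit message states that the tracing hooks compose with user-provided hooks rather than replacing them. One way to read that is sketched below. It is an assumption about the general approach, not the integration's implementation: `compose_hooks` is a hypothetical helper, and plain callables stand in for whatever hook objects the SDK actually expects.

```python
from typing import Any, Callable, Optional

# Hypothetical shape: event name -> ordered list of hook callables.
HookMap = dict[str, list[Callable[..., Any]]]


def compose_hooks(user_hooks: Optional[HookMap], tracing_hooks: HookMap) -> HookMap:
    """Return a new map with user hooks first and tracing hooks appended."""
    merged: HookMap = {event: list(hooks) for event, hooks in (user_hooks or {}).items()}
    for event, hooks in tracing_hooks.items():
        merged.setdefault(event, []).extend(hooks)
    return merged


def user_pre_tool(*args: Any, **kwargs: Any) -> None:
    """Stand-in for a hook the application registered itself."""


def trace_pre_tool(*args: Any, **kwargs: Any) -> None:
    """Stand-in for a tracing hook."""


def trace_post_tool(*args: Any, **kwargs: Any) -> None:
    """Stand-in for a tracing hook."""


user = {"PreToolUse": [user_pre_tool]}
merged = compose_hooks(user, {"PreToolUse": [trace_pre_tool], "PostToolUse": [trace_post_tool]})
```

Building a new map instead of mutating the caller's keeps the user's configuration untouched, so removing the tracing layer later would not leave stray hooks behind.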

8 files changed

Lines changed: 2463 additions & 1 deletion

examples/tracing/claude_agent_sdk/claude_agent_sdk_tracing.ipynb

Lines changed: 284 additions & 0 deletions
@@ -0,0 +1,284 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/openlayer-ai/openlayer-python/blob/main/examples/tracing/claude_agent_sdk/claude_agent_sdk_tracing.ipynb)\n",
+    "\n",
+    "# Tracing the Claude Agent SDK with Openlayer\n",
+    "\n",
+    "This notebook shows how to enable Openlayer tracing for applications built with Anthropic's [Claude Agent SDK](https://github.com/anthropics/claude-agent-sdk-python). After one line of setup, every `query()` becomes an Openlayer trace with nested steps for assistant turns, tool calls (built-in + MCP), subagents, session metadata, cost, and tokens.\n",
+    "\n",
+    "Three scenarios, building up in complexity:\n",
+    "\n",
+    "1. **Quickstart** — single `query()` with built-in tools (Read / Glob / Grep)\n",
+    "2. **MCP + subagent** — register an in-process MCP tool, dispatch a subagent\n",
+    "3. **Multi-stage orchestration** — wrap multiple `query()` calls inside one outer step so the whole pipeline is a single trace"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 1. Install dependencies"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "!pip install openlayer 'claude-agent-sdk>=0.1.81'"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 2. Set environment variables\n",
+    "\n",
+    "You need three secrets:\n",
+    "\n",
+    "- `OPENLAYER_API_KEY` — get from [openlayer.com/settings/api-keys](https://app.openlayer.com/settings/api-keys)\n",
+    "- `OPENLAYER_INFERENCE_PIPELINE_ID` — the inference pipeline you want to stream traces to\n",
+    "- `ANTHROPIC_API_KEY` — your Anthropic API key"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import os\n",
+    "\n",
+    "os.environ[\"OPENLAYER_API_KEY\"] = \"YOUR_OPENLAYER_API_KEY\"\n",
+    "os.environ[\"OPENLAYER_INFERENCE_PIPELINE_ID\"] = \"YOUR_INFERENCE_PIPELINE_ID\"\n",
+    "os.environ[\"ANTHROPIC_API_KEY\"] = \"YOUR_ANTHROPIC_API_KEY\""
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 3. Enable tracing — one line\n",
+    "\n",
+    "`trace_claude_agent_sdk()` monkey-patches `claude_agent_sdk.query` and `ClaudeSDKClient` so every subsequent call is auto-traced. It composes with any hooks you've configured yourself — your hooks are not replaced."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from openlayer.lib import trace_claude_agent_sdk\n",
+    "\n",
+    "trace_claude_agent_sdk()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 4. Scenario 1 — quickstart\n",
+    "\n",
+    "A simple `query()` with read-only built-in tools. The resulting trace contains one root `Claude Agent SDK query` AGENT step with nested `CHAT_COMPLETION` turns and `TOOL` calls."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from claude_agent_sdk import ResultMessage, ClaudeAgentOptions, query\n",
+    "\n",
+    "\n",
+    "async def scenario_1():\n",
+    "    options = ClaudeAgentOptions(\n",
+    "        model=\"claude-haiku-4-5\",\n",
+    "        allowed_tools=[\"Read\", \"Glob\", \"Grep\"],\n",
+    "    )\n",
+    "    async for message in query(\n",
+    "        prompt=\"Find any .py files in the current directory and tell me roughly what they do.\",\n",
+    "        options=options,\n",
+    "    ):\n",
+    "        if isinstance(message, ResultMessage):\n",
+    "            print(message.result)  # noqa: T201\n",
+    "\n",
+    "\n",
+    "await scenario_1()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 5. Scenario 2 — in-process MCP tool + subagent\n",
+    "\n",
+    "Register a custom MCP tool that counts files by extension, and dispatch a `code-reviewer` subagent. In the trace, the MCP call appears as a `TOOL` step with `metadata.mcp_server=\"file-stats\"` and `metadata.mcp_tool_name=\"count_files\"`. The subagent dispatch appears as a nested `AGENT` step (`Agent: code-reviewer`) containing the subagent's own assistant turns and tool calls."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from pathlib import Path\n",
+    "from collections import Counter\n",
+    "\n",
+    "from claude_agent_sdk import AgentDefinition, tool, create_sdk_mcp_server\n",
+    "\n",
+    "\n",
+    "@tool(\"count_files\", \"Count files in a directory grouped by extension\", {\"directory\": str})\n",
+    "async def count_files(args):\n",
+    "    target = Path(args[\"directory\"]).expanduser().resolve()\n",
+    "    if not target.is_dir():\n",
+    "        return {\"content\": [{\"type\": \"text\", \"text\": f\"Not a directory: {target}\"}], \"isError\": True}\n",
+    "    counts = Counter()\n",
+    "    for f in target.rglob(\"*\"):\n",
+    "        if f.is_file():\n",
+    "            counts[f.suffix or \"(no ext)\"] += 1\n",
+    "    body = \"\\n\".join(f\"{ext}: {n}\" for ext, n in counts.most_common(20))\n",
+    "    return {\"content\": [{\"type\": \"text\", \"text\": body or \"(empty)\"}]}\n",
+    "\n",
+    "\n",
+    "mcp_server = create_sdk_mcp_server(\"file-stats\", \"1.0.0\", tools=[count_files])\n",
+    "\n",
+    "code_reviewer = AgentDefinition(\n",
+    "    description=\"Briefly reviews a code file for clarity, correctness, and style.\",\n",
+    "    prompt=(\n",
+    "        \"You are a senior code reviewer. Read the file the user names, then return ONE \"\n",
+    "        \"specific observation about its quality. Two sentences max.\"\n",
+    "    ),\n",
+    "    tools=[\"Read\", \"Grep\"],\n",
+    "    model=\"claude-haiku-4-5\",\n",
+    ")\n",
+    "\n",
+    "\n",
+    "async def scenario_2():\n",
+    "    options = ClaudeAgentOptions(\n",
+    "        model=\"claude-haiku-4-5\",\n",
+    "        system_prompt=(\n",
+    "            \"You are a codebase explorer. Count files in the directory, then dispatch \"\n",
+    "            \"the code-reviewer subagent on ONE interesting file. Output a 2-line summary.\"\n",
+    "        ),\n",
+    "        # Subagent tools must also be in the session's allowed_tools.\n",
+    "        allowed_tools=[\"Glob\", \"Read\", \"Grep\", \"Agent\", \"mcp__file-stats__count_files\"],\n",
+    "        mcp_servers={\"file-stats\": mcp_server},\n",
+    "        agents={\"code-reviewer\": code_reviewer},\n",
+    "        permission_mode=\"acceptEdits\",\n",
+    "        max_turns=10,\n",
+    "    )\n",
+    "    async for message in query(\n",
+    "        prompt=f\"Analyze the directory at: {Path.cwd()}\",\n",
+    "        options=options,\n",
+    "    ):\n",
+    "        if isinstance(message, ResultMessage):\n",
+    "            print(message.result)  # noqa: T201\n",
+    "\n",
+    "\n",
+    "await scenario_2()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 6. Scenario 3 — multi-stage orchestration\n",
+    "\n",
+    "When you want multiple `query()` calls to appear as one trace, wrap them in `tracer.create_step()`. Each inner `query()` becomes a nested `AGENT` step under your outer step.\n",
+    "\n",
+    "This example splits an audit workflow into two phases: an inventory query, then a review query that dispatches a specialist subagent. Both are children of one outer `codebase-audit` AGENT step."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from openlayer.lib.tracing import tracer\n",
+    "from openlayer.lib.tracing.enums import StepType\n",
+    "\n",
+    "\n",
+    "async def phase_inventory():\n",
+    "    options = ClaudeAgentOptions(\n",
+    "        model=\"claude-haiku-4-5\",\n",
+    "        system_prompt=(\n",
+    "            \"Inventory the current working directory and pick ONE .py file. \"\n",
+    "            \"End your last message with: TARGET: <absolute path>\"\n",
+    "        ),\n",
+    "        allowed_tools=[\"Glob\", \"Read\", \"mcp__file-stats__count_files\"],\n",
+    "        mcp_servers={\"file-stats\": mcp_server},\n",
+    "        max_turns=6,\n",
+    "    )\n",
+    "    async for message in query(prompt=f\"Working directory: {Path.cwd()}\", options=options):\n",
+    "        if isinstance(message, ResultMessage):\n",
+    "            for line in reversed((message.result or \"\").splitlines()):\n",
+    "                if line.strip().startswith(\"TARGET:\"):\n",
+    "                    return line.strip()[len(\"TARGET:\"):].strip()\n",
+    "    return None\n",
+    "\n",
+    "\n",
+    "async def phase_review(target):\n",
+    "    options = ClaudeAgentOptions(\n",
+    "        model=\"claude-haiku-4-5\",\n",
+    "        system_prompt=\"Dispatch code-reviewer on the file and return its observation verbatim.\",\n",
+    "        allowed_tools=[\"Agent\", \"Read\", \"Grep\"],\n",
+    "        agents={\"code-reviewer\": code_reviewer},\n",
+    "        permission_mode=\"acceptEdits\",\n",
+    "        max_turns=6,\n",
+    "    )\n",
+    "    async for message in query(prompt=f\"Review this file: {target}\", options=options):\n",
+    "        if isinstance(message, ResultMessage):\n",
+    "            return message.result\n",
+    "    return None\n",
+    "\n",
+    "\n",
+    "with tracer.create_step(name=\"codebase-audit\", step_type=StepType.AGENT) as outer:\n",
+    "    target = await phase_inventory()\n",
+    "    review = await phase_review(target) if target else None\n",
+    "    outer.output = review or \"(no review produced)\"\n",
+    "    outer.log(metadata={\"audited_file\": target})\n",
+    "\n",
+    "print(\"audited:\", target)  # noqa: T201\n",
+    "print(\"\\nreview:\\n\", review)  # noqa: T201"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 7. What to look for in the Openlayer trace\n",
+    "\n",
+    "Open your inference pipeline and click into each trace. You should see:\n",
+    "\n",
+    "**Scenario 1** — a single root `AGENT` step (`Claude Agent SDK query`) with assistant turn(s) and tool calls as children.\n",
+    "\n",
+    "**Scenario 2** — same root, plus a `TOOL` step for the MCP call (with `metadata.mcp_server` and `metadata.mcp_tool_name`) and a nested `AGENT` step named `Agent: code-reviewer` containing the subagent's own chat completions and tool steps.\n",
+    "\n",
+    "**Scenario 3** — one outer `codebase-audit` AGENT step, with two nested `Claude Agent SDK query` AGENT steps inside it (one per phase), and the review phase contains its own `Agent: code-reviewer` nested step.\n",
+    "\n",
+    "Click any `AGENT` step to see `system_prompt`, `agent_config`, `agents_defined`, `options`, and the raw `ResultMessage`. Click any `CHAT_COMPLETION` step for per-turn model, prompt/completion tokens, thinking content, and raw assistant message. Click any `TOOL` step for input, output, latency, and the originating `tool_use_id`."
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "name": "python",
+   "version": "3.10"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
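
The notebook lists the MCP tool in `allowed_tools` as `mcp__file-stats__count_files`, while the trace reports `metadata.mcp_server="file-stats"` and `metadata.mcp_tool_name="count_files"`. A minimal sketch of how such a name could be split is shown below. `split_mcp_tool_name` is a hypothetical helper, not a function from the SDK or from Openlayer, and it assumes the `mcp__<server>__<tool>` pattern visible in the notebook.

```python
from typing import Optional


def split_mcp_tool_name(tool_name: str) -> Optional[tuple[str, str]]:
    """Return (server, tool) for 'mcp__<server>__<tool>', or None for other tools."""
    prefix = "mcp__"
    if not tool_name.startswith(prefix):
        return None
    # Split on the first double underscore after the prefix.
    server, separator, tool = tool_name[len(prefix):].partition("__")
    if not separator or not server or not tool:
        return None
    return server, tool


mcp_parts = split_mcp_tool_name("mcp__file-stats__count_files")
builtin_parts = split_mcp_tool_name("Read")
```

Splitting on the first `__` means a server name that itself contains a double underscore would be split in the wrong place; a real implementation would need to decide how to handle that case.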

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -159,7 +159,7 @@ exclude = [
   ".git",
 ]
 
-ignore = ["src/openlayer/lib/*", "examples/*"]
+ignore = ["src/openlayer/lib/*", "examples/*", "tests/integrations/*"]
 
 reportImplicitOverride = true
 reportOverlappingOverload = false

src/openlayer/lib/__init__.py

Lines changed: 76 additions & 0 deletions
@@ -20,6 +20,8 @@
     "trace_portkey",
     "trace_google_adk",
     "unpatch_google_adk",
+    "trace_claude_agent_sdk",
+    "traced_claude_agent_sdk_query",
     "trace_gemini",
     "update_current_trace",
     "update_current_step",
@@ -315,6 +317,80 @@ def unpatch_google_adk():
     return google_adk_tracer.unpatch_google_adk()
 
 
+# ------------------------------ Claude Agent SDK ---------------------------- #
+def trace_claude_agent_sdk(
+    *,
+    inference_pipeline_id=None,
+    truncate_tool_output_chars: int = 8192,
+    capture_thinking: bool = True,
+    redact_mcp_env: bool = True,
+):
+    """Enable Openlayer tracing for the Claude Agent SDK.
+
+    Monkey-patches ``claude_agent_sdk.query`` and ``ClaudeSDKClient`` so every
+    call becomes an Openlayer trace with nested steps for assistant turns,
+    tool calls (including MCP and subagent calls), session metadata, cost,
+    and tokens.
+
+    Requirements:
+        ``claude-agent-sdk>=0.1.81`` must be installed:
+        ``pip install 'claude-agent-sdk>=0.1.81'``
+
+    Args:
+        inference_pipeline_id: Optional Openlayer inference pipeline ID. Falls
+            back to the ``OPENLAYER_INFERENCE_PIPELINE_ID`` env var.
+        truncate_tool_output_chars: Maximum characters of tool output to
+            capture per TOOL step. Defaults to 8192.
+        capture_thinking: Whether to capture ``ThinkingBlock`` content into
+            chat-completion step metadata. Defaults to True.
+        redact_mcp_env: Whether to strip ``env`` and ``headers`` from MCP
+            server config dicts in trace metadata. Defaults to True.
+
+    Example:
+        >>> import os
+        >>> os.environ["OPENLAYER_API_KEY"] = "..."
+        >>> os.environ["OPENLAYER_INFERENCE_PIPELINE_ID"] = "..."
+        >>> os.environ["ANTHROPIC_API_KEY"] = "..."
+        >>> from openlayer.lib import trace_claude_agent_sdk
+        >>> trace_claude_agent_sdk()
+        >>>
+        >>> from claude_agent_sdk import query, ClaudeAgentOptions
+        >>> async for m in query(prompt="hello", options=ClaudeAgentOptions(model="claude-haiku-4-5")):
+        ...     ...
+    """
+    # pylint: disable=import-outside-toplevel
+    from .integrations import claude_agent_sdk as _integration
+
+    return _integration.trace_claude_agent_sdk(
+        inference_pipeline_id=inference_pipeline_id,
+        truncate_tool_output_chars=truncate_tool_output_chars,
+        capture_thinking=capture_thinking,
+        redact_mcp_env=redact_mcp_env,
+    )
+
+
+def traced_claude_agent_sdk_query(*, prompt, options=None, inference_pipeline_id=None, **kwargs):
+    """Per-call wrapper around ``claude_agent_sdk.query()`` (alternative to global init).
+
+    Returns an async iterator that yields the same messages as ``query()`` while
+    emitting an Openlayer trace as a side effect.
+
+    Example:
+        >>> from openlayer.lib import traced_claude_agent_sdk_query
+        >>> async for m in traced_claude_agent_sdk_query(prompt="hello"):
+        ...     ...
+    """
+    # pylint: disable=import-outside-toplevel
+    from .integrations import claude_agent_sdk as _integration
+
+    return _integration.traced_query(
+        prompt=prompt,
+        options=options,
+        inference_pipeline_id=inference_pipeline_id,
+        **kwargs,
+    )
+
+
 # -------------------------------- Google Gemini --------------------------------- #
 def trace_gemini(client):
     """Trace Google Gemini chat completions."""

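The `redact_mcp_env` and `truncate_tool_output_chars` arguments documented in the docstring above describe two sanitising steps applied before data reaches the trace. The sketch below shows one plausible reading of them. It is an assumption, not the code in `openlayer.lib.integrations`: the helper names, the `[REDACTED]` placeholder, the truncation suffix, and the `github` server config are all hypothetical.

```python
from typing import Any

# Keys treated as sensitive, following the commit message (env / headers / authorization).
SENSITIVE_KEYS = {"env", "headers", "authorization"}


def redact_mcp_servers(mcp_servers: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the MCP server configs with sensitive values masked."""
    redacted: dict[str, Any] = {}
    for name, config in mcp_servers.items():
        if isinstance(config, dict):
            redacted[name] = {
                key: "[REDACTED]" if key.lower() in SENSITIVE_KEYS else value
                for key, value in config.items()
            }
        else:
            # Non-dict configs are summarised by type name only.
            redacted[name] = type(config).__name__
    return redacted


def truncate_tool_output(text: str, max_chars: int = 8192) -> str:
    """Keep at most max_chars characters and note how many were dropped."""
    if len(text) <= max_chars:
        return text
    return text[:max_chars] + f"... [truncated {len(text) - max_chars} chars]"


servers = {"github": {"command": "npx", "env": {"TOKEN": "s3cret"}}}
safe_servers = redact_mcp_servers(servers)
short_output = truncate_tool_output("a" * 10, max_chars=4)
```

Masking the value instead of dropping the key keeps it visible in the trace that a server was configured with environment variables or headers, without exposing what they contain.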