|
| 1 | +{ |
| 2 | + "cells": [ |
| 3 | + { |
| 4 | + "cell_type": "markdown", |
| 5 | + "metadata": {}, |
| 6 | + "source": [ |
| 7 | + "[Open in Colab](https://colab.research.google.com/github/openlayer-ai/openlayer-python/blob/main/examples/tracing/claude_agent_sdk/claude_agent_sdk_tracing.ipynb)\n", |
| 8 | + "\n", |
| 9 | + "# Tracing the Claude Agent SDK with Openlayer\n", |
| 10 | + "\n", |
| 11 | + "This notebook shows how to enable Openlayer tracing for applications built with Anthropic's [Claude Agent SDK](https://github.com/anthropics/claude-agent-sdk-python). After one line of setup, every `query()` call becomes an Openlayer trace with nested steps for assistant turns, tool calls (built-in and MCP), and subagents, along with session metadata, cost, and token usage.\n", |
| 12 | + "\n", |
| 13 | + "Three scenarios, building up in complexity:\n", |
| 14 | + "\n", |
| 15 | + "1. **Quickstart** — single `query()` with built-in tools (Read / Glob / Grep)\n", |
| 16 | + "2. **MCP + subagent** — register an in-process MCP tool, dispatch a subagent\n", |
| 17 | + "3. **Multi-stage orchestration** — wrap multiple `query()` calls inside one outer step so the whole pipeline is a single trace" |
| 18 | + ] |
| 19 | + }, |
| 20 | + { |
| 21 | + "cell_type": "markdown", |
| 22 | + "metadata": {}, |
| 23 | + "source": [ |
| 24 | + "## 1. Install dependencies" |
| 25 | + ] |
| 26 | + }, |
| 27 | + { |
| 28 | + "cell_type": "code", |
| 29 | + "execution_count": null, |
| 30 | + "metadata": {}, |
| 31 | + "outputs": [], |
| 32 | + "source": [ |
| 33 | + "!pip install openlayer 'claude-agent-sdk>=0.1.81'" |
| 34 | + ] |
| 35 | + }, |
| 36 | + { |
| 37 | + "cell_type": "markdown", |
| 38 | + "metadata": {}, |
| 39 | + "source": [ |
| 40 | + "## 2. Set environment variables\n", |
| 41 | + "\n", |
| 42 | + "You need three values:\n", |
| 43 | + "\n", |
| 44 | + "- `OPENLAYER_API_KEY`: create one at [app.openlayer.com/settings/api-keys](https://app.openlayer.com/settings/api-keys)\n", |
| 45 | + "- `OPENLAYER_INFERENCE_PIPELINE_ID`: the ID of the inference pipeline you want to stream traces to\n", |
| 46 | + "- `ANTHROPIC_API_KEY`: your Anthropic API key" |
| 47 | + ] |
| 48 | + }, |
| 49 | + { |
| 50 | + "cell_type": "code", |
| 51 | + "execution_count": null, |
| 52 | + "metadata": {}, |
| 53 | + "outputs": [], |
| 54 | + "source": [ |
| 55 | + "import os\n", |
| 56 | + "\n", |
| 57 | + "os.environ[\"OPENLAYER_API_KEY\"] = \"YOUR_OPENLAYER_API_KEY\"\n", |
| 58 | + "os.environ[\"OPENLAYER_INFERENCE_PIPELINE_ID\"] = \"YOUR_INFERENCE_PIPELINE_ID\"\n", |
| 59 | + "os.environ[\"ANTHROPIC_API_KEY\"] = \"YOUR_ANTHROPIC_API_KEY\"" |
| 60 | + ] |
| 61 | + }, |
| 62 | + { |
| 63 | + "cell_type": "markdown", |
| 64 | + "metadata": {}, |
| 65 | + "source": [ |
| 66 | + "## 3. Enable tracing — one line\n", |
| 67 | + "\n", |
| 68 | + "`trace_claude_agent_sdk()` monkey-patches `claude_agent_sdk.query` and `ClaudeSDKClient` so every subsequent call is auto-traced. It composes with any hooks you've configured yourself — your hooks are not replaced." |
| 69 | + ] |
| 70 | + }, |
| 71 | + { |
| 72 | + "cell_type": "code", |
| 73 | + "execution_count": null, |
| 74 | + "metadata": {}, |
| 75 | + "outputs": [], |
| 76 | + "source": [ |
| 77 | + "from openlayer.lib import trace_claude_agent_sdk\n", |
| 78 | + "\n", |
| 79 | + "trace_claude_agent_sdk()" |
| 80 | + ] |
| 81 | + }, |
| 82 | + { |
| 83 | + "cell_type": "markdown", |
| 84 | + "metadata": {}, |
| 85 | + "source": [ |
| 86 | + "## 4. Scenario 1 — quickstart\n", |
| 87 | + "\n", |
| 88 | + "A simple `query()` with read-only built-in tools. The resulting trace contains one root `Claude Agent SDK query` AGENT step with nested `CHAT_COMPLETION` turns and `TOOL` calls." |
| 89 | + ] |
| 90 | + }, |
| 91 | + { |
| 92 | + "cell_type": "code", |
| 93 | + "execution_count": null, |
| 94 | + "metadata": {}, |
| 95 | + "outputs": [], |
| 96 | + "source": [ |
| 97 | + "from claude_agent_sdk import ResultMessage, ClaudeAgentOptions, query\n", |
| 98 | + "\n", |
| 99 | + "\n", |
| 100 | + "async def scenario_1():\n", |
| 101 | + " options = ClaudeAgentOptions(\n", |
| 102 | + " model=\"claude-haiku-4-5\",\n", |
| 103 | + " allowed_tools=[\"Read\", \"Glob\", \"Grep\"],\n", |
| 104 | + " )\n", |
| 105 | + " async for message in query(\n", |
| 106 | + " prompt=\"Find any .py files in the current directory and tell me roughly what they do.\",\n", |
| 107 | + " options=options,\n", |
| 108 | + " ):\n", |
| 109 | + " if isinstance(message, ResultMessage):\n", |
| 110 | + " print(message.result) # noqa: T201\n", |
| 111 | + "\n", |
| 112 | + "\n", |
| 113 | + "await scenario_1()" |
| 114 | + ] |
| 115 | + }, |
| 116 | + { |
| 117 | + "cell_type": "markdown", |
| 118 | + "metadata": {}, |
| 119 | + "source": [ |
| 120 | + "## 5. Scenario 2 — in-process MCP tool + subagent\n", |
| 121 | + "\n", |
| 122 | + "Register a custom MCP tool that counts files by extension, and dispatch a `code-reviewer` subagent. In the trace, the MCP call appears as a `TOOL` step with `metadata.mcp_server=\"file-stats\"` and `metadata.mcp_tool_name=\"count_files\"`. The subagent dispatch appears as a nested `AGENT` step (`Agent: code-reviewer`) containing the subagent's own assistant turns and tool calls." |
| 123 | + ] |
| 124 | + }, |
| 125 | + { |
| 126 | + "cell_type": "code", |
| 127 | + "execution_count": null, |
| 128 | + "metadata": {}, |
| 129 | + "outputs": [], |
| 130 | + "source": [ |
| 131 | + "from pathlib import Path\n", |
| 132 | + "from collections import Counter\n", |
| 133 | + "\n", |
| 134 | + "from claude_agent_sdk import AgentDefinition, tool, create_sdk_mcp_server\n", |
| 135 | + "\n", |
| 136 | + "\n", |
| 137 | + "@tool(\"count_files\", \"Count files in a directory grouped by extension\", {\"directory\": str})\n", |
| 138 | + "async def count_files(args):\n", |
| 139 | + " target = Path(args[\"directory\"]).expanduser().resolve()\n", |
| 140 | + " if not target.is_dir():\n", |
| 141 | + " return {\"content\": [{\"type\": \"text\", \"text\": f\"Not a directory: {target}\"}], \"isError\": True}\n", |
| 142 | + " counts = Counter()\n", |
| 143 | + " for f in target.rglob(\"*\"):\n", |
| 144 | + " if f.is_file():\n", |
| 145 | + " counts[f.suffix or \"(no ext)\"] += 1\n", |
| 146 | + " body = \"\\n\".join(f\"{ext}: {n}\" for ext, n in counts.most_common(20))\n", |
| 147 | + " return {\"content\": [{\"type\": \"text\", \"text\": body or \"(empty)\"}]}\n", |
| 148 | + "\n", |
| 149 | + "\n", |
| 150 | + "mcp_server = create_sdk_mcp_server(\"file-stats\", \"1.0.0\", tools=[count_files])\n", |
| 151 | + "\n", |
| 152 | + "code_reviewer = AgentDefinition(\n", |
| 153 | + " description=\"Briefly reviews a code file for clarity, correctness, and style.\",\n", |
| 154 | + " prompt=(\n", |
| 155 | + " \"You are a senior code reviewer. Read the file the user names, then return ONE \"\n", |
| 156 | + " \"specific observation about its quality. Two sentences max.\"\n", |
| 157 | + " ),\n", |
| 158 | + " tools=[\"Read\", \"Grep\"],\n", |
| 159 | + " model=\"claude-haiku-4-5\",\n", |
| 160 | + ")\n", |
| 161 | + "\n", |
| 162 | + "\n", |
| 163 | + "async def scenario_2():\n", |
| 164 | + " options = ClaudeAgentOptions(\n", |
| 165 | + " model=\"claude-haiku-4-5\",\n", |
| 166 | + " system_prompt=(\n", |
| 167 | + " \"You are a codebase explorer. Count files in the directory, then dispatch \"\n", |
| 168 | + " \"the code-reviewer subagent on ONE interesting file. Output a 2-line summary.\"\n", |
| 169 | + " ),\n", |
| 170 | + " # Subagent tools must also be in the session's allowed_tools.\n", |
| 171 | + " allowed_tools=[\"Glob\", \"Read\", \"Grep\", \"Agent\", \"mcp__file-stats__count_files\"],\n", |
| 172 | + " mcp_servers={\"file-stats\": mcp_server},\n", |
| 173 | + " agents={\"code-reviewer\": code_reviewer},\n", |
| 174 | + " permission_mode=\"acceptEdits\",\n", |
| 175 | + " max_turns=10,\n", |
| 176 | + " )\n", |
| 177 | + " async for message in query(\n", |
| 178 | + " prompt=f\"Analyze the directory at: {Path.cwd()}\",\n", |
| 179 | + " options=options,\n", |
| 180 | + " ):\n", |
| 181 | + " if isinstance(message, ResultMessage):\n", |
| 182 | + " print(message.result) # noqa: T201\n", |
| 183 | + "\n", |
| 184 | + "\n", |
| 185 | + "await scenario_2()" |
| 186 | + ] |
| 187 | + }, |
| 188 | + { |
| 189 | + "cell_type": "markdown", |
| 190 | + "metadata": {}, |
| 191 | + "source": [ |
| 192 | + "## 6. Scenario 3 — multi-stage orchestration\n", |
| 193 | + "\n", |
| 194 | + "When you want multiple `query()` calls to appear as one trace, wrap them in `tracer.create_step()`. Each inner `query()` becomes a nested `AGENT` step under your outer step.\n", |
| 195 | + "\n", |
| 196 | + "This example splits an audit workflow into two phases: an inventory query, then a review query that dispatches a specialist subagent. Both are children of one outer `codebase-audit` AGENT step." |
| 197 | + ] |
| 198 | + }, |
| 199 | + { |
| 200 | + "cell_type": "code", |
| 201 | + "execution_count": null, |
| 202 | + "metadata": {}, |
| 203 | + "outputs": [], |
| 204 | + "source": [ |
| 205 | + "from openlayer.lib.tracing import tracer\n", |
| 206 | + "from openlayer.lib.tracing.enums import StepType\n", |
| 207 | + "\n", |
| 208 | + "\n", |
| 209 | + "async def phase_inventory():\n", |
| 210 | + " options = ClaudeAgentOptions(\n", |
| 211 | + " model=\"claude-haiku-4-5\",\n", |
| 212 | + " system_prompt=(\n", |
| 213 | + " \"Inventory the current working directory and pick ONE .py file. \"\n", |
| 214 | + " \"End your last message with: TARGET: <absolute path>\"\n", |
| 215 | + " ),\n", |
| 216 | + " allowed_tools=[\"Glob\", \"Read\", \"mcp__file-stats__count_files\"],\n", |
| 217 | + " mcp_servers={\"file-stats\": mcp_server},\n", |
| 218 | + " max_turns=6,\n", |
| 219 | + " )\n", |
| 220 | + " async for message in query(prompt=f\"Working directory: {Path.cwd()}\", options=options):\n", |
| 221 | + " if isinstance(message, ResultMessage):\n", |
| 222 | + " for line in reversed((message.result or \"\").splitlines()):\n", |
| 223 | + " if line.strip().startswith(\"TARGET:\"):\n", |
| 224 | + " return line.strip()[len(\"TARGET:\"):].strip()\n", |
| 225 | + " return None\n", |
| 226 | + "\n", |
| 227 | + "\n", |
| 228 | + "async def phase_review(target):\n", |
| 229 | + " options = ClaudeAgentOptions(\n", |
| 230 | + " model=\"claude-haiku-4-5\",\n", |
| 231 | + " system_prompt=\"Dispatch code-reviewer on the file and return its observation verbatim.\",\n", |
| 232 | + " allowed_tools=[\"Agent\", \"Read\", \"Grep\"],\n", |
| 233 | + " agents={\"code-reviewer\": code_reviewer},\n", |
| 234 | + " permission_mode=\"acceptEdits\",\n", |
| 235 | + " max_turns=6,\n", |
| 236 | + " )\n", |
| 237 | + " async for message in query(prompt=f\"Review this file: {target}\", options=options):\n", |
| 238 | + " if isinstance(message, ResultMessage):\n", |
| 239 | + " return message.result\n", |
| 240 | + " return None\n", |
| 241 | + "\n", |
| 242 | + "\n", |
| 243 | + "with tracer.create_step(name=\"codebase-audit\", step_type=StepType.AGENT) as outer:\n", |
| 244 | + " target = await phase_inventory()\n", |
| 245 | + " review = await phase_review(target) if target else None\n", |
| 246 | + " outer.output = review or \"(no review produced)\"\n", |
| 247 | + " outer.log(metadata={\"audited_file\": target})\n", |
| 248 | + "\n", |
| 249 | + "print(\"audited:\", target) # noqa: T201\n", |
| 250 | + "print(\"\\nreview:\\n\", review) # noqa: T201" |
| 251 | + ] |
| 252 | + }, |
| 253 | + { |
| 254 | + "cell_type": "markdown", |
| 255 | + "metadata": {}, |
| 256 | + "source": [ |
| 257 | + "## 7. What to look for in the Openlayer trace\n", |
| 258 | + "\n", |
| 259 | + "Open your inference pipeline and click into each trace. You should see:\n", |
| 260 | + "\n", |
| 261 | + "**Scenario 1** — a single root `AGENT` step (`Claude Agent SDK query`) with assistant turn(s) and tool calls as children.\n", |
| 262 | + "\n", |
| 263 | + "**Scenario 2** — same root, plus a `TOOL` step for the MCP call (with `metadata.mcp_server` and `metadata.mcp_tool_name`) and a nested `AGENT` step named `Agent: code-reviewer` containing the subagent's own chat completions and tool steps.\n", |
| 264 | + "\n", |
| 265 | + "**Scenario 3** — one outer `codebase-audit` AGENT step, with two nested `Claude Agent SDK query` AGENT steps inside it (one per phase), and the review phase contains its own `Agent: code-reviewer` nested step.\n", |
| 266 | + "\n", |
| 267 | + "Click any `AGENT` step to see `system_prompt`, `agent_config`, `agents_defined`, `options`, and the raw `ResultMessage`. Click any `CHAT_COMPLETION` step for per-turn model, prompt/completion tokens, thinking content, and raw assistant message. Click any `TOOL` step for input, output, latency, and the originating `tool_use_id`." |
| 268 | + ] |
| 269 | + } |
| 270 | + ], |
| 271 | + "metadata": { |
| 272 | + "kernelspec": { |
| 273 | + "display_name": "Python 3", |
| 274 | + "language": "python", |
| 275 | + "name": "python3" |
| 276 | + }, |
| 277 | + "language_info": { |
| 278 | + "name": "python", |
| 279 | + "version": "3.10" |
| 280 | + } |
| 281 | + }, |
| 282 | + "nbformat": 4, |
| 283 | + "nbformat_minor": 4 |
| 284 | +} |