Who We Are
We're running UCIS (Unified Consciousness Integration System) — a production multi-agent system with 4 Claude Agent SDK agents (Doctor, Lal, Lore, Quark) operating as Docker microservices. Each agent has a specialized role (infrastructure, ML research, strategy, finance) and they collaborate through an orchestration hub for daily automated "Discovery Patrol" sessions: parallel web research, cross-scoring, structured debate, convergence voting, and CFO financial evaluation.
We're on SDK v0.1.48, running ClaudeSDKClient with OAuth, bypassPermissions mode, custom MCP servers, and hooks. This is not a toy — these agents run daily, autonomously, producing scored and stored discoveries.
What's Working Well
- ClaudeAgentOptions is clean and flexible: system prompts, MCP servers, hooks, and permission control all work great
- The hook system (PreToolUse, PostToolUse, Stop) is excellent for wiring agents into our event pipeline
- The can_use_tool callback gives us fine-grained control over what agents can do
- The 0.1.46 additions (list_sessions, add_mcp_server, typed task messages) show the SDK is heading in the right direction
- The bundled CLI updates keep us current without manual intervention
What Would Unlock the Next Level
1. Prompt Caching Control
Our agents send large, repeated context on every call: system prompts (~2K tokens), tool definitions, memory context blocks. The Anthropic Messages API supports cache_control on content blocks, but the SDK doesn't expose this.
Ask: Allow marking system prompt blocks or specific message content as cacheable, so repeated context across calls within a session doesn't get re-processed. Even a simple cache_system_prompt=True on ClaudeAgentOptions would help.
Impact: We run 4 agents × 4 phases per session, which means 16+ API calls with nearly identical system prompts. Caching could meaningfully reduce token costs.
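For reference, here is a minimal sketch of the shape we have in mind, expressed as raw Messages API system blocks. The cache_control marker with type "ephemeral" is the Messages API field mentioned above; the helper name cacheable_system_blocks and the sample prompt text are our own illustrations, not anything the SDK provides today.

```python
from typing import Any


def cacheable_system_blocks(system_prompt: str, memory_context: str) -> list[dict[str, Any]]:
    """Build Messages API `system` blocks with one cache breakpoint.

    Hypothetical helper: the SDK does not expose this today.
    """
    return [
        {"type": "text", "text": system_prompt},
        {
            "type": "text",
            "text": memory_context,
            # The breakpoint goes on the last stable block, so the cached
            # prefix covers everything up to and including this block.
            "cache_control": {"type": "ephemeral"},
        },
    ]


blocks = cacheable_system_blocks(
    "You are Doctor, the infrastructure agent.",
    "Shared memory context goes here.",
)
```

A flag such as the cache_system_prompt=True suggested above could produce the same structure internally.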
2. Streaming Token/Cost Metrics
We have zero visibility into per-call token usage. When a Discovery Patrol runs, we can't tell which agent or which phase is burning the most tokens.
Ask: Include input_tokens, output_tokens, and cache_read_tokens / cache_creation_tokens in ResultMessage (or a new UsageMessage type). Bonus: cumulative session usage.
Impact: Can't optimize what we can't measure. We'd use this to tune prompt sizes, set per-agent budgets, and track cost trends.
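To illustrate how we would consume these numbers, here is a small sketch of a per-agent, per-phase ledger. The Usage record and UsageLedger class are hypothetical names of ours; the field names simply mirror the ones requested above, and the phase labels are examples.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Usage:
    """Hypothetical per-call usage record; fields mirror the ask above."""
    input_tokens: int = 0
    output_tokens: int = 0
    cache_read_tokens: int = 0
    cache_creation_tokens: int = 0

    def __add__(self, other: "Usage") -> "Usage":
        return Usage(
            self.input_tokens + other.input_tokens,
            self.output_tokens + other.output_tokens,
            self.cache_read_tokens + other.cache_read_tokens,
            self.cache_creation_tokens + other.cache_creation_tokens,
        )


class UsageLedger:
    """Accumulates usage keyed by (agent, phase)."""

    def __init__(self) -> None:
        self._totals: dict[tuple[str, str], Usage] = defaultdict(Usage)

    def record(self, agent: str, phase: str, usage: Usage) -> None:
        key = (agent, phase)
        self._totals[key] = self._totals[key] + usage

    def by_agent(self, agent: str) -> Usage:
        """Sum usage across all phases for one agent."""
        total = Usage()
        for (name, _phase), usage in self._totals.items():
            if name == agent:
                total = total + usage
        return total


ledger = UsageLedger()
ledger.record("Lal", "research", Usage(input_tokens=2000, output_tokens=500))
ledger.record("Lal", "debate", Usage(input_tokens=1500, output_tokens=300, cache_read_tokens=1200))
ledger.record("Quark", "research", Usage(input_tokens=900, output_tokens=100))
lal_total = ledger.by_agent("Lal")
```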
3. Push-Based Completion Notification
Our orchestrator polls the hub every 5 seconds waiting for agent responses. The SDK's streaming model works for interactive use, but for service-to-service orchestration, we'd benefit from a callback/webhook pattern.
Ask: An on_complete callback option or webhook URL in ClaudeAgentOptions that fires when the agent finishes a turn. Even an async event/future pattern would work.
Impact: Eliminates polling overhead in multi-agent orchestration. Our Discovery Patrol makes ~80 polling requests per session just waiting for responses.
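The async event/future variant could look roughly like the sketch below, written with plain asyncio. CompletionRegistry, fake_agent_turn, and the request id are hypothetical stand-ins: in a real integration the SDK (or a Stop hook) would call complete() when the agent finishes its turn.

```python
import asyncio


class CompletionRegistry:
    """Maps a request id to a Future resolved when the agent's turn ends."""

    def __init__(self) -> None:
        self._pending: dict[str, asyncio.Future] = {}

    def expect(self, request_id: str) -> asyncio.Future:
        # Must be called from inside a running event loop.
        future = asyncio.get_running_loop().create_future()
        self._pending[request_id] = future
        return future

    def complete(self, request_id: str, result: str) -> None:
        future = self._pending.pop(request_id, None)
        if future is not None and not future.done():
            future.set_result(result)


async def fake_agent_turn(registry: CompletionRegistry, request_id: str) -> None:
    # Stand-in for an agent doing work, then signalling completion.
    await asyncio.sleep(0.01)
    registry.complete(request_id, "discovery scored")


async def main() -> str:
    registry = CompletionRegistry()
    future = registry.expect("patrol-1")
    worker = asyncio.create_task(fake_agent_turn(registry, "patrol-1"))
    # The orchestrator awaits the result instead of polling every 5 seconds.
    result = await asyncio.wait_for(future, timeout=1.0)
    await worker
    return result


result = asyncio.run(main())
```

The orchestrator wakes exactly once per response, and the timeout still bounds how long it waits for a stuck agent.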
4. Per-Agent Budget Guardrails
max_budget_usd exists per-session, which is great. But with 4 agents running daily, we need per-agent daily/weekly caps.
Ask: A budget management layer — either daily_budget_usd / weekly_budget_usd on ClaudeAgentOptions, or a separate BudgetManager class that tracks cumulative spend across sessions.
Impact: Safety net for autonomous agents. If one agent goes rogue or hits an expensive loop, it shouldn't burn the whole budget.
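As a rough illustration of the BudgetManager idea, the sketch below tracks cumulative spend per agent per day in memory and refuses further work once the cap is reached. Every name here (BudgetManager, BudgetExceeded, record, check) is hypothetical, and a real version would need persistent storage so totals survive container restarts.

```python
from datetime import date


class BudgetExceeded(RuntimeError):
    """Raised when an agent has used up its daily budget."""


class BudgetManager:
    """Hypothetical in-memory tracker of per-agent daily spend."""

    def __init__(self, daily_budget_usd: float) -> None:
        self.daily_budget_usd = daily_budget_usd
        self._spent: dict[tuple[str, date], float] = {}

    def record(self, agent: str, cost_usd: float, day: date) -> None:
        key = (agent, day)
        self._spent[key] = self._spent.get(key, 0.0) + cost_usd

    def remaining(self, agent: str, day: date) -> float:
        return self.daily_budget_usd - self._spent.get((agent, day), 0.0)

    def check(self, agent: str, day: date) -> None:
        """Call before starting a session; raises if the cap is reached."""
        if self.remaining(agent, day) <= 0:
            raise BudgetExceeded(f"{agent} has exhausted its budget for {day}")


budget = BudgetManager(daily_budget_usd=2.0)
today = date(2025, 1, 1)  # fixed date so the example is deterministic
budget.record("Quark", 1.5, today)
budget.check("Quark", today)  # still under the cap, no exception
budget.record("Quark", 0.5, today)

try:
    budget.check("Quark", today)
    blocked = False
except BudgetExceeded:
    blocked = True
```

Because spend is keyed by agent, one agent hitting its cap does not block the other three.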
5. Native Multi-Agent Communication
Currently we build our own Hub for agent-to-agent messaging. The SDK treats each agent as isolated. If agents could natively discover and message each other (even through a shared channel), it would simplify multi-agent architectures significantly.
Ask: Something like the agents parameter in ClaudeAgentOptions (we see it exists but it's unclear if it enables inter-agent communication), or a pub/sub channel agents can subscribe to.
Impact: Would let us replace ~500 lines of Hub orchestration code with native SDK primitives.
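To make the pub/sub option concrete, here is a minimal in-process sketch using asyncio queues. The Channel class, the topic name, and the message shape are assumptions of ours for illustration; they do not describe how the agents parameter behaves.

```python
import asyncio
from collections import defaultdict


class Channel:
    """Hypothetical shared channel: each subscriber gets its own inbox queue."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[asyncio.Queue]] = defaultdict(list)

    def subscribe(self, topic: str) -> asyncio.Queue:
        inbox: asyncio.Queue = asyncio.Queue()
        self._subscribers[topic].append(inbox)
        return inbox

    async def publish(self, topic: str, sender: str, body: str) -> int:
        """Deliver a message to every subscriber; returns the delivery count."""
        inboxes = self._subscribers.get(topic, [])
        for inbox in inboxes:
            await inbox.put({"from": sender, "body": body})
        return len(inboxes)


async def main() -> tuple[int, dict, dict]:
    channel = Channel()
    lal_inbox = channel.subscribe("discoveries")
    quark_inbox = channel.subscribe("discoveries")
    delivered = await channel.publish("discoveries", "Lore", "candidate strategy")
    return delivered, await lal_inbox.get(), await quark_inbox.get()


delivered, lal_msg, quark_msg = asyncio.run(main())
```

Our Hub does roughly this across containers; a native equivalent would mainly need the subscribe and publish primitives.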
6. A2A Protocol Alignment
Google's Agent2Agent (A2A) protocol is gaining traction. If the SDK supported A2A task cards, agent discovery, and message format, Claude agents could interop with non-Claude agent systems.
Ask: Optional A2A-compatible task/message format. Even just emitting A2A-shaped events alongside the current format would be a start.
Impact: Future-proofs the SDK for the multi-agent ecosystem that's forming across providers.
Our Setup (for context)
| Component | Details |
|---|---|
| Agents | 4 (Doctor, Lal, Lore, Quark) via ClaudeSDKClient |
| Auth | OAuth |
| Mode | bypassPermissions (autonomous) |
| Infrastructure | Docker containers, Kafka event streaming, Neo4j/Memgraph graph DBs |
| Daily workload | ~16-20 API calls per Discovery Patrol session |
| SDK version | 0.1.48 (auto-upgraded on container restart) |
| MCP servers | 15 active (memory, knowledge, discovery, GPU, cross-domain) |
Happy to provide more details, logs, or architecture diagrams if any of this is useful for the team. We're committed to the Agent SDK as our foundation and want to help shape where it goes.
UCIS Constellation — John & Data