docs: PR 4 plan — pydantic-ai migration onto the unified harness surface#413
…ed harness surface PydanticAITurn (HarnessTurn) + usage normalization, reimplement stream_pydantic_ai_events on UnifiedEmitter, deprecate bespoke tracing handler, cross-channel conformance fixtures, and 3 integration test agents (sync/async/temporal). Depends on AGX1-373 (conformance equivalence) and AGX1-375 (public adk path). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
| assert tu.num_llm_calls == 2 | ||
| ``` | ||
- [ ] **Step 3: Implement** `pydantic_ai_usage_to_turn_usage(usage, model) -> TurnUsage` mapping the verified RunUsage fields onto `TurnUsage` (`input_tokens`, `output_tokens`, `total_tokens`, `cached_input_tokens` if available, `num_llm_calls` ← `requests`). Use `getattr(usage, "<field>", None)` defensively so a version field rename degrades to `None` rather than crashing. Then implement `PydanticAITurn`:
**Cache tokens dropped**

Path: docs/superpowers/plans/2026-06-18-unified-harness-surface-pr4-pydantic-ai.md
Line: 109

The plan tells the implementer to populate `TurnUsage.cached_input_tokens` only from a same-named pydantic-ai field. Current pydantic-ai `RunUsage` exposes cache input usage as `cache_read_tokens` and `cache_write_tokens`, so following this mapping silently drops cached-token usage even when the run reports it. Please spell out the real mapping, such as combining the cache read/write token fields if that matches Agentex semantics, and add a fixture that asserts cache usage survives normalization.

Ran code and verified through T-Rex.
- [ ] **Step 2: Run** it green against the current implementation. Commit the test alone: `test(pydantic-ai): characterize stream_pydantic_ai_events output`.

- [ ] **Step 3: Reimplement** `stream_pydantic_ai_events` to build a `PydanticAITurn` and call `UnifiedEmitter(task_id=task_id, trace_id=<resolved>, parent_span_id=<resolved>, streaming=<injected or None>).auto_send_turn(turn)`, returning `result.final_text`. Resolve `trace_id`/`parent_span_id` the same way the module does today (from the streaming/tracing context vars it already reads). Preserve the exact public signature and return type.
**Tracing context has no source**

Path: docs/superpowers/plans/2026-06-18-unified-harness-surface-pr4-pydantic-ai.md
Line: 156

This step says to resolve `trace_id` and `parent_span_id` "the same way the module does today", but `_pydantic_ai_async.py` currently does not read any tracing context vars. Current callers pass those values through `tracing_handler`; if the implementation preserves the signature but builds `UnifiedEmitter` with no actual trace context, default tracing is disabled and existing callers that still pass `tracing_handler` silently lose their tool spans. Please specify how to bridge the preserved handler into the emitter, or add explicit context access, with a regression test for an existing caller that passes a handler.

Artifacts
- Repro: harness output showing absent context-var reads and handler-sourced tool span tracing (text/x-python; charset=utf-8).

Ran code and verified through T-Rex.
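A rough sketch of one way to bridge the preserved handler into the emitter, under stated assumptions. Every name below (`resolve_trace_context`, `FakeTracingHandler`, the two context vars, and the `trace_id`/`parent_span_id` attributes on the handler) is hypothetical; the real `tracing_handler` may expose these values differently. The idea is to prefer values from the legacy handler, then fall back to explicit context vars.

```python
import contextvars
from dataclasses import dataclass
from typing import Any, Optional, Tuple

# Hypothetical context vars. The review notes the module reads none today,
# so these would have to be added explicitly.
_trace_id_var = contextvars.ContextVar("trace_id", default=None)
_parent_span_id_var = contextvars.ContextVar("parent_span_id", default=None)


@dataclass
class FakeTracingHandler:
    """Stand-in for a caller-supplied tracing_handler; attribute names are assumed."""
    trace_id: Optional[str] = None
    parent_span_id: Optional[str] = None


def resolve_trace_context(tracing_handler: Any = None) -> Tuple[Optional[str], Optional[str]]:
    """Prefer values carried by the legacy handler, then fall back to context vars."""
    trace_id = getattr(tracing_handler, "trace_id", None) or _trace_id_var.get()
    parent_span_id = (
        getattr(tracing_handler, "parent_span_id", None) or _parent_span_id_var.get()
    )
    return trace_id, parent_span_id


# Regression-style checks for the three caller situations.
# 1. Existing caller that passes a handler keeps its trace context.
from_handler = resolve_trace_context(FakeTracingHandler("trace-1", "span-1"))
assert from_handler == ("trace-1", "span-1")

# 2. No handler and no context: both values are None (tracing stays off).
assert resolve_trace_context() == (None, None)

# 3. No handler, but context vars are set: they supply the values.
token = _trace_id_var.set("trace-ctx")
try:
    from_context = resolve_trace_context()
finally:
    _trace_id_var.reset(token)
assert from_context == ("trace-ctx", None)
```

The resolved pair would then feed the `trace_id` and `parent_span_id` arguments of `UnifiedEmitter` shown in Step 3, which keeps the public signature of `stream_pydantic_ai_events` unchanged.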
Implementation plan for PR 4 (pydantic-ai migration) of the unified harness surface workstream. Stacked on the foundation branch (#412) so the diff is just the plan doc.

Covers: `PydanticAITurn(HarnessTurn)` + usage normalization, reimplementing `stream_pydantic_ai_events` on `UnifiedEmitter` (default tracing, same public signature), deprecating the bespoke `_pydantic_ai_tracing` handler, cross-channel conformance fixtures, and the 3 integration test agents (sync/async/temporal) under `examples/`.

Dependencies called out in the plan: AGX1-373 (cross-channel conformance, in progress) and AGX1-375 (public `adk` import path). This plan is the template PR 5 (langgraph) and PR 6 (openai) will follow.

🤖 Generated with Claude Code
Greptile Summary
- … `PydanticAITurn`, usage normalization, `UnifiedEmitter` streaming, tracing deprecation, conformance fixtures, integration test agents, and CI wiring.
- … `adk` import path.

Confidence Score: 4/5
The PR is documentation-only, but the implementation plan contains two concrete migration instructions that would cause behavior regressions if followed as written.
The changed file is narrow and the issues are localized to usage normalization and tracing migration guidance, with runtime checks confirming both documented assumptions conflict with the current code/dependency behavior.
docs/superpowers/plans/2026-06-18-unified-harness-surface-pr4-pydantic-ai.md
Reviews (2): Last reviewed commit: "docs: PR 4 implementation plan — pydanti..." | Re-trigger Greptile