docs: PR 4 plan — pydantic-ai migration onto the unified harness surface by declan-scale · Pull Request #413 · scaleapi/scale-agentex-python

declan-scale · 2026-06-18T17:19:43Z

Implementation plan for PR 4 (pydantic-ai migration) of the unified harness surface workstream. Stacked on the foundation branch (#412) so the diff is just the plan doc.

Covers: PydanticAITurn (HarnessTurn) + usage normalization, reimplementing stream_pydantic_ai_events on UnifiedEmitter (default tracing, same public signature), deprecating the bespoke _pydantic_ai_tracing handler, cross-channel conformance fixtures, and the 3 integration test agents (sync/async/temporal) under examples/.

Dependencies called out in the plan: AGX1-373 (cross-channel conformance — in progress) and AGX1-375 (public adk import path). This plan is the template PR 5 (langgraph) and PR 6 (openai) will follow.

🤖 Generated with Claude Code

Greptile Summary

Adds a PR 4 implementation plan for migrating pydantic-ai onto the unified harness surface.
Defines planned work for PydanticAITurn, usage normalization, UnifiedEmitter streaming, tracing deprecation, conformance fixtures, integration test agents, and CI wiring.
Calls out dependencies on cross-channel conformance and the public adk import path.

Confidence Score: 4/5

The PR is documentation-only, but the implementation plan contains two concrete migration instructions that would cause behavior regressions if followed as written.

The changed file is narrow and the issues are localized to usage normalization and tracing migration guidance, with runtime checks confirming both documented assumptions conflict with the current code/dependency behavior.

docs/superpowers/plans/2026-06-18-unified-harness-surface-pr4-pydantic-ai.md

T-Rex Logs

What T-Rex did

I reproduced the cache-token normalization behavior by running a local Python repro that constructs RunUsage from pydantic-ai-slim 1.107.0 and shows that cache_read_tokens and cache_write_tokens exist while cached_input_tokens is absent.
I exercised a focused harness around the pydantic_ai_async streamer and observed that the streaming module does not read a contextvar or trace_id in AST, while the tracing handler path creates a span from the supplied trace_id and parent_id.
I reviewed the plan validation artifacts and confirmed the before state showed a base plan path issue, and the after state shows head validation, file presence, changed-file listing, and PASS results with document excerpts proving the plan entries exist.
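
The second check above can be approximated with the standard `ast` module. The sketch below is an assumption about how such a scan might look, not the harness T-Rex actually ran. The helper names (`referenced_names`, `reads_trace_context`) and both sample source strings are made up for illustration and do not reproduce the real module contents.

```python
import ast
from typing import Set


def referenced_names(source: str) -> Set[str]:
    """Collect identifiers, attribute names, argument names, and imports."""
    names: Set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            names.add(node.id)
        elif isinstance(node, ast.Attribute):
            names.add(node.attr)
        elif isinstance(node, ast.arg):
            names.add(node.arg)
        elif isinstance(node, ast.keyword) and node.arg is not None:
            names.add(node.arg)
        elif isinstance(node, ast.alias):
            names.add(node.name)
    return names


def reads_trace_context(source: str) -> bool:
    """True if the source mentions a ContextVar or a trace_id anywhere."""
    wanted = {"ContextVar", "trace_id"}
    return bool(wanted & referenced_names(source))


# Illustrative sources only (hypothetical, not the real files).
STREAMER_LIKE = """
async def stream(task_id, events, tracing_handler=None):
    async for event in events:
        if tracing_handler is not None:
            tracing_handler.on_event(event)
"""

HANDLER_LIKE = """
def start_span(tracer, trace_id, parent_id):
    return tracer.span(trace_id=trace_id, parent_id=parent_id)
"""
```

Running `reads_trace_context` over the real `_pydantic_ai_async.py` source would be the equivalent of the observation reported here; a name-based scan like this can miss dynamic access such as `getattr(obj, "trace_id")`, so it is evidence rather than proof.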

Ran code and verified through T-Rex

Prompt To Fix All With AI

Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
docs/superpowers/plans/2026-06-18-unified-harness-surface-pr4-pydantic-ai.md:109
**Cache tokens dropped**

The plan tells the implementer to populate `TurnUsage.cached_input_tokens` only from a same-named pydantic-ai field. Current pydantic-ai `RunUsage` exposes cache input usage as `cache_read_tokens` and `cache_write_tokens`, so following this mapping silently drops cached-token usage even when the run reports it. Please spell out the real mapping, such as combining the cache read/write token fields if that matches Agentex semantics, and add a fixture that asserts cache usage survives normalization.

### Issue 2 of 2
docs/superpowers/plans/2026-06-18-unified-harness-surface-pr4-pydantic-ai.md:156
**Tracing context has no source**

This step says to resolve `trace_id` and `parent_span_id` "the same way the module does today", but `_pydantic_ai_async.py` currently does not read any tracing context vars. Current callers pass those values through `tracing_handler`; if the implementation preserves the signature but builds `UnifiedEmitter` with no actual trace context, default tracing is disabled and existing callers that still pass `tracing_handler` silently lose their tool spans. Please specify how to bridge the preserved handler into the emitter, or add explicit context access, with a regression test for an existing caller that passes a handler.

Reviews (2): Last reviewed commit: "docs: PR 4 implementation plan — pydanti..."

Greptile also left 2 inline comments on this PR.

…ed harness surface PydanticAITurn (HarnessTurn) + usage normalization, reimplement stream_pydantic_ai_events on UnifiedEmitter, deprecate bespoke tracing handler, cross-channel conformance fixtures, and 3 integration test agents (sync/async/temporal). Depends on AGX1-373 (conformance equivalence) and AGX1-375 (public adk path). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

greptile-apps · 2026-06-18T17:39:09Z

+    assert tu.num_llm_calls == 2
+```
+
+- [ ] **Step 3: Implement** `pydantic_ai_usage_to_turn_usage(usage, model) -> TurnUsage` mapping the verified RunUsage fields onto `TurnUsage` (`input_tokens`, `output_tokens`, `total_tokens`, `cached_input_tokens` if available, `num_llm_calls` ← `requests`). Use `getattr(usage, "<field>", None)` defensively so a version field rename degrades to `None` rather than crashing. Then implement `PydanticAITurn`:


Cache tokens dropped

The plan tells the implementer to populate TurnUsage.cached_input_tokens only from a same-named pydantic-ai field. Current pydantic-ai RunUsage exposes cache input usage as cache_read_tokens and cache_write_tokens, so following this mapping silently drops cached-token usage even when the run reports it. Please spell out the real mapping, such as combining the cache read/write token fields if that matches Agentex semantics, and add a fixture that asserts cache usage survives normalization.
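
One way the corrected mapping could look is sketched below. `TurnUsage` and `RunUsageStub` are hypothetical stand-ins so the snippet runs without pydantic-ai or Agentex installed, and the `model` field on `TurnUsage` is an assumption. Only the field names mentioned in the plan and in this comment are taken from the source. Summing cache reads and writes is itself an assumption about Agentex semantics, as the comment notes.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class TurnUsage:
    """Hypothetical stand-in for the Agentex TurnUsage model."""
    model: Optional[str] = None
    input_tokens: Optional[int] = None
    output_tokens: Optional[int] = None
    total_tokens: Optional[int] = None
    cached_input_tokens: Optional[int] = None
    num_llm_calls: Optional[int] = None


@dataclass
class RunUsageStub:
    """Stand-in carrying the RunUsage field names cited in the review."""
    input_tokens: int = 0
    output_tokens: int = 0
    cache_read_tokens: int = 0
    cache_write_tokens: int = 0
    requests: int = 0

    @property
    def total_tokens(self) -> int:
        return self.input_tokens + self.output_tokens


def _cached_input_tokens(usage: Any) -> Optional[int]:
    # Read both cache fields defensively; a renamed field degrades to None.
    fields = ("cache_read_tokens", "cache_write_tokens")
    present = [v for v in (getattr(usage, f, None) for f in fields) if v is not None]
    # Assumption: Agentex counts cache reads and cache writes together.
    return sum(present) if present else None


def pydantic_ai_usage_to_turn_usage(usage: Any, model: Optional[str]) -> TurnUsage:
    return TurnUsage(
        model=model,
        input_tokens=getattr(usage, "input_tokens", None),
        output_tokens=getattr(usage, "output_tokens", None),
        total_tokens=getattr(usage, "total_tokens", None),
        cached_input_tokens=_cached_input_tokens(usage),
        num_llm_calls=getattr(usage, "requests", None),
    )


def test_cache_usage_survives_normalization() -> None:
    usage = RunUsageStub(
        input_tokens=100, output_tokens=20,
        cache_read_tokens=60, cache_write_tokens=15, requests=2,
    )
    tu = pydantic_ai_usage_to_turn_usage(usage, "test-model")
    assert tu.cached_input_tokens == 75
    assert tu.num_llm_calls == 2
```

If Agentex treats `cached_input_tokens` as cache reads only, `_cached_input_tokens` would return just `cache_read_tokens`; either way the fixture pins the decision so a later field rename fails loudly in tests instead of silently reporting `None`.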

Artifacts

Repro: Python script constructing pydantic-ai RunUsage and applying same-name-only cache-token normalization

Contains supporting evidence from the run (text/x-python; charset=utf-8).

Repro: failing script output showing cache_read_tokens and cache_write_tokens are dropped by cached_input_tokens-only mapping

Keeps the command output available without making the summary code-heavy.

Ran code and verified through T-Rex

Prompt To Fix With AI

This is a comment left during a code review.
Path: docs/superpowers/plans/2026-06-18-unified-harness-surface-pr4-pydantic-ai.md
Line: 109
Comment: **Cache tokens dropped**

The plan tells the implementer to populate `TurnUsage.cached_input_tokens` only from a same-named pydantic-ai field. Current pydantic-ai `RunUsage` exposes cache input usage as `cache_read_tokens` and `cache_write_tokens`, so following this mapping silently drops cached-token usage even when the run reports it. Please spell out the real mapping, such as combining the cache read/write token fields if that matches Agentex semantics, and add a fixture that asserts cache usage survives normalization.

How can I resolve this? If you propose a fix, please make it concise.

greptile-apps · 2026-06-18T17:39:10Z

+
+- [ ] **Step 2: Run** it green against the current implementation. Commit the test alone: `test(pydantic-ai): characterize stream_pydantic_ai_events output`.
+
+- [ ] **Step 3: Reimplement** `stream_pydantic_ai_events` to build a `PydanticAITurn` and call `UnifiedEmitter(task_id=task_id, trace_id=<resolved>, parent_span_id=<resolved>, streaming=<injected or None>).auto_send_turn(turn)`, returning `result.final_text`. Resolve `trace_id`/`parent_span_id` the same way the module does today (from the streaming/tracing context vars it already reads). Preserve the exact public signature and return type.


Tracing context has no source

This step says to resolve trace_id and parent_span_id "the same way the module does today", but _pydantic_ai_async.py currently does not read any tracing context vars. Current callers pass those values through tracing_handler; if the implementation preserves the signature but builds UnifiedEmitter with no actual trace context, default tracing is disabled and existing callers that still pass tracing_handler silently lose their tool spans. Please specify how to bridge the preserved handler into the emitter, or add explicit context access, with a regression test for an existing caller that passes a handler.
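
A minimal sketch of the bridging option follows. `UnifiedEmitter` is reduced to a stub because its real constructor is not shown in this thread, and every helper name (`UnifiedEmitterStub`, `LegacyTracingHandler`, `resolve_trace_context`, `build_emitter`) is hypothetical. The assumption that the handler exposes `trace_id` and `parent_id` attributes is inferred from the T-Rex note that the handler path creates a span from a supplied trace_id and parent_id; the real handler may store them differently.

```python
from dataclasses import dataclass
from typing import Any, Optional, Tuple


@dataclass
class UnifiedEmitterStub:
    """Hypothetical stand-in; the real UnifiedEmitter may differ."""
    task_id: str
    trace_id: Optional[str] = None
    parent_span_id: Optional[str] = None

    @property
    def tracing_enabled(self) -> bool:
        # Assumption from the review: no trace context disables default tracing.
        return self.trace_id is not None


@dataclass
class LegacyTracingHandler:
    """Hypothetical shape of the handler existing callers pass in."""
    trace_id: str
    parent_id: Optional[str] = None


def resolve_trace_context(
    tracing_handler: Optional[Any],
) -> Tuple[Optional[str], Optional[str]]:
    """Pull trace context out of a legacy handler, if one was supplied."""
    if tracing_handler is None:
        return None, None
    return (
        getattr(tracing_handler, "trace_id", None),
        getattr(tracing_handler, "parent_id", None),
    )


def build_emitter(task_id: str, tracing_handler: Optional[Any] = None) -> UnifiedEmitterStub:
    # Bridge the preserved handler argument into the emitter's trace context.
    trace_id, parent_span_id = resolve_trace_context(tracing_handler)
    return UnifiedEmitterStub(
        task_id=task_id, trace_id=trace_id, parent_span_id=parent_span_id
    )


def test_existing_caller_keeps_tracing() -> None:
    handler = LegacyTracingHandler(trace_id="trace-1", parent_id="span-0")
    emitter = build_emitter("task-1", tracing_handler=handler)
    assert emitter.tracing_enabled
    assert emitter.parent_span_id == "span-0"
```

The regression test stands in for the one the comment asks for: an existing caller that passes a handler should still end up with tracing enabled. The alternative raised in the comment, explicit context access, would replace `resolve_trace_context` with a lookup against whatever context source the plan names.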

Artifacts

Repro: focused harness that imports the pydantic-ai async streamer and exercises handler-derived tracing context

Contains supporting evidence from the run (text/x-python; charset=utf-8).

Repro: harness output showing absent context-var reads and handler-sourced tool span tracing

Keeps the command output available without making the summary code-heavy.

Ran code and verified through T-Rex

Prompt To Fix With AI

This is a comment left during a code review.
Path: docs/superpowers/plans/2026-06-18-unified-harness-surface-pr4-pydantic-ai.md
Line: 156
Comment: **Tracing context has no source**

This step says to resolve `trace_id` and `parent_span_id` "the same way the module does today", but `_pydantic_ai_async.py` currently does not read any tracing context vars. Current callers pass those values through `tracing_handler`; if the implementation preserves the signature but builds `UnifiedEmitter` with no actual trace context, default tracing is disabled and existing callers that still pass `tracing_handler` silently lose their tool spans. Please specify how to bridge the preserved handler into the emitter, or add explicit context access, with a regression test for an existing caller that passes a handler.

How can I resolve this? If you propose a fix, please make it concise.

greptile-apps Bot reviewed Jun 18, 2026

declan-scale force-pushed the declan-scale/unified-harness-surface branch from d21c54a to ebc468d on June 18, 2026 17:29

declan-scale force-pushed the declan-scale/pr4-pydantic-migration-plan branch from 0b8f494 to dbdbc8e on June 18, 2026 17:32

greptile-apps Bot reviewed Jun 18, 2026


declan-scale wants to merge 1 commit into declan-scale/unified-harness-surface from declan-scale/pr4-pydantic-migration-plan


