docs: PR 4 plan — pydantic-ai migration onto the unified harness surface#413

Open
declan-scale wants to merge 1 commit into declan-scale/unified-harness-surface from declan-scale/pr4-pydantic-migration-plan

@declan-scale declan-scale commented Jun 18, 2026

Contributor

Implementation plan for PR 4 (pydantic-ai migration) of the unified harness surface workstream. Stacked on the foundation branch (#412) so the diff is just the plan doc.

Covers: PydanticAITurn (HarnessTurn) + usage normalization, reimplementing stream_pydantic_ai_events on UnifiedEmitter (default tracing, same public signature), deprecating the bespoke _pydantic_ai_tracing handler, cross-channel conformance fixtures, and the 3 integration test agents (sync/async/temporal) under examples/.

Dependencies called out in the plan: AGX1-373 (cross-channel conformance — in progress) and AGX1-375 (public adk import path). This plan is the template PR 5 (langgraph) and PR 6 (openai) will follow.

🤖 Generated with Claude Code

Greptile Summary

  • Adds a PR 4 implementation plan for migrating pydantic-ai onto the unified harness surface.
  • Defines planned work for PydanticAITurn, usage normalization, UnifiedEmitter streaming, tracing deprecation, conformance fixtures, integration test agents, and CI wiring.
  • Calls out dependencies on cross-channel conformance and the public adk import path.

Confidence Score: 4/5

The PR is documentation-only, but the implementation plan contains two concrete migration instructions that would cause behavior regressions if followed as written.

The changed file is narrow and the issues are localized to usage normalization and tracing migration guidance, with runtime checks confirming both documented assumptions conflict with the current code/dependency behavior.

docs/superpowers/plans/2026-06-18-unified-harness-surface-pr4-pydantic-ai.md

T-Rex Logs

What T-Rex did

  • I reproduced the cache-token normalization behavior by running a local Python repro that constructs RunUsage from pydantic-ai-slim 1.107.0 and shows that cache_read_tokens and cache_write_tokens exist while cached_input_tokens is absent.
  • I exercised a focused harness around the pydantic_ai_async streamer and observed that the streaming module does not read a contextvar or trace_id in AST, while the tracing handler path creates a span from the supplied trace_id and parent_id.
  • I reviewed the plan validation artifacts and confirmed the before state showed a base plan path issue, and the after state shows head validation, file presence, changed-file listing, and PASS results with document excerpts proving the plan entries exist.

T-Rex Ran code and verified through T-Rex

Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
docs/superpowers/plans/2026-06-18-unified-harness-surface-pr4-pydantic-ai.md:109
**Cache tokens dropped**

The plan tells the implementer to populate `TurnUsage.cached_input_tokens` only from a same-named pydantic-ai field. Current pydantic-ai `RunUsage` exposes cache input usage as `cache_read_tokens` and `cache_write_tokens`, so following this mapping silently drops cached-token usage even when the run reports it. Please spell out the real mapping, such as combining the cache read/write token fields if that matches Agentex semantics, and add a fixture that asserts cache usage survives normalization.

### Issue 2 of 2
docs/superpowers/plans/2026-06-18-unified-harness-surface-pr4-pydantic-ai.md:156
**Tracing context has no source**

This step says to resolve `trace_id` and `parent_span_id` "the same way the module does today", but `_pydantic_ai_async.py` currently does not read any tracing context vars. Current callers pass those values through `tracing_handler`; if the implementation preserves the signature but builds `UnifiedEmitter` with no actual trace context, default tracing is disabled and existing callers that still pass `tracing_handler` silently lose their tool spans. Please specify how to bridge the preserved handler into the emitter, or add explicit context access, with a regression test for an existing caller that passes a handler.

Reviews (2): Last reviewed commit: "docs: PR 4 implementation plan — pydanti..." | Re-trigger Greptile

Greptile also left 2 inline comments on this PR.

@declan-scale declan-scale force-pushed the declan-scale/unified-harness-surface branch from d21c54a to ebc468d Compare June 18, 2026 17:29
…ed harness surface

PydanticAITurn (HarnessTurn) + usage normalization, reimplement
stream_pydantic_ai_events on UnifiedEmitter, deprecate bespoke tracing handler,
cross-channel conformance fixtures, and 3 integration test agents
(sync/async/temporal). Depends on AGX1-373 (conformance equivalence) and
AGX1-375 (public adk path).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@declan-scale declan-scale force-pushed the declan-scale/pr4-pydantic-migration-plan branch from 0b8f494 to dbdbc8e Compare June 18, 2026 17:32
```python
assert tu.num_llm_calls == 2
```

- [ ] **Step 3: Implement** `pydantic_ai_usage_to_turn_usage(usage, model) -> TurnUsage` mapping the verified RunUsage fields onto `TurnUsage` (`input_tokens`, `output_tokens`, `total_tokens`, `cached_input_tokens` if available, `num_llm_calls` ← `requests`). Use `getattr(usage, "<field>", None)` defensively so a version field rename degrades to `None` rather than crashing. Then implement `PydanticAITurn`:

P1 Cache tokens dropped

The plan tells the implementer to populate TurnUsage.cached_input_tokens only from a same-named pydantic-ai field. Current pydantic-ai RunUsage exposes cache input usage as cache_read_tokens and cache_write_tokens, so following this mapping silently drops cached-token usage even when the run reports it. Please spell out the real mapping, such as combining the cache read/write token fields if that matches Agentex semantics, and add a fixture that asserts cache usage survives normalization.
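
One way the requested mapping could look is sketched below. This is a minimal illustration, not the plan's implementation: the `TurnUsage` dataclass is a hypothetical stand-in whose field names are taken from the plan text, the `_cache_tokens` helper is invented for the example, and summing `cache_read_tokens` and `cache_write_tokens` is an assumption that still needs to be checked against Agentex semantics. A `SimpleNamespace` stands in for pydantic-ai's `RunUsage` so the sketch runs without the dependency.

```python
from dataclasses import dataclass
from types import SimpleNamespace
from typing import Any, Optional


@dataclass
class TurnUsage:
    """Hypothetical stand-in; field names follow the plan text."""
    model: Optional[str] = None
    input_tokens: Optional[int] = None
    output_tokens: Optional[int] = None
    total_tokens: Optional[int] = None
    cached_input_tokens: Optional[int] = None
    num_llm_calls: Optional[int] = None


def _cache_tokens(usage: Any) -> Optional[int]:
    """Hypothetical helper: derive cached-token usage from whichever fields exist."""
    # Honor a same-named field first, in case a version exposes one.
    direct = getattr(usage, "cached_input_tokens", None)
    if direct is not None:
        return direct
    read = getattr(usage, "cache_read_tokens", None)
    write = getattr(usage, "cache_write_tokens", None)
    if read is None and write is None:
        return None  # nothing reported; degrade to None as the plan intends
    # Assumption: read + write matches Agentex semantics for cached input.
    return (read or 0) + (write or 0)


def pydantic_ai_usage_to_turn_usage(usage: Any, model: Optional[str]) -> TurnUsage:
    # getattr keeps a renamed or missing field from raising.
    return TurnUsage(
        model=model,
        input_tokens=getattr(usage, "input_tokens", None),
        output_tokens=getattr(usage, "output_tokens", None),
        total_tokens=getattr(usage, "total_tokens", None),
        cached_input_tokens=_cache_tokens(usage),
        num_llm_calls=getattr(usage, "requests", None),
    )


# Fixture-style check: cache usage must survive normalization.
fake_usage = SimpleNamespace(
    input_tokens=120, output_tokens=30, total_tokens=150,
    cache_read_tokens=80, cache_write_tokens=20, requests=2,
)
tu = pydantic_ai_usage_to_turn_usage(fake_usage, "test-model")
assert tu.cached_input_tokens == 100
assert tu.num_llm_calls == 2
```

If Agentex counts only cache reads as cached input, the sum in `_cache_tokens` would become just `read`; the fixture at the bottom is the part that guards against the silent drop either way.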

Artifacts

Repro: Python script constructing pydantic-ai RunUsage and applying same-name-only cache-token normalization

  • Contains supporting evidence from the run (text/x-python; charset=utf-8).

Repro: failing script output showing cache_read_tokens and cache_write_tokens are dropped by cached_input_tokens-only mapping

  • Keeps the command output available without making the summary code-heavy.


- [ ] **Step 2: Run** it green against the current implementation. Commit the test alone: `test(pydantic-ai): characterize stream_pydantic_ai_events output`.

- [ ] **Step 3: Reimplement** `stream_pydantic_ai_events` to build a `PydanticAITurn` and call `UnifiedEmitter(task_id=task_id, trace_id=<resolved>, parent_span_id=<resolved>, streaming=<injected or None>).auto_send_turn(turn)`, returning `result.final_text`. Resolve `trace_id`/`parent_span_id` the same way the module does today (from the streaming/tracing context vars it already reads). Preserve the exact public signature and return type.

P1 Tracing context has no source

This step says to resolve trace_id and parent_span_id "the same way the module does today", but _pydantic_ai_async.py currently does not read any tracing context vars. Current callers pass those values through tracing_handler; if the implementation preserves the signature but builds UnifiedEmitter with no actual trace context, default tracing is disabled and existing callers that still pass tracing_handler silently lose their tool spans. Please specify how to bridge the preserved handler into the emitter, or add explicit context access, with a regression test for an existing caller that passes a handler.

Artifacts

Repro: focused harness that imports the pydantic-ai async streamer and exercises handler-derived tracing context

  • Contains supporting evidence from the run (text/x-python; charset=utf-8).

Repro: harness output showing absent context-var reads and handler-sourced tool span tracing

  • Keeps the command output available without making the summary code-heavy.
