Type the agent NDJSON events and HF /splits parsing with Pydantic #157
Merged
Two stringly-typed boundaries now carry typed models, following the
auth/flow.py + account.py precedent:
- agent/events.py: the `assembly agent --json` event stream was hand-built
`{"type": …}` dicts the type checker only saw as `dict[str, str]`. Each
event is now a closed, frozen Pydantic model whose `type` literal and payload
are pinned at type-check time, so a renamed key or mistyped `type` can't drift
onto the wire. AgentRenderer emits via `model_dump()`; the wire shapes are
byte-identical (existing render goldens unchanged).
- evaluate/_hf_api.py: `split_entries()` returned bare dicts and subset/split
selection read `str(entry.get("config"))`. A `_SplitEntry` model (validated
via a module-level TypeAdapter) gives `pick_subset`/`pick_split` typed fields
and turns a malformed /splits payload into a clean APIError instead of a
stringified "None".
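As a rough illustration of the agent-events half, here is a minimal sketch of a closed, frozen event model, assuming Pydantic v2. The event names (`ToolCallEvent`, `DoneEvent`), their payload fields, and the `emit()` helper are hypothetical and not taken from `agent/events.py`; only the general pattern (a `Literal` `type` field, a frozen model, and `model_dump()` for emission) comes from the description above.

```python
import json
from typing import Literal

from pydantic import BaseModel, ConfigDict, ValidationError


class _Event(BaseModel):
    # frozen=True makes instances immutable; extra="forbid" rejects unknown
    # keys, which is one way to read "closed" in the description.
    model_config = ConfigDict(frozen=True, extra="forbid")


class ToolCallEvent(_Event):  # hypothetical event, for illustration only
    type: Literal["tool_call"] = "tool_call"
    name: str


class DoneEvent(_Event):  # hypothetical event, for illustration only
    type: Literal["done"] = "done"
    reason: str


def emit(event: _Event) -> str:
    """Serialize one event as a single NDJSON line (without the newline)."""
    return json.dumps(event.model_dump())


line = emit(ToolCallEvent(name="search"))

# A misspelled key is rejected at construction time (and flagged by a type
# checker) instead of drifting onto the wire.
try:
    ToolCallEvent(nam="search")  # type: ignore[call-arg]
    rejected = False
except ValidationError:
    rejected = True
```

With a plain dict, the misspelled key would have been emitted silently; with the model, both the type checker and runtime validation catch it.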
https://claude.ai/code/session_01AzkXsmPQSoUJjPgJY6qvGB
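For the `/splits` half, the following is a minimal sketch of validating entries through a module-level `TypeAdapter`, assuming Pydantic v2. The `APIError` class here is a stand-in for the project's own exception, `SplitEntry` mirrors `_SplitEntry` only loosely (the real field set is not shown in this PR text), and the `{"splits": [...]}` envelope is assumed from the public Hugging Face datasets-server response shape.

```python
from pydantic import BaseModel, TypeAdapter, ValidationError


class APIError(Exception):
    """Stand-in for the project's APIError (assumption)."""


class SplitEntry(BaseModel):
    # Assumed fields; unknown keys such as "dataset" are ignored by default.
    config: str
    split: str


# Built once at import time so the validation schema is not rebuilt per call.
_SPLITS = TypeAdapter(list[SplitEntry])


def split_entries(payload: object) -> list[SplitEntry]:
    """Validate a /splits payload, raising APIError if it is malformed."""
    if not isinstance(payload, dict):
        raise APIError("unexpected /splits response")
    try:
        return _SPLITS.validate_python(payload.get("splits"))
    except ValidationError as exc:
        raise APIError(
            f"malformed /splits payload: {exc.error_count()} error(s)"
        ) from exc


entries = split_entries(
    {"splits": [{"dataset": "d", "config": "default", "split": "train"}]}
)
```

An entry missing `config` now fails validation up front, rather than flowing into subset selection as the string `"None"`.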
Extends the agent-events pattern to `assembly stream --json`. The begin/turn/termination records were hand-built dicts assembled with `jsonshape.compact` + `_with_source`; each is now a closed, frozen Pydantic model whose `type` literal and payload are pinned at type-check time.

The two presence rules are preserved in a shared `wire()`: the optional annotations `source` (parallel system/you streams) and `speaker` (--speaker-labels diarization) drop out of the record when absent, while the core payload (`id`, `audio_duration_seconds`) stays present even when null.

The `id` field is `session_id` in Python with a serialization_alias so the model stays self-contained (no flake8-builtins A003 carve-out). Wire output is byte-identical; the existing render goldens are unchanged.

https://claude.ai/code/session_01AzkXsmPQSoUJjPgJY6qvGB
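The two presence rules and the `id` alias could be sketched as follows, assuming Pydantic v2. The class names, the `transcript` field, and the body of `wire()` are hypothetical; only the rules themselves (drop `source`/`speaker` when absent, keep `id` and `audio_duration_seconds` even when null, expose `session_id` as `id`) come from the commit message. Key order here simply follows field declaration order and is not meant to reproduce the real wire layout.

```python
from typing import Any, Literal, Optional

from pydantic import BaseModel, ConfigDict, Field

# Annotations that disappear from the record when they are not set.
_OPTIONAL_ANNOTATIONS = ("source", "speaker")


class _Record(BaseModel):
    model_config = ConfigDict(frozen=True, extra="forbid")

    source: Optional[str] = None

    def wire(self) -> dict[str, Any]:
        """Dump by alias, then drop absent optional annotations only."""
        data = self.model_dump(by_alias=True)
        for key in _OPTIONAL_ANNOTATIONS:
            if key in data and data[key] is None:
                del data[key]
        return data


class Begin(_Record):  # hypothetical shape
    type: Literal["begin"] = "begin"
    # Named session_id in Python to avoid shadowing the builtin `id`;
    # serialized as "id".
    session_id: Optional[str] = Field(default=None, serialization_alias="id")


class Turn(_Record):  # hypothetical shape
    type: Literal["turn"] = "turn"
    transcript: str
    speaker: Optional[str] = None


class Termination(_Record):  # hypothetical shape
    type: Literal["termination"] = "termination"
    audio_duration_seconds: Optional[float] = None


begin = Begin().wire()
turn = Turn(transcript="hello", speaker="A", source="system").wire()
end = Termination().wire()
```

In this sketch a null `id` or `audio_duration_seconds` survives as an explicit `null`, while an unset `source` or `speaker` leaves no key at all.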