fix: improve truncation-aware parse failure logging #754
Normalize Anthropic stop reasons into completion choices and prefer canonical finish_reason metadata when detecting max_tokens truncation. Add async scheduler coverage so dropped rows retain the actionable max_tokens guidance.
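As a rough illustration of the normalization described above, the sketch below copies an Anthropic-style `stop_reason` into `choices[0].finish_reason` on a canonical response object. The `Choice` and `CanonicalResponse` classes and the `translate_anthropic_response` function are hypothetical stand-ins, not the project's actual types. Passing the stop reason through unchanged is also an assumption; the real adapter may map values differently.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Choice:
    """Hypothetical canonical completion choice."""
    text: str
    finish_reason: Optional[str] = None


@dataclass
class CanonicalResponse:
    """Hypothetical canonical response; keeps the raw payload for fallback checks."""
    choices: List[Choice] = field(default_factory=list)
    raw: Optional[Dict[str, Any]] = None


def translate_anthropic_response(raw: Dict[str, Any]) -> CanonicalResponse:
    """Build a canonical response from an Anthropic-style message payload.

    Assumption: stop_reason is passed through unchanged, so a truncated
    response ends up with finish_reason == "max_tokens".
    """
    # Concatenate only the text blocks; tool_use and other blocks are skipped.
    text = "".join(
        block.get("text", "")
        for block in raw.get("content", [])
        if block.get("type") == "text"
    )
    choice = Choice(text=text, finish_reason=raw.get("stop_reason"))
    return CanonicalResponse(choices=[choice], raw=raw)
```

With the finish reason on the canonical choice, downstream code can check one field instead of inspecting provider-specific payloads.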
Linked Issue Check
Issue #411 has not been triaged yet. A maintainer needs to review … You can continue working on the PR in the meantime. The check will …
Greptile Summary
This PR surfaces actionable truncation guidance when a parse or recipe failure is caused by a model response that hit …

| Filename | Overview |
|---|---|
| packages/data-designer-engine/src/data_designer/engine/models/facade.py | Adds _response_was_truncated_by_max_tokens helper and propagates truncated_by_max_tokens to all _build_generation_validation_error callsites in both sync and async generate loops; logic is correct and well-guarded. |
| packages/data-designer-engine/src/data_designer/engine/models/errors.py | Adds truncated_by_max_tokens field to both GenerationValidationFailureError and ModelGenerationValidationFailureError, and forks handle_llm_exceptions to emit a distinct truncation-specific message when the flag is set. |
| packages/data-designer-engine/src/data_designer/engine/models/clients/adapters/anthropic_translation.py | Populates choices[0].finish_reason from Anthropic stop_reason so the canonical truncation check in facade.py can detect max_tokens without falling back to the raw response dict. |
| packages/data-designer-engine/src/data_designer/engine/dataset_builders/dataset_builder.py | Appends a truncation-specific hint to the per-record drop warning when exc.truncated_by_max_tokens is truthy; uses getattr to stay safe against exception types that don't carry the attribute. |
| packages/data-designer-engine/tests/engine/models/test_facade.py | Adds parametrized tests for all six truncation-detection paths for both sync and async generate. |
| packages/data-designer-engine/tests/engine/models/clients/test_anthropic.py | Adds assertions that choices[0].finish_reason is populated for text, max_tokens, and tool_use stop reasons. |
| packages/data-designer-engine/tests/engine/models/test_model_errors.py | Adds regression for truncation-specific error message path and extends existing test to assert truncated_by_max_tokens=False on the non-truncated path. |
| packages/data-designer-engine/tests/engine/dataset_builders/test_dataset_builder.py | New test drives _worker_error_callback with a truncated-parse-failure exception and verifies the warning text contains both the truncation cause and the max_tokens remediation advice. |
| packages/data-designer-engine/tests/engine/dataset_builders/test_async_scheduler.py | Adds MockTruncatedParseFailureGenerator and an async scheduler test that confirms the row is dropped and the truncation guidance appears in the warning log. |
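
The `getattr` guard described for `dataset_builder.py` can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's code: `ParseFailure`, `TRUNCATION_HINT`, `build_drop_warning`, and the message wording are all hypothetical. Only the `truncated_by_max_tokens` attribute name comes from the table above.

```python
class ParseFailure(Exception):
    """Hypothetical stand-in for an error type carrying the truncation flag."""

    def __init__(self, message: str, truncated_by_max_tokens: bool = False) -> None:
        super().__init__(message)
        self.truncated_by_max_tokens = truncated_by_max_tokens


# Hypothetical wording; the actual hint text in the PR may differ.
TRUNCATION_HINT = (
    " The model response appears to have been cut off by max_tokens;"
    " consider increasing max_tokens."
)


def build_drop_warning(record_index: int, exc: Exception) -> str:
    """Compose the per-record drop warning, adding the hint only when flagged."""
    message = f"Dropping record {record_index}: {exc}"
    # getattr with a default keeps this safe for exceptions without the attribute.
    if getattr(exc, "truncated_by_max_tokens", False):
        message += TRUNCATION_HINT
    return message
```

Because the default is `False`, an unrelated exception such as `ValueError` produces the generic warning.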
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[Model response received] --> B[ParserException raised]
B --> C[_response_was_truncated_by_max_tokens]
C --> D{canonical choices finish_reason is length or max_tokens?}
D -- Yes --> E[truncated = True]
D -- No --> F{raw response available?}
F -- No --> G[truncated = False]
F -- Yes --> H{raw choices finish_reason is length?}
H -- Yes --> E
H -- No --> I{raw stop_reason is max_tokens?}
I -- Yes --> E
I -- No --> G
E --> J[_build_generation_validation_error with truncated=True]
G --> K[_build_generation_validation_error with truncated=False]
J --> L[handle_llm_exceptions emits truncation-specific message]
K --> M[handle_llm_exceptions emits generic message]
L --> N[ModelGenerationValidationFailureError truncated=True]
M --> O[ModelGenerationValidationFailureError truncated=False]
N --> P[dataset_builder warning includes max_tokens hint]
O --> Q[dataset_builder warning generic]
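
The decision path in the flowchart can be approximated with the sketch below. It is not the PR's `_response_was_truncated_by_max_tokens` helper: the function name, its signature, and the assumption that the raw response is a plain dict are all illustrative. It also checks every raw choice rather than only the first, which is another assumption.

```python
from typing import Any, Dict, Optional

# Finish reasons treated as truncation on the canonical choice (per the flowchart).
CANONICAL_TRUNCATION_REASONS = {"length", "max_tokens"}


def response_was_truncated(
    canonical_finish_reason: Optional[str],
    raw: Optional[Dict[str, Any]] = None,
) -> bool:
    """Return True if the response appears to have ended because of max_tokens."""
    # 1. Prefer the canonical finish_reason when it already signals truncation.
    if canonical_finish_reason in CANONICAL_TRUNCATION_REASONS:
        return True
    # 2. Without a raw response there is nothing left to inspect.
    if not raw:
        return False
    # 3. OpenAI-style raw payload: choices[*].finish_reason == "length".
    for choice in raw.get("choices") or []:
        if isinstance(choice, dict) and choice.get("finish_reason") == "length":
            return True
    # 4. Anthropic-style raw payload: top-level stop_reason == "max_tokens".
    return raw.get("stop_reason") == "max_tokens"
```

For example, `response_was_truncated("stop", {"stop_reason": "max_tokens"})` falls through the canonical check and is caught by the raw `stop_reason` fallback.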
Reviews (1): Last reviewed commit: "fix: use finish reasons for truncation g..."
Thanks for the automated review. Greptile, DCO, semantic title, and the agentic CI gate are now passing. The only failing check is the linked-issue gate because #411 does not yet have the maintainer-added …
Summary
Fixes #411 by surfacing an actionable message when a parse or recipe failure follows a model response that ended because of max_tokens.
Changes
Validation
Attention Areas
Fixes #411