fix: resolve benchmark prompt assets without asserting concrete asset types by RapidPoseidon · Pull Request #625 · RapidataAI/rapidata-python-sdk

RapidPoseidon · 2026-06-18T08:52:09Z

Problem

RapidataBenchmark.identifiers (and prompts / prompt_assets / tags) crashes on any benchmark whose prompts have file assets — which blocks add_model and add_prompts, since both read self.identifiers internally.

__instantiate_prompts did:

assert isinstance(prompt.prompt_asset.actual_instance, FileAssetModel)
source_url = prompt.prompt_asset.actual_instance.metadata["sourceUrl"].actual_instance
assert isinstance(source_url, SourceUrlMetadataModel)

Neither assumption holds against the current API response:

prompt.prompt_asset is an IAssetModel oneOf wrapper; its .actual_instance is an IAssetModel* variant (IAssetModelFileAssetModel / …MultiAssetModel / …TextAssetModel / …NullAssetModel) — never the bare FileAssetModel. → AssertionError.
metadata is Dict[str, IMetadataModel] and only carries a "sourceUrl" entry for URL-registered assets. File-uploaded prompts only have originalFilename / imageDimension, so metadata["sourceUrl"] would KeyError.
The metadata variant is IMetadataModelSourceUrlMetadataModel, not the bare SourceUrlMetadataModel, so the second assert is wrong too.

Reproduced live against a real file-asset benchmark (499 prompts): benchmark.identifiers raised AssertionError before this change.

Fix

Resolve the source URL defensively instead of asserting concrete types, mirroring the existing pattern in RapidataFlowItem._extract_asset_key():

@staticmethod
def __source_url(asset):
    instance = getattr(asset, "actual_instance", None)
    metadata = getattr(instance, "metadata", None)
    if not metadata:
        return None
    source_url = metadata.get("sourceUrl")
    actual = getattr(source_url, "actual_instance", None)
    return getattr(actual, "url", None)

prompt_assets now yields the source URL when present, else None for file-uploaded assets that have no public URL in the response — the old code could never have returned a URL for those anyway; it crashed.

Verification

Live against the real benchmark (read-only, creates nothing): identifiers, prompts, prompt_assets, tags all populate (499 each) with no exception; add_model's identifier-membership check passes.
uv run black src/rapidata/rapidata_client → unchanged.
uv run pyright src/rapidata/rapidata_client/benchmark/rapidata_benchmark.py → 0 errors, 0 warnings.

Notes

No version bump — versioning/publishing is CI-automated (release_and_publish.yml).
No test added — the repo has no test framework yet; a regression test for this (IAssetModel* wrapper + missing sourceUrl) would be worth adding once test infra exists.
Discovered while adding participants to a mock benchmark in Image-Editing-Benchmarks (RapidataAI/Image-Editing-Benchmarks#6), which currently ships a consumer-side cache-injection workaround that can be removed once this releases.

🔗 Session: ${POSEIDON_SESSION_URL:-$HOSTNAME}

🤖 Generated with Claude Code

… types RapidataBenchmark.__instantiate_prompts asserted that each prompt's prompt_asset.actual_instance was a FileAssetModel and then indexed metadata["sourceUrl"]. Neither holds for the current API response: the prompts endpoint returns an IAssetModel oneOf wrapper whose actual_instance is an IAssetModel* variant (file/multi/text/null), and only URL-registered assets carry a "sourceUrl" metadata entry. File-uploaded prompts only have originalFilename/imageDimension, so the assert raised AssertionError (and the index would otherwise KeyError) on any benchmark with file assets. Because identifiers/prompts/prompt_assets/tags all lazily call this method, and add_model/add_prompts read self.identifiers, the crash blocked the whole participant-adding flow. Resolve the source URL defensively (mirroring RapidataFlowItem._extract_asset_key): navigate actual_instance -> metadata -> "sourceUrl" with getattr/.get and fall back to None when no URL is present, rather than asserting concrete types. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-Authored-By: karl <karl@rapidata.ai>

claude · 2026-06-18T08:54:07Z

Code Review

Overview

This PR fixes a real crash in RapidataBenchmark.__instantiate_prompts when processing benchmarks whose prompts have file assets. The old code made two incorrect assumptions about the auto-generated API client's response structure — asserting bare FileAssetModel/SourceUrlMetadataModel types that are actually wrapped in IAssetModel*/IMetadataModel* oneOf variants — causing an AssertionError on every file-asset benchmark and blocking add_model/add_prompts as a result. The fix is targeted and correct.

Correctness

The root cause is accurately diagnosed: the OpenAPI client wraps concrete types in oneOf holder objects (IAssetModel), so the old assert isinstance(..., FileAssetModel) could never pass.
Passing None to __source_url is handled gracefully — getattr(None, "actual_instance", None) returns None, so the explicit if prompt.prompt_asset is None branch is correctly collapsed into the single call.
The metadata.get("sourceUrl") call is safe after the if not metadata guard.
Returning None for file-uploaded assets (which have no sourceUrl in the response) preserves the pre-existing semantics — the old code would have crashed before ever returning anything for those cases anyway.

Code Quality

Minor style concern — docstring length on a private method. The __source_url docstring is 6 prose lines explaining the oneOf wrapper structure. CLAUDE.md asks to keep comments to a minimum and avoid multi-line blocks unless the why is non-obvious. Here the why is genuinely non-obvious (a subtle API contract), so some comment is warranted, but the current block is verbose for a private static helper. Consider condensing to one or two lines:

# IAssetModel is a oneOf wrapper; only URL-registered assets carry a "sourceUrl"
# metadata entry — navigate defensively rather than asserting concrete types.

That captures the essential constraint without repeating the full type hierarchy already visible in the surrounding code.

TYPE_CHECKING Import

IAssetModel is correctly placed under TYPE_CHECKING, consistent with the project convention from CLAUDE.md.

Tooling

PR description confirms:

uv run black — no changes
uv run pyright src/rapidata/rapidata_client/benchmark/rapidata_benchmark.py — 0 errors, 0 warnings

Potential Follow-ups (not blocking)

Multi-asset / text-asset prompt types — IAssetModelMultiAssetModel and IAssetModelTextAssetModel variants presumably also lack a sourceUrl; they will silently return None here. That is likely correct behavior, but worth a note so callers of prompt_assets know a None entry can mean "file-uploaded or non-URL asset type", not just "no asset".
Test coverage — the PR correctly notes the repo has no test framework yet. A regression test covering the IAssetModel oneOf wrapper + missing sourceUrl path would be a high-value first test once infra exists.

Summary

Clean, well-scoped fix. The defensive getattr navigation mirrors the existing pattern in RapidataFlowItem._extract_asset_key and is the right approach for working with auto-generated oneOf wrappers. The only non-blocking suggestion is trimming the docstring to match the project's comment style. Approved.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: resolve benchmark prompt assets without asserting concrete asset types#625

fix: resolve benchmark prompt assets without asserting concrete asset types#625
RapidPoseidon wants to merge 1 commit into
mainfrom
fix/benchmark-prompt-asset-resolution

RapidPoseidon commented Jun 18, 2026

Uh oh!

claude Bot commented Jun 18, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

RapidPoseidon commented Jun 18, 2026

Problem

Fix

Verification

Notes

Uh oh!

claude Bot commented Jun 18, 2026

Code Review

Overview

Correctness

Code Quality

TYPE_CHECKING Import

Tooling

Potential Follow-ups (not blocking)

Summary

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant