LiteLlm: Azure OpenAI file_id with assistant- prefix bypasses Responses-API content block #5664

@yparwani

Description

🔴 Required Information

Describe the Bug:

google.adk.models.lite_llm._looks_like_openai_file_id recognizes only the file- prefix as a valid OpenAI/Azure file id. Azure OpenAI's Files API actually returns ids prefixed with assistant- for purpose="assistants" uploads, and those ids are legitimately accepted by the Azure Responses API in the {"type":"file","file":{"file_id":"..."}} content block.

Because the prefix gate rejects assistant-..., ADK's content builder skips the Responses-API branch in _get_content (lines ~1095-1104 of models/lite_llm.py) and falls through to a later branch that emits a non-conforming content part of shape {"type":"file","file_uri":"assistant-...","mime_type":"application/pdf"}. Azure ignores this shape.

The same gate is referenced from _is_unsupported_file_uri and the redact helper (lines ~302 and ~322), so the impact is consistent: any caller that legitimately uploads to Azure OpenAI Files with purpose="assistants" (the standard purpose for PDF attachments to the Responses API) cannot send those files through ADK's LiteLlm.

Steps to Reproduce:

  1. pip install google-adk litellm google-genai
  2. Upload a PDF to Azure OpenAI Files with purpose="assistants" via LiteLLM (litellm.acreate_file(file=..., purpose="assistants", custom_llm_provider="azure", api_key=..., api_base=...)). The returned id begins with assistant- (e.g. assistant-934058239058234095834).
  3. Build a genai_types.Part(file_data=FileData(file_uri="assistant-...", mime_type="application/pdf")) and feed it into an LlmRequest whose model is azure/<deployment>.
  4. Invoke LiteLlm and inspect the OpenAI-compatible message body LiteLLM sends to Azure (e.g. litellm.set_verbose = True or capture the messages list via a before_model_callback).

Expected Behavior:

The file content block matches the Azure Responses API contract:

{"type": "file", "file": {"file_id": "assistant-8b6ytv2iLxqCWtaMcxHSms"}}

i.e. assistant- ids are treated equivalently to file- ids when provider in {"openai", "azure"}.

Observed Behavior:

The content block is emitted as:

{"type": "file", "file_uri": "assistant-8b6ytv2iLxqCWtaMcxHSms", "mime_type": "application/pdf"}

file_uri and mime_type are not documented keys on a file-typed OpenAI/Azure content part. Azure rejects or ignores this attachment shape, so the model never receives the file.
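
When inspecting the captured messages list from step 4, a small check along these lines can flag the non-conforming shape. This is only a sketch: is_file_id_part and find_nonconforming_file_parts are hypothetical helper names (not ADK or LiteLLM API), and it assumes each message is a plain dict whose content is either a string or a list of part dicts.

```python
from typing import Any


def is_file_id_part(part: dict[str, Any]) -> bool:
    """True when part has the {"type": "file", "file": {"file_id": ...}} shape."""
    file_obj = part.get("file")
    return (
        part.get("type") == "file"
        and isinstance(file_obj, dict)
        and isinstance(file_obj.get("file_id"), str)
    )


def find_nonconforming_file_parts(
    messages: list[dict[str, Any]],
) -> list[dict[str, Any]]:
    """Collects file-typed content parts that lack a nested file_id."""
    bad = []
    for message in messages:
        content = message.get("content")
        if not isinstance(content, list):
            continue  # plain-string content carries no file parts
        for part in content:
            if (
                isinstance(part, dict)
                and part.get("type") == "file"
                and not is_file_id_part(part)
            ):
                bad.append(part)
    return bad
```

With the two payloads quoted above, the expected block passes is_file_id_part and the observed block is returned by find_nonconforming_file_parts.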

Environment Details:

  • ADK Library Version (pip show google-adk): 1.32.0
  • Desktop OS: macOS (also reproduces on Linux)
  • Python Version (python -V): 3.12

Model Information:

  • Are you using LiteLLM: Yes
  • Which model is being used: azure/gpt-5.4-mini (Azure OpenAI Responses API)

🟡 Optional Information

Regression:

N/A — _looks_like_openai_file_id has only ever accepted the file- prefix in releases I've inspected. This is a missing-case bug rather than a regression.

Logs:

N/A

Relevant ADK source (google/adk/models/lite_llm.py, ADK 1.32.0):

# line 282
def _looks_like_openai_file_id(file_uri: str) -> bool:
    """Returns True when file_uri resembles an OpenAI/Azure file id."""
    return file_uri.startswith("file-")

# line 1095
elif part.file_data and part.file_data.file_uri:
    if (
        provider in _FILE_ID_REQUIRED_PROVIDERS
        and _looks_like_openai_file_id(part.file_data.file_uri)
    ):
        content_objects.append({
            "type": "file",
            "file": {"file_id": part.file_data.file_uri},
        })
        continue
    # ... falls through to a non-Responses-API shape for assistant- ids

Screenshots / Video:

N/A.

Additional Context:

  • Azure OpenAI Files API behavior: uploads issued with purpose="assistants" (the only purpose value LiteLLM 1.83.x's acreate_file accepts that maps to a PDF attachment for the Responses API) return ids prefixed with assistant-. These ids are accepted by the Responses API {"type":"file","file":{"file_id":...}} content block.
  • Suggested fix: broaden the prefix check to file_uri.startswith(("file-", "assistant-")). All three internal call sites use the bare module-level name, so a one-line fix in _looks_like_openai_file_id propagates to _is_unsupported_file_uri, the redact helper, and the _get_content Responses-API branch.
  • Related observation: even if a caller wanted to inject the file id explicitly via extra_body, LlmRequest.config (a google.genai.types.GenerateContentConfig) does not expose an extra_body field, so the only correct path today goes through Part.file_data.file_uri → ADK's content builder. That makes the prefix gate the only thing standing between the caller and a correct payload.
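
The suggested fix can be sketched as a stand-alone model of the gate and the branch it guards. This is not the ADK source: build_file_part is a hypothetical simplification of the _get_content branch, the provider set is assumed to be {"openai", "azure"} as stated under Expected Behavior, and the fall-through shape is the one reported under Observed Behavior.

```python
# Hypothetical stand-alone model of the prefix gate; not the ADK source itself.
_FILE_ID_PREFIXES = ("file-", "assistant-")  # proposed: accept both prefixes
_FILE_ID_REQUIRED_PROVIDERS = frozenset({"openai", "azure"})  # assumed contents


def looks_like_openai_file_id(file_uri: str) -> bool:
    """Returns True when file_uri resembles an OpenAI/Azure file id."""
    # str.startswith accepts a tuple and matches any of its entries.
    return file_uri.startswith(_FILE_ID_PREFIXES)


def build_file_part(provider: str, file_uri: str, mime_type: str) -> dict:
    """Simplified model of the file_data branch in the content builder."""
    if provider in _FILE_ID_REQUIRED_PROVIDERS and looks_like_openai_file_id(
        file_uri
    ):
        # Responses-API shape: the id is nested under "file".
        return {"type": "file", "file": {"file_id": file_uri}}
    # Fall-through shape reported in this issue for unrecognized ids.
    return {"type": "file", "file_uri": file_uri, "mime_type": mime_type}
```

Under this model, an assistant- id sent to the azure provider takes the file_id branch, while other URIs and other providers still fall through as before.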

Minimal Reproduction Code:

import asyncio
from google.adk.models import lite_llm as adk_lite_llm
from google.genai import types as genai_types

parts = [
    genai_types.Part(
        file_data=genai_types.FileData(
            file_uri="assistant-3402983204598",  # real shape from Azure OpenAI Files
            mime_type="application/pdf",
        )
    ),
]

content = asyncio.run(
    adk_lite_llm._get_content(parts, provider="azure", model="azure/gpt-5.4-mini")
)
print(content)
# Expected: [{"type": "file", "file": {"file_id": "assistant-..."}}]
# Actual:   [{"type": "file", "file_uri": "assistant-...", "mime_type": "application/pdf"}]

How often has this issue occurred?:

  • Always (100%) — deterministic for any Azure OpenAI Files upload with purpose="assistants".

Labels

models[Component] Issues related to model support
