Description
The fix in open-telemetry/opentelemetry-python-contrib#4184 added __getattr__ to StreamWrapper to proxy unknown attributes to self.stream. However, accessing .headers still fails when using with_raw_response.create(stream=True) with LiteLLM (or any client that accesses raw_response.headers directly on the stream wrapper).
This is a follow-up to open-telemetry/opentelemetry-python-contrib#4032 and open-telemetry/opentelemetry-python-contrib#4113.
Note: Originally filed as open-telemetry/opentelemetry-python-contrib#4606 — moving here as the openai-v2 instrumentation now lives in this repo.
Root Cause
__getattr__ proxies to self.stream, which is an AsyncStream (the result of calling result.parse()). However, AsyncStream does not have a .headers attribute — headers live on the original LegacyAPIResponse that was discarded before wrapping.
The call chain in patch.py:
result = await wrapped(*args, **kwargs) # LegacyAPIResponse (has .headers)
parsed_result = result.parse() # AsyncStream (no .headers)
return StreamWrapper(parsed_result, ...) # self.stream = AsyncStream
When a caller then does:
headers = dict(raw_response.headers) # StreamWrapper.__getattr__("headers")
# → getattr(AsyncStream, "headers")
# → AttributeError ← still crashes!
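The failure mode can be reproduced without openai or the instrumentation installed. The following is a minimal sketch using hypothetical stand-in classes (FakeLegacyAPIResponse, FakeAsyncStream, and ProxyingWrapper are illustrative names, not the real types); it only assumes that the wrapper forwards unknown attributes to self.stream, as described above.

```python
class FakeAsyncStream:
    """Hypothetical stand-in for AsyncStream: deliberately has no .headers."""


class FakeLegacyAPIResponse:
    """Hypothetical stand-in for LegacyAPIResponse: owns the HTTP headers."""

    def __init__(self, headers, stream):
        self.headers = headers
        self._stream = stream

    def parse(self):
        # Returns the parsed stream; the headers stay behind on this object.
        return self._stream


class ProxyingWrapper:
    """Mimics a wrapper whose __getattr__ forwards unknown names to self.stream."""

    def __init__(self, stream):
        self.stream = stream

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails on the wrapper itself.
        return getattr(self.stream, name)


def lookup_headers(wrapper):
    """Return the headers as a dict, or a description of the failure."""
    try:
        return dict(wrapper.headers)
    except AttributeError as exc:
        return f"AttributeError: {exc}"


raw = FakeLegacyAPIResponse({"x-request-id": "abc"}, FakeAsyncStream())
wrapper = ProxyingWrapper(raw.parse())  # the raw response is dropped here
outcome = lookup_headers(wrapper)
print(outcome)
```

In this sketch the lookup falls through to the stream stand-in, which has no headers attribute, so proxying alone cannot recover them.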
Steps to Reproduce
import asyncio
from unittest.mock import AsyncMock, MagicMock, patch

import httpx
from openai import AsyncAzureOpenAI
from opentelemetry.instrumentation.openai_v2 import OpenAIInstrumentor
from opentelemetry.sdk.trace import TracerProvider

SSE_CHUNKS = [
    b'data: {"id":"chatcmpl-123","object":"chat.completion.chunk","model":"gpt-4o","choices":[{"index":0,"delta":{"role":"assistant","content":"Hello"},"finish_reason":null}]}\n\n',
    b'data: {"id":"chatcmpl-123","object":"chat.completion.chunk","model":"gpt-4o","choices":[{"index":0,"delta":{"content":"!"},"finish_reason":"stop"}]}\n\n',
    b"data: [DONE]\n\n",
]

def make_mock_response():
    async def aiter_bytes(_=None):
        for chunk in SSE_CHUNKS:
            yield chunk

    mock_request = MagicMock(spec=httpx.Request)
    mock_request.headers = httpx.Headers({"X-Stainless-Raw-Response": "true"})
    mock_response = MagicMock(spec=httpx.Response)
    mock_response.status_code = 200
    mock_response.headers = httpx.Headers({"content-type": "text/event-stream", "x-request-id": "abc"})
    mock_response.aiter_bytes = aiter_bytes
    mock_response.aclose = AsyncMock()
    mock_response.request = mock_request
    mock_response.http_version = "HTTP/1.1"
    mock_response.elapsed = MagicMock()
    return mock_response

# Verbatim copy of LiteLLM's make_azure_openai_chat_completion_request
async def make_azure_openai_chat_completion_request(azure_client, data, timeout):
    raw_response = await azure_client.chat.completions.with_raw_response.create(
        **data, timeout=timeout
    )
    headers = dict(raw_response.headers)  # <-- crashes here
    response = raw_response.parse()
    return headers, response

async def reproducer():
    OpenAIInstrumentor().instrument(tracer_provider=TracerProvider())
    client = AsyncAzureOpenAI(
        api_key="test", azure_endpoint="https://test.openai.azure.com", api_version="2024-02-15-preview"
    )
    with patch.object(client._client, "send", new_callable=AsyncMock, return_value=make_mock_response()):
        headers, response = await make_azure_openai_chat_completion_request(
            client, {"model": "gpt-4o", "messages": [{"role": "user", "content": "hi"}], "stream": True}, 60.0
        )
    print("headers:", headers)

asyncio.run(reproducer())
Actual Behavior
AttributeError: 'StreamWrapper' object has no attribute 'headers'
Observed in production when OpenAI v2 instrumentation was active and LiteLLM's Azure provider called with_raw_response.create(stream=True).
Expected Behavior
raw_response.headers should return the HTTP response headers from the original LegacyAPIResponse, same as when uninstrumented.
raw_response.parse() should return the StreamWrapper itself for iteration.
Suggested Fix
Capture LegacyAPIResponse.headers before calling .parse(), and store them directly on the wrapper:
# Before result.parse():
raw_headers = getattr(result, "headers", None)
parsed_result = result.parse()
if is_streaming(kwargs):
    return ChatStreamWrapper(parsed_result, ..., raw_headers=raw_headers)

# In ChatStreamWrapper / _ChatStreamMixin:
self._self_raw_headers = raw_headers

@property
def headers(self):
    return self._self_raw_headers

def parse(self):
    return self
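As a rough, self-contained illustration of the suggested approach, the sketch below uses hypothetical stand-ins (FakeAsyncStream, FakeLegacyAPIResponse, HeaderAwareWrapper) instead of the real openai and instrumentation classes. The constructor signature and attribute names are assumptions for illustration, not the actual ChatStreamWrapper API.

```python
import asyncio


class FakeAsyncStream:
    """Hypothetical stand-in for AsyncStream: async-iterable, no .headers."""

    def __init__(self, chunks):
        self._chunks = list(chunks)

    async def _iterate(self):
        for chunk in self._chunks:
            yield chunk

    def __aiter__(self):
        return self._iterate()


class FakeLegacyAPIResponse:
    """Hypothetical stand-in for LegacyAPIResponse: owns the HTTP headers."""

    def __init__(self, headers, stream):
        self.headers = headers
        self._stream = stream

    def parse(self):
        return self._stream


class HeaderAwareWrapper:
    """Illustrative wrapper that keeps the headers captured before parse()."""

    def __init__(self, stream, raw_headers=None):
        self.stream = stream
        self._raw_headers = raw_headers

    @property
    def headers(self):
        # Found by normal lookup, so __getattr__ is never consulted for it.
        return self._raw_headers

    def parse(self):
        # Callers that expect a raw response can still call .parse().
        return self

    def __getattr__(self, name):
        # Everything else is still proxied to the underlying stream.
        return getattr(self.stream, name)

    def __aiter__(self):
        return self.stream.__aiter__()


async def demo():
    raw = FakeLegacyAPIResponse({"x-request-id": "abc"}, FakeAsyncStream(["Hello", "!"]))
    raw_headers = getattr(raw, "headers", None)  # capture before parse()
    wrapper = HeaderAwareWrapper(raw.parse(), raw_headers=raw_headers)
    headers = dict(wrapper.headers)
    chunks = [chunk async for chunk in wrapper.parse()]
    return headers, chunks


headers, chunks = asyncio.run(demo())
print(headers, chunks)
```

Storing the headers on the wrapper keeps the caller-facing contract of with_raw_response (read .headers, then call .parse()) while the wrapper remains the object that is iterated, so the instrumentation still sees every chunk.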
Environment
opentelemetry-instrumentation-openai-v2: latest main
openai: 1.82.0
litellm: 1.x (Azure provider)
Python: 3.11