fix(openai): skip realtime truncate when no audio was played #6249
C1-BA-B1-F3 wants to merge 1 commit into
When a RealtimeModel is interrupted before any audio frame has been committed (audio_end_ms == 0), sending a conversation.item.truncate causes the OpenAI Realtime API to reject with: 'Only model output audio messages can be truncated'

Fix: when audio_end_ms is 0, check whether the server already holds the item. If so, delete it (so it doesn't dangle in the remote chat ctx); otherwise do nothing.

Fixes livekit#6157

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
else:
    # No audio was committed yet: truncating with audio_end_ms=0
    # causes the Realtime API to reject with "Only model output
    # audio messages can be truncated". If the server already holds
    # the (unplayed) item, delete it so it doesn't dangle.
    remote_ids = {
        item.id for item in self._remote_chat_ctx.to_chat_ctx().items
    }
    if message_id in remote_ids:
        self.send_event(
            ConversationItemDeleteEvent(
                type="conversation.item.delete",
                item_id=message_id,
                event_id=utils.shortuuid("chat_ctx_delete_"),
            )
        )
🚩 Potential stale remote context if update_chat_ctx races with the delete
After truncate() fires a delete event (lines 1636-1642), the item remains in _remote_chat_ctx until the server confirms with conversation.item.deleted. If update_chat_ctx (agent_activity.py:3654) runs before that confirmation arrives, its diff computation will still see the item in the remote context. It therefore won't try to recreate the item, even though the server has already deleted it, which leaves the remote and local contexts out of sync. In practice this is unlikely to cause user-visible issues because: (1) the update_chat_ctx call at line 3652 requires any_skipped to be true, which is a different condition from the audio_end_ms == 0 scenario; and (2) even if it does occur, the next full update_chat_ctx call would reconcile the difference. Still, this is a potential timing concern worth being aware of.
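The timing window described in this comment can be reproduced with a small, self-contained model. The sketch below is not the plugin's code: `RemoteMirror`, `request_delete`, `on_deleted_ack`, and `ids_missing_on_server` are hypothetical names, and the diff is reduced to a set difference. It assumes, as the comment states, that the client-side mirror of the server context drops an item only when the conversation.item.deleted acknowledgement arrives.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class RemoteMirror:
    """Hypothetical client-side mirror of the server's conversation items."""

    item_ids: set[str] = field(default_factory=set)
    sent_events: list[dict] = field(default_factory=list)

    def request_delete(self, item_id: str) -> None:
        # Fire-and-forget: the mirror is NOT updated here, only on the ack.
        self.sent_events.append(
            {"type": "conversation.item.delete", "item_id": item_id}
        )

    def on_deleted_ack(self, item_id: str) -> None:
        # Models the server confirming with conversation.item.deleted.
        self.item_ids.discard(item_id)


def ids_missing_on_server(local_ids: set[str], mirror: RemoteMirror) -> set[str]:
    """Simplified diff: local items that the mirror believes the server lacks."""
    return local_ids - mirror.item_ids


mirror = RemoteMirror(item_ids={"msg_1", "msg_2"})
local_ids = {"msg_1", "msg_2"}

mirror.request_delete("msg_2")
# Before the ack, the diff still trusts the mirror and plans no re-creation.
before_ack = ids_missing_on_server(local_ids, mirror)

mirror.on_deleted_ack("msg_2")
# After the ack, the mirror is accurate again and the gap becomes visible.
after_ack = ids_missing_on_server(local_ids, mirror)
```

In this model `before_ack` is empty while `after_ack` contains the deleted item, which matches the comment's point that a later sync, run once the acknowledgement has been processed, would notice and reconcile the difference.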
Problem
When a RealtimeModel is interrupted in the narrow window after the model has declared an audio response but before the first audio frame is played, the plugin sends a conversation.item.truncate with audio_end_ms=0. The OpenAI Realtime API rejects it with the error "Only model output audio messages can be truncated".

Event sequence triggering the bug:

1. response.created
2. response.output_item.added → message_id assigned
3. response.content_part.added → modalities resolve to ["audio", "text"]
4. response.audio.delta has NOT happened yet
5. truncate(..., audio_end_ms=0) → server error

Fix

In RealtimeSession.truncate(), when audio_end_ms == 0:

- If the server already holds the item, send conversation.item.delete so it doesn't dangle in the remote chat context.
- Otherwise, do nothing.

When audio_end_ms > 0, behavior is unchanged (truncate event sent as before).

Tests

Three new tests in test_openai_realtime_chat_ctx.py:

- test_truncate_deletes_item_when_no_audio_played: item on server → delete event
- test_truncate_noop_when_no_audio_and_item_not_on_server: item not on server → no-op
- test_truncate_sends_event_when_audio_played: audio_end_ms > 0 → truncate event (regression check)

Fixes #6157
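The three cases listed under Tests can be sketched without the plugin or a network connection. Everything below is a stand-in: `FakeSession`, its `remote_ids` attribute, and the dict-shaped events are assumptions made for illustration and do not reflect the real RealtimeSession or the actual contents of test_openai_realtime_chat_ctx.py. Only the branching follows the behavior described in this PR.

```python
from __future__ import annotations


class FakeSession:
    """Hypothetical stand-in that records outgoing events instead of sending them."""

    def __init__(self, remote_ids: set[str]) -> None:
        self.remote_ids = remote_ids  # items the server is assumed to hold
        self.sent: list[dict] = []

    def truncate(self, *, message_id: str, audio_end_ms: int) -> None:
        if audio_end_ms > 0:
            # Audio was played: send the truncate event as before.
            self.sent.append(
                {
                    "type": "conversation.item.truncate",
                    "item_id": message_id,
                    "content_index": 0,
                    "audio_end_ms": audio_end_ms,
                }
            )
        elif message_id in self.remote_ids:
            # No audio played, but the server holds the item: delete it.
            self.sent.append(
                {"type": "conversation.item.delete", "item_id": message_id}
            )
        # Otherwise the server never saw the item, so nothing is sent.


def test_truncate_deletes_item_when_no_audio_played() -> None:
    session = FakeSession(remote_ids={"msg_1"})
    session.truncate(message_id="msg_1", audio_end_ms=0)
    assert [e["type"] for e in session.sent] == ["conversation.item.delete"]


def test_truncate_noop_when_no_audio_and_item_not_on_server() -> None:
    session = FakeSession(remote_ids=set())
    session.truncate(message_id="msg_1", audio_end_ms=0)
    assert session.sent == []


def test_truncate_sends_event_when_audio_played() -> None:
    session = FakeSession(remote_ids={"msg_1"})
    session.truncate(message_id="msg_1", audio_end_ms=250)
    assert session.sent[0]["type"] == "conversation.item.truncate"
    assert session.sent[0]["audio_end_ms"] == 250
```

Recording events in a list keeps each test to a single assertion on what would have gone over the wire, which is the property all three cases check.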