feat(rime): add WebSocket streaming TTS support #1497

Merged: toubatbrian merged 3 commits into main from coolly-plover-notional on May 14, 2026

@rosetta-livekit-bot rosetta-livekit-bot Bot commented May 13, 2026

Summary

Adds opt-in WebSocket streaming to the Rime TTS plugin via a new use_websocket=True constructor argument. The existing HTTP synthesize path is unchanged and remains the default. When enabled, the plugin sets streaming=True and aligned_transcript=True during construction, opens a long-lived pooled WebSocket to Rime's /ws3 endpoint, and emits word-level timestamps via push_timed_transcript.

New constructor arguments

  • use_websocket: bool = False — opt into the streaming path. Off by default so existing consumers see no behavior change.
  • ws_base_url: str = "wss://users-ws.rime.ai" — overridable for self-hosted deployments, parallel to the existing base_url.
  • segment: NotGivenOr[str] = NOT_GIVEN — passed to Rime as a connect-time query param. Defaults to "bySentence" (server-side sentence buffering, mirrors StreamAdapter semantics). Pass "immediate" if the consumer is already feeding sentence-tokenized text and wants to skip server-side buffering.
  • tokenizer: NotGivenOr[tokenize.SentenceTokenizer] = NOT_GIVEN — overridable client-side sentence tokenizer. Defaults to tokenize.blingfire.SentenceTokenizer(). Mirrors the hook Cartesia exposes.
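
As a rough illustration of how the `segment` default might resolve and reach the connect-time query string, here is a minimal sketch. The `_NotGiven` sentinel is a local stand-in, and `resolve_segment` and `build_ws_url` are hypothetical names, not the plugin's API. Only the `segment` parameter, the `/ws3` path, and the default base URL come from the description above.

```python
from urllib.parse import urlencode


class _NotGiven:
    """Local stand-in for the NOT_GIVEN sentinel; not the plugin's own type."""

    def __repr__(self) -> str:
        return "NOT_GIVEN"


NOT_GIVEN = _NotGiven()


def resolve_segment(segment=NOT_GIVEN) -> str:
    # Fall back to server-side sentence buffering when the caller passes nothing.
    return "bySentence" if isinstance(segment, _NotGiven) else segment


def build_ws_url(ws_base_url: str = "wss://users-ws.rime.ai", segment=NOT_GIVEN) -> str:
    # Only `segment` is shown here; a real connect URL would carry more parameters.
    query = urlencode({"segment": resolve_segment(segment)})
    return f"{ws_base_url}/ws3?{query}"
```

Using a sentinel instead of `None` lets the code tell "argument omitted" apart from an explicit value, which is presumably why the constructor uses `NotGivenOr`.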

Implementation

The streaming class is similar to the implementation in the Cartesia plugin: single-context JSON-envelope WebSocket, base64 PCM audio frames, weakref.WeakSet[SynthesizeStream] for cleanup, utils.ConnectionPool[aiohttp.ClientWebSocketResponse] with max_session_duration=300 and mark_refreshed_on_get=True. Word timestamps are pushed as TimedString.
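
To make the JSON-envelope handling concrete, the sketch below decodes base64 PCM chunks and collects word timings from one message at a time. The message shapes (`chunk`, `timestamps`, `done` and their field names) are assumptions for illustration and are not confirmed by this PR. `TimedWord` is a simplified stand-in for `TimedString`.

```python
import base64
import json
from dataclasses import dataclass


@dataclass
class TimedWord:
    """Illustrative stand-in for TimedString: text plus start/end in seconds."""

    text: str
    start_time: float
    end_time: float


def handle_message(raw: str, audio: bytearray, words: list) -> bool:
    """Apply one JSON envelope; return True once the stream is finished.

    Assumed message shapes (hypothetical):
      {"type": "chunk", "data": "<base64 PCM>"}
      {"type": "timestamps",
       "word_timestamps": {"words": [...], "start": [...], "end": [...]}}
      {"type": "done"}
    """
    msg = json.loads(raw)
    kind = msg.get("type")
    if kind == "chunk":
        # Audio arrives base64-encoded inside the JSON envelope.
        audio.extend(base64.b64decode(msg["data"]))
    elif kind == "timestamps":
        ts = msg["word_timestamps"]
        for text, start, end in zip(ts["words"], ts["start"], ts["end"]):
            words.append(TimedWord(text, start, end))
    elif kind == "done":
        return True
    return False
```

In the plugin itself, each collected word would be handed to `push_timed_transcript` rather than appended to a list.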

Connection lifecycle:

  • _connect_ws opens the pooled WebSocket using the URL built from current options. Connect-time errors propagate to the outer _run exception block, which classifies aiohttp.ClientResponseError (covering WSServerHandshakeError) as APIStatusError with the HTTP status code preserved.
  • _close_ws follows the graceful-shutdown pattern in the Deepgram plugin: it sends the eos operation, waits one second for the server's ack, and suppresses and logs any send or recv errors during teardown so they don't mask the original cause that evicted the connection from the pool.
  • update_options invalidates the pool when the WebSocket URL changes, computed via a before/after _ws_url() diff. This automatically handles model swaps, speaker swaps, and any per-model option that participates in the URL.
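
The `_close_ws` teardown described above could look roughly like the following sketch. `close_ws` is a hypothetical free function, and the `{"operation": "eos"}` payload shape is an assumption. The `ws` argument only needs async `send_str()`, `receive()`, and `close()` methods, which `aiohttp.ClientWebSocketResponse` provides.

```python
import asyncio
import json
import logging

logger = logging.getLogger("rime-ws-sketch")


async def close_ws(ws, ack_timeout: float = 1.0) -> None:
    """Send eos, wait briefly for the ack, and never raise during teardown."""
    try:
        # Payload shape is assumed for illustration.
        await ws.send_str(json.dumps({"operation": "eos"}))
        await asyncio.wait_for(ws.receive(), timeout=ack_timeout)
    except Exception as exc:  # includes the timeout raised by wait_for
        # Log and continue so the error that evicted the connection stays visible.
        logger.warning("error during graceful close: %s", exc)
    finally:
        await ws.close()
```

Catching `Exception` rather than `BaseException` leaves task cancellation untouched, so a cancelled teardown still propagates.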

A small _model_params(opts) helper consolidates the per-model option walking shared between the WebSocket query string and the HTTP JSON body.
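
One way to picture the shared helper, and why a URL diff is enough to decide on reconnecting, is sketched below. `Opts`, `model_params`, `ws_url`, `http_body`, and `needs_reconnect` are hypothetical names. The key names `modelId` and `speaker` and the placeholder speaker value are assumptions, not taken from the PR.

```python
import json
from dataclasses import dataclass, replace
from urllib.parse import urlencode


@dataclass(frozen=True)
class Opts:
    """Hypothetical option bag; the real plugin's fields may differ."""

    model: str = "arcana"
    speaker: str = "some_speaker"  # placeholder value
    ws_base_url: str = "wss://users-ws.rime.ai"


def model_params(opts: Opts) -> dict:
    # Single source of truth for options shared by both transports.
    return {"modelId": opts.model, "speaker": opts.speaker}


def ws_url(opts: Opts) -> str:
    # WebSocket path: the shared params become the query string.
    return f"{opts.ws_base_url}/ws3?{urlencode(model_params(opts))}"


def http_body(opts: Opts, text: str) -> str:
    # HTTP path: the same params are merged into the JSON body.
    return json.dumps({"text": text, **model_params(opts)})


def needs_reconnect(before: Opts, after: Opts) -> bool:
    # The pool only needs invalidating when the connect URL actually changes.
    return ws_url(before) != ws_url(after)
```

Because every URL-relevant option flows through one helper, comparing the URL before and after an update covers new options without extra bookkeeping.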

The streaming path routes through /ws3, which accepts every model the plugin supports (mistv2, mistv3, arcana). The older /ws2 endpoint is not wired in.

Validation

  • update_options mid-session: model swap drops the existing pooled connection and reconnects with the new URL. Verified by observing two distinct _connect_ws calls and matching audio output.
  • Error propagation: invalid API key surfaces as APIStatusError(status_code=401) with the server message preserved, rather than a generic APIConnectionError.
  • Empty-input fast-fail: tts.stream() followed by end_input() with no push_text() raises APIError immediately at the protocol layer rather than hanging on the receive timeout.
  • Pool reuse: streams created within the max_session_duration window share the same WebSocket — no new handshake.
  • HTTP path unchanged: with use_websocket=False (default), synthesize() behavior is identical to before; _run payload assembly continues to use the same _model_params helper plus HTTP-only fields (samplingRate, reduceLatency for mistv2).
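
The error-propagation check above depends on classifying handshake failures by status code. A minimal sketch of that classification follows. The exception classes are local stand-ins for the framework's types, and `HandshakeError` mimics only the `.status` and `.message` attributes of `aiohttp.ClientResponseError`. `classify` is a hypothetical helper.

```python
class APIError(Exception):
    """Local stand-in for the framework's base API error."""


class APIConnectionError(APIError):
    """Local stand-in for a generic connection failure."""


class APIStatusError(APIError):
    """Local stand-in for an error that preserves the HTTP status code."""

    def __init__(self, message: str, status_code: int) -> None:
        super().__init__(message)
        self.status_code = status_code


class HandshakeError(Exception):
    """Stand-in for aiohttp.ClientResponseError (.status and .message only)."""

    def __init__(self, status: int, message: str) -> None:
        super().__init__(message)
        self.status = status
        self.message = message


def classify(exc: Exception) -> APIError:
    # Handshake failures keep their status code and server message.
    if isinstance(exc, HandshakeError):
        return APIStatusError(exc.message, status_code=exc.status)
    # Anything else is reported as a generic connection problem.
    return APIConnectionError(str(exc))
```

With this split, a caller can branch on `status_code == 401` to detect a bad API key instead of treating it as a retryable connection failure.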

changeset-bot Bot commented May 13, 2026

🦋 Changeset detected

Latest commit: 06e7416

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 31 packages
Name Type
@livekit/agents-plugin-rime Patch
@livekit/agents Patch
@livekit/agents-plugin-anam Patch
@livekit/agents-plugin-assemblyai Patch
@livekit/agents-plugin-baseten Patch
@livekit/agents-plugin-bey Patch
@livekit/agents-plugin-cartesia Patch
@livekit/agents-plugin-cerebras Patch
@livekit/agents-plugin-deepgram Patch
@livekit/agents-plugin-elevenlabs Patch
@livekit/agents-plugin-fishaudio Patch
@livekit/agents-plugin-google Patch
@livekit/agents-plugin-hedra Patch
@livekit/agents-plugin-hume Patch
@livekit/agents-plugin-inworld Patch
@livekit/agents-plugin-lemonslice Patch
@livekit/agents-plugin-liveavatar Patch
@livekit/agents-plugin-livekit Patch
@livekit/agents-plugin-minimax Patch
@livekit/agents-plugin-mistral Patch
@livekit/agents-plugin-mistralai Patch
@livekit/agents-plugin-neuphonic Patch
@livekit/agents-plugin-openai Patch
@livekit/agents-plugin-phonic Patch
@livekit/agents-plugin-resemble Patch
@livekit/agents-plugin-runway Patch
@livekit/agents-plugin-sarvam Patch
@livekit/agents-plugin-silero Patch
@livekit/agents-plugin-trugen Patch
@livekit/agents-plugin-xai Patch
@livekit/agents-plugins-test Patch

@toubatbrian toubatbrian merged commit 4354df8 into main May 14, 2026
8 of 9 checks passed
@toubatbrian toubatbrian deleted the coolly-plover-notional branch May 14, 2026 23:44
@github-actions github-actions Bot mentioned this pull request May 14, 2026