Add WebSocket connection timing and reuse metrics #5145

Open
theomonnom wants to merge 5 commits into main from claude/slack-add-websocket-connection-time-9PfER

Conversation

@theomonnom
Member

Summary

This PR adds comprehensive WebSocket connection timing and reuse metrics across the LiveKit agents framework and plugins. It introduces a new ConnectionResult dataclass to track connection acquisition time and whether connections were reused from a pool, and propagates this information through the metrics system.

Key Changes

Core Framework

  • ConnectionPool Enhancement: Added ConnectionResult dataclass to encapsulate connection objects with timing metadata (connect_time and from_pool flag)
  • New Methods:
    • get_with_timing(): Returns ConnectionResult instead of just the connection
    • connection_with_timing(): Context manager variant that yields ConnectionResult
    • Existing get() and connection() methods now delegate to the timing variants for consistency
  • Metrics Classes: Extended STTMetrics, TTSMetrics, and RealtimeModelMetrics with two new optional fields:
    • websocket_connection_time: Time in seconds to establish/acquire the connection
    • websocket_connection_reused: Boolean indicating if connection was reused from pool
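
The pool changes described above can be illustrated with a minimal sketch. It is not the PR's implementation: `TinyPool`, `put()`, the `conn` field name, and `_fake_connect()` are hypothetical stand-ins, and only `ConnectionResult`, `connect_time`, `from_pool`, `get_with_timing()`, and `get()` are names taken from the description.

```python
import asyncio
import time
from dataclasses import dataclass
from typing import Awaitable, Callable, Generic, List, Tuple, TypeVar

T = TypeVar("T")


@dataclass
class ConnectionResult(Generic[T]):
    conn: T              # field name assumed for illustration
    connect_time: float  # seconds spent acquiring the connection
    from_pool: bool      # True when an idle pooled connection was reused


class TinyPool(Generic[T]):
    """Hypothetical stand-in for ConnectionPool, for illustration only."""

    def __init__(self, connect_cb: Callable[[], Awaitable[T]]) -> None:
        self._connect_cb = connect_cb
        self._available: List[T] = []

    async def get_with_timing(self) -> ConnectionResult[T]:
        start = time.perf_counter()
        if self._available:
            # Reuse path: the measured time is the pool acquisition time.
            conn = self._available.pop()
            return ConnectionResult(conn, time.perf_counter() - start, True)
        # New connection path: the measured time covers the full connect call.
        conn = await self._connect_cb()
        return ConnectionResult(conn, time.perf_counter() - start, False)

    async def get(self) -> T:
        # The plain getter delegates to the timing variant and drops the metadata.
        return (await self.get_with_timing()).conn

    def put(self, conn: T) -> None:
        self._available.append(conn)


async def _fake_connect() -> str:
    await asyncio.sleep(0.01)  # simulate a WebSocket handshake
    return "ws-conn"


async def demo() -> Tuple[ConnectionResult[str], ConnectionResult[str]]:
    pool: TinyPool[str] = TinyPool(_fake_connect)
    first = await pool.get_with_timing()   # opens a new connection
    pool.put(first.conn)                   # hand it back to the pool
    second = await pool.get_with_timing()  # reuses the pooled connection
    return first, second


first, second = asyncio.run(demo())
print(first.from_pool, second.from_pool)
```

Keeping `get()` as a thin wrapper means existing callers see no behavior change, while new callers can opt in to the timing metadata.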

Plugin Implementations

  • ElevenLabs TTS: Updated to use current_connection(), which returns a tuple that includes a reuse flag; tracks connection timing in _Connection.connect()
  • OpenAI Realtime: Added connection timing tracking in _create_ws_conn() with debug logging
  • Google Realtime: Added connection timing in _main_task() when establishing Gemini API connection
  • Cartesia TTS: Migrated from connection() to connection_with_timing() to capture metrics
  • Deepgram TTS: Migrated to connection_with_timing() for metrics capture
  • Deepgram STT: Added explicit timing measurement in _connect_ws() with debug logging
  • Google STT: Migrated to connection_with_timing() for metrics capture

Base Classes

  • STTProcessor and TTSProcessor: Added _ws_connection_time and _ws_connection_reused attributes; propagate these to metrics in _metrics_monitor_task() and _emit_metrics()

Implementation Details

  • Connection timing uses time.perf_counter() for high-resolution measurements
  • All timing fields are optional (float | None, bool | None) to maintain backward compatibility
  • Debug logging added to track connection acquisition with context (e.g., segment_id, context_id)
  • Pool reuse detection is automatic via the ConnectionResult.from_pool flag
  • For plugins that create new connections each time (OpenAI, Google, Deepgram STT), from_pool is always False
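
For plugins that open a fresh connection each time, the measurement reduces to wrapping the connect call with `time.perf_counter()`. The sketch below assumes a hypothetical `open_ws()` coroutine in place of a real plugin handshake; `connect_with_timing()` is likewise an illustrative name, not one from the PR.

```python
import asyncio
import time
from typing import Tuple


async def open_ws() -> str:
    """Hypothetical stand-in for a plugin's WebSocket handshake."""
    await asyncio.sleep(0.01)
    return "ws"


async def connect_with_timing() -> Tuple[str, float, bool]:
    start = time.perf_counter()
    ws = await open_ws()
    elapsed = time.perf_counter() - start
    # No pool is involved, so the connection is never reported as reused.
    return ws, elapsed, False


ws, elapsed, reused = asyncio.run(connect_with_timing())
print(f"connected in {elapsed:.3f}s (reused={reused})")
```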

https://claude.ai/code/session_017ngdECrv92KhXdPjiTdiZc

Add connection timing metrics to STT, TTS, and RealtimeModel to help
debug slow transitions between tasks. This change adds:

- ConnectionResult class to return timing metadata from ConnectionPool
- get_with_timing() and connection_with_timing() methods on ConnectionPool
- websocket_connection_time and websocket_connection_reused fields to:
  - STTMetrics
  - TTSMetrics
  - RealtimeModelMetrics

The new fields distinguish between initial connection establishment time
(for new connections) and pool acquisition time (for reused connections).

Updated plugins:
- OpenAI Realtime: tracks WebSocket connection time
- Google Realtime: tracks WebSocket connection time
- Deepgram STT: tracks WebSocket connection time
- Deepgram TTS: tracks connection pool timing
- Cartesia TTS: tracks connection pool timing
- ElevenLabs TTS: tracks connection reuse
- Google STT: tracks connection pool timing

https://claude.ai/code/session_017ngdECrv92KhXdPjiTdiZc
@chenghao-mou chenghao-mou requested a review from a team March 18, 2026 20:03
Contributor

@devin-ai-integration devin-ai-integration bot left a comment

Devin Review found 1 potential issue.

Comment on lines +418 to +419
self._ws_connection_time = (
connection._connect_time if connection._connect_time else total_time

@devin-ai-integration devin-ai-integration bot Mar 18, 2026

🟡 ElevenLabs TTS reports stale connection time for reused connections

The ws_connection_time logic at lines 419-421 uses connection._connect_time if connection._connect_time else total_time. Since _connect_time is set during connect() for ALL connections (at tts.py:607), it is always non-None for both new and reused connections. For reused connections (is_reused=True), this reports the original WebSocket handshake time (e.g., 200 ms) instead of the near-zero pool acquisition time (total_time). The comment on lines 417-418 explicitly states the intent is to use total_time for reused connections, but the code never reaches that branch. The fix should condition on is_reused.
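
One way to condition on is_reused, as the review suggests, is sketched below. `pick_connection_time()` is a hypothetical helper written for illustration and does not appear in the PR; the sample values are made up.

```python
from typing import Optional


def pick_connection_time(
    connect_time: Optional[float], total_time: float, is_reused: bool
) -> float:
    # Reused connections report the pool acquisition time, not the
    # handshake time recorded when the connection was first opened.
    if is_reused or connect_time is None:
        return total_time
    return connect_time


# Reused connection: the stale 0.2 s handshake time is ignored.
reused_time = pick_connection_time(0.2, 0.001, is_reused=True)
# New connection: the recorded handshake time is used.
new_time = pick_connection_time(0.2, 0.21, is_reused=False)
```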

@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

claude added 3 commits March 18, 2026 20:16
Instead of adding websocket_connection_time and websocket_connection_reused
fields to STTMetrics, TTSMetrics, and RealtimeModelMetrics, set them as
span attributes (lk.ws.connection_time, lk.ws.connection_reused) on the
active OTEL span directly in each plugin.
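
A rough sketch of recording these attributes follows. To stay self-contained it uses a `FakeSpan` stand-in that exposes only the `set_attribute(key, value)` call; real plugin code would presumably obtain the active span from OpenTelemetry instead. The attribute keys come from the commit message, `ATTR_WS_CONNECTION_TIME` is named elsewhere in this thread, and `ATTR_WS_CONNECTION_REUSED` plus the sample values are assumptions.

```python
from typing import Any, Dict

ATTR_WS_CONNECTION_TIME = "lk.ws.connection_time"
ATTR_WS_CONNECTION_REUSED = "lk.ws.connection_reused"  # constant name assumed


class FakeSpan:
    """Stand-in with the same set_attribute(key, value) shape as an OTEL span."""

    def __init__(self) -> None:
        self.attributes: Dict[str, Any] = {}

    def set_attribute(self, key: str, value: Any) -> None:
        self.attributes[key] = value


# Real code would use the active span, e.g. opentelemetry.trace.get_current_span().
span = FakeSpan()
span.set_attribute(ATTR_WS_CONNECTION_TIME, 0.182)    # seconds, sample value
span.set_attribute(ATTR_WS_CONNECTION_REUSED, False)  # new connection
```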

https://claude.ai/code/session_017ngdECrv92KhXdPjiTdiZc
- Add `record_ws_connection()` helper in trace_types.py to reduce duplication
- Update log messages to clearly indicate "(new)" vs "(reused)" connections
- Remove unused `from opentelemetry import trace` imports from plugins

https://claude.ai/code/session_017ngdECrv92KhXdPjiTdiZc
…ments

- Add `status` property to ConnectionResult returning "reused" or "new"
- Remove verbose docstrings and field comments
- Simplify plugin log messages to use conn_result.status
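
The `status` property might look like the sketch below. The `conn` field name and the exact log wording are assumptions; only the "reused"/"new" return values and the `conn_result.status` usage come from the commit message.

```python
from dataclasses import dataclass


@dataclass
class ConnectionResult:
    conn: object         # field name assumed for illustration
    connect_time: float
    from_pool: bool

    @property
    def status(self) -> str:
        return "reused" if self.from_pool else "new"


conn_result = ConnectionResult(conn=object(), connect_time=0.18, from_pool=False)
# Hypothetical log line; the PR's actual message text is not shown in this thread.
message = f"acquired connection in {conn_result.connect_time:.3f}s ({conn_result.status})"
```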

https://claude.ai/code/session_017ngdECrv92KhXdPjiTdiZc
Contributor

@devin-ai-integration devin-ai-integration bot left a comment

Devin Review found 1 new potential issue.

self._tts.current_connection(), self._conn_options.timeout
)
total_time = time.perf_counter() - start_time
ws_connection_time = connection._connect_time or total_time

🟡 or operator used instead of is not None for _connect_time fallback, incorrect for 0.0

On line 416, connection._connect_time or total_time uses Python's truthiness to decide between the original WS connect time and the fallback. Since _connect_time is float | None, if _connect_time were exactly 0.0, the or would incorrectly fall through to total_time. The correct pattern is connection._connect_time if connection._connect_time is not None else total_time. While 0.0 is practically impossible for a real WS handshake, this is a known anti-pattern with numeric types.

Suggested change
- ws_connection_time = connection._connect_time or total_time
+ ws_connection_time = connection._connect_time if connection._connect_time is not None else total_time
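
The difference between the two expressions shows up only for falsy values such as 0.0. A small standalone illustration, using made-up values rather than anything from the PR:

```python
from typing import Optional

connect_time: Optional[float] = 0.0  # falsy, but still a valid measurement
total_time = 0.25

# Truthiness check: 0.0 is treated like None and the fallback is used.
with_or = connect_time or total_time

# Explicit None check: 0.0 is preserved.
with_none_check = connect_time if connect_time is not None else total_time
```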

Each plugin now sets ATTR_WS_CONNECTION_TIME on the span directly
instead of calling a helper function. No new APIs introduced.

https://claude.ai/code/session_017ngdECrv92KhXdPjiTdiZc