fix(eot): tighten eot cancellation by speech acitivity by chenghao-mou · Pull Request #6274 · livekit/agents

chenghao-mou · 2026-06-29T21:19:59Z

Previously, end-of-turn cancellation could also be triggered by the inference-done event, which is flaky in the presence of background noise. This PR drops that path, so cancellation is now based only on the STT/VAD start-of-speech (SOS) signal.

devin-ai-integration

Devin Review found 1 potential issue.

devin-ai-integration · 2026-06-29T21:26:36Z

        if self._end_of_turn_task is not None:
            # TODO(theomonnom): disallow cancel if the extra sleep is done
            self._end_of_turn_task.cancel()
-
-        task_func = (
-            _bounce_eou_task_with_speaking_guard
-            if isinstance(self._turn_detector, _StreamingTurnDetector)
-            else _bounce_eou_task
-        )
        # copy the last_speaking_time before awaiting (the value can change)
        self._end_of_turn_task = asyncio.create_task(
-            task_func(
+            _bounce_eou_task(
                self._last_speaking_time,
                self._last_final_transcript_time,
                self._user_turn_start,


🚩 Wider cancellation window when user resumes speech during endpointing

The removed _bounce_eou_task_with_speaking_guard raced _user_speaking_event.wait() against the bounce task. That event was set on INFERENCE_DONE with raw_accumulated_speech > 0 (audio_recognition.py:1246), which fires ~200-250ms before START_OF_SPEECH (gated by VAD's min_speech_duration). The new code relies solely on SOS cancelling _end_of_turn_task (audio_recognition.py:1236-1237), creating a timing window where the bounce could complete before SOS fires.
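The race described above can be sketched with plain `asyncio`. This is a minimal illustration only, not the code removed by the PR: `bounce`, `bounce_with_speaking_guard`, and `demo` are hypothetical names, and the delays are arbitrary. It shows the general pattern of racing an `asyncio.Event` against a delayed task and cancelling whichever loses.

```python
import asyncio
from typing import Callable, Optional


async def bounce(delay: float, on_done: Callable[[], None]) -> None:
    # Hypothetical stand-in for the endpointing delay before committing a turn.
    await asyncio.sleep(delay)
    on_done()


async def bounce_with_speaking_guard(
    speaking: asyncio.Event, delay: float, on_done: Callable[[], None]
) -> bool:
    """Race the bounce against a 'user is speaking' event.

    Returns True if the bounce finished first, False if speech won the race.
    """
    bounce_task = asyncio.create_task(bounce(delay, on_done))
    speaking_task = asyncio.create_task(speaking.wait())
    done, pending = await asyncio.wait(
        {bounce_task, speaking_task}, return_when=asyncio.FIRST_COMPLETED
    )
    # Cancel the loser and wait for the cancellation to be processed.
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)
    return bounce_task in done


async def demo(speech_at: Optional[float], delay: float = 0.2) -> bool:
    speaking = asyncio.Event()
    committed: list = []
    if speech_at is not None:
        # Simulate an early speech signal arriving during the bounce.
        asyncio.get_running_loop().call_later(speech_at, speaking.set)
    return await bounce_with_speaking_guard(
        speaking, delay, lambda: committed.append(True)
    )


if __name__ == "__main__":
    print(asyncio.run(demo(speech_at=0.02)))  # speech arrives before the bounce ends
    print(asyncio.run(demo(speech_at=None)))  # no speech, the bounce completes
```

The trade-off the review points at is visible here: the earlier the event is set, the more bounces get cancelled, including ones triggered by noise rather than real speech.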

Concrete scenario: EOS at T=0 → bounce starts with min_delay=0.3s → user resumes at T=0.1 → INFERENCE_DONE detects speech at T=0.15 (old code cancels here) → bounce completes at T≈0.3 → SOS fires at T≈0.35 (too late in new code).
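The arithmetic of that scenario can be checked with a small sketch. The `Timeline` type and `bounce_completes` helper are hypothetical and exist only to replay the timestamps quoted in the review; they are not part of the library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Timeline:
    bounce_done: float     # when the bounce task would finish (seconds after EOS)
    inference_done: float  # first INFERENCE_DONE that reports speech
    sos: float             # START_OF_SPEECH


def bounce_completes(t: Timeline, cancel_on_inference: bool) -> bool:
    """Return True if the bounce finishes before any cancelling signal arrives."""
    if cancel_on_inference:
        # Old behavior as described in the review: either signal can cancel.
        cancel_at = min(t.inference_done, t.sos)
    else:
        # New behavior: only SOS cancels.
        cancel_at = t.sos
    return t.bounce_done <= cancel_at


# Timestamps taken from the scenario above.
scenario = Timeline(bounce_done=0.30, inference_done=0.15, sos=0.35)
old_completes = bounce_completes(scenario, cancel_on_inference=True)
new_completes = bounce_completes(scenario, cancel_on_inference=False)
print(old_completes, new_completes)
```

With these numbers the old path cancels the bounce at T=0.15, while the new path lets it finish at T=0.30, before SOS at T=0.35.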

However, this appears intentional: (1) text-based turn detectors always used raw _bounce_eou_task with this same gap, so the PR makes behavior consistent; (2) the old _user_speaking_event had robustness issues with sub-threshold spikes getting stuck, which was the regression the old TestSubThresholdSpeakingSpike tests covered; (3) all _run_eou_detection call sites already ensure _speaking=False, making the entry guard redundant.

(Refers to lines 1585-1595)

fix(eot): tighten eot cancellation by speech acitivity

9d7841e

chenghao-mou requested a review from a team as a code owner June 29, 2026 21:19

devin-ai-integration Bot reviewed Jun 29, 2026

chenghao-mou wants to merge 1 commit into main from chenghao/fix/tighten-eot-cancel-by-speech
