docs(learnings): document Soniox customVocabulary and multilingual support
Soniox stt-rt-v4 supports customVocabulary: [...] (the equivalent of
Deepgram keyterm) and languages: [en, es, ...] for code-switching on a
single universal model. Add Soniox row to the multilingual transcriber
comparison, add a sample config, and call it out as the strongest pick
when code-switching detection AND vocabulary boosting are both needed.
docs/learnings/assistants.md (16 additions & 0 deletions)
````diff
@@ -167,6 +167,22 @@ transcriber:
     - technical-acronym
 ```
 
+**Soniox supports the same idea as `customVocabulary`.** Soniox `stt-rt-v4` (a single universal model that handles all 60+ languages) accepts `customVocabulary: [...]` — an array of strings that biases recognition toward domain-specific terms. This is the Soniox equivalent of Deepgram `keyterm`, and unlike Deepgram nova-3, it works in multilingual mode without the English-bias caveat documented in [multilingual.md](multilingual.md). Pair with `languages: [en, es]` for code-switching plus vocabulary boost in the same call.
+
+```yaml
+transcriber:
+  provider: soniox
+  model: stt-rt-v4
+  language: en
+  languages: [en, es] # optional; omit for single-language
+  customVocabulary:
+    - your-brand-name
+    - industry-specific-term
+    - product-name
+    - tarjeta de combustible # non-English equivalents are fine
+  confidenceThreshold: 0.3
+```
+
 ### Deepgram Flux: end-of-turn detection knobs
 
 Vapi exposes all four of Deepgram Flux's end-of-turn detection parameters on the `transcriber` schema. They only apply when `model` starts with `flux-`.
````
docs/learnings/multilingual.md (19 additions & 2 deletions)
```diff
@@ -25,8 +25,9 @@ Use one assistant with a multilingual transcriber and voice.
 
 | Provider | Config | How it works |
 |----------|--------|--------------|
-| **Deepgram** (recommended) | `language: "multi"` on nova-3 | Auto-detects per-utterance across supported languages |
+| **Deepgram** (recommended) | `language: "multi"` on nova-3 | Auto-detects per-utterance across supported languages. Supports `keyterm` for vocabulary boosting (since Nov 2025). |
 | **Gladia** | `languageBehaviour: "automatic multiple languages"` | V2 solaria model with native code-switching |
+| **Soniox** | `model: stt-rt-v4`, `languages: [en, es]` | Single universal model handles 60+ languages — no per-language model swap. Supports `customVocabulary: [...]` for vocabulary boosting and `maxEndpointDelayMs` for turn-taking tuning. |
 | **Speechmatics** | `language: "en_es"` | Bilingual mode for Spanish+English |
 | **AssemblyAI** | `language: "multi"` | Universal streaming multilingual model |
```
```diff
 This is most visible when a Spanish-only customer is misrecognized as English on their first utterance, which then cascades — the assistant responds in English, the customer gets confused, and the loop continues.
 
-**Recommendation for code-switching customers:** Use **Gladia Solaria** (`provider: gladia`, `languageBehaviour: automatic multiple languages`) instead of Deepgram `language: multi`. Solaria is built around code-switching as a first-class case and isn't biased by `keyterm` content the same way. See [Approach 1](#approach-1-single-static-agent) for the full transcriber comparison.
+**Recommendation for code-switching customers:** Use **Gladia Solaria** (`provider: gladia`, `languageBehaviour: automatic multiple languages`) or **Soniox** (`provider: soniox`, `model: stt-rt-v4`, `languages: [en, es]`) instead of Deepgram `language: multi`. Both are built around code-switching as a first-class case and aren't biased by vocabulary-boost content the same way. Soniox is the strongest pick when you need code-switching detection AND vocabulary boosting in the same call — it supports `customVocabulary: [...]` (the equivalent of Deepgram's `keyterm`) on the same single universal model that handles all 60+ languages. See [Approach 1](#approach-1-single-static-agent) for the full transcriber comparison.
 
 **If you must stay on Deepgram multi:** Keep `keyterm` short (under 20 entries), include the customer's expected non-English equivalents, and avoid English-only acronyms that have no foreign-language form.
```
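
The "stay on Deepgram multi" guidance could look roughly like the config below. This is a minimal sketch, not part of the commit: it assumes `keyterm` is written as a YAML list under `transcriber`, alongside `provider: deepgram` and `model: nova-3`, and every term shown is a placeholder rather than a value from a real deployment.

```yaml
transcriber:
  provider: deepgram        # assumed provider key, mirroring the Soniox sample
  model: nova-3
  language: multi
  # Keep the list short (under 20 entries).
  keyterm:
    - your-brand-name
    - fuel card
    - tarjeta de combustible   # expected Spanish equivalent of the entry above
    # English-only acronyms with no foreign-language form are left out on purpose.
```

Pairing each English term with the form a Spanish-speaking customer would actually say is meant to offset the English bias described above. Whether it does so for a given vocabulary is worth checking against real call transcripts.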