docs(learnings): add .vapi-ignore lifecycle, pronunciation decision tree, name 40-char cap, PATCH semantics, ElevenLabs phoneme model compatibility

dhruva-reddy · dhruva-reddy · commit d6679c9d3e75 · 2026-05-08T10:37:32.000-07:00
Six wiki additions + AGENTS.md and CLAUDE.md routing fix to surface them:

- yaml-conventions.md, 'Working with .vapi-ignore': recovery flow ('was that not in the .vapi-ignore?'), cardinal rule against silent edits, anti-pattern of editing .vapi-ignore to suppress unexpected drift instead of resolving the cause.
- assistants.md, 'Choosing the right pronunciation layer': symptom -> layer decision tree (word misheard = transcriber, word mispronounced = TTS), with diagnostic question and forward/back cross-links from Transcriber Configuration and Pronunciation dictionaries (TTS-level) sections.
- assistants.md, 'Assistant top-level name is limited to 1-40 characters': separate enforcement site from structuredOutput.name, not surfaced in the public schema reference.
- assistants.md, 'PATCH /assistant/:id semantics: shallow replacement at the top-level field': wholesale replacement of object/array subtrees; safe-append pattern is GET -> mutate -> PATCH; explicit contrast with assistantOverrides, which deep-merges per multilingual.md.
- voice-providers.md, 'Pronunciation dictionary support: per-provider field shapes': Cartesia (pronunciationDictId, sonic-3 only), ElevenLabs (pronunciationDictionaryLocators, dictionaryName upstream field NOT name), Vapi voices (schema-level support; dashboard UI in active PRISM-474 rollout; runtime needs call-test verification). Public-docs out-of-date callout.
- voice-providers.md, 'ElevenLabs phoneme rule model compatibility': alias rules universal; phoneme rules silently no-op'd on the default eleven_turbo_v2_5 and other current models. Customer impact: zero benefit, zero signal. Workarounds: alias-only authoring or pin to eleven_flash_v2.
- AGENTS.md + CLAUDE.md: add yaml-conventions.md to the Learnings & recipes routing table (was missing, making any yaml-conventions.md content invisible to agents).

Cross-checks performed:

- multilingual.md:148-160 deep-merge claim verified to be about assistantOverrides, NOT PATCH. No wiki contradiction.
- Customer-name + UUID scrubs clean against all additions.
- Engine state re-verified: candidates that were superseded by recent main commits (vapi sync scoping by org-scoping/dry-run/drift; non-transactional pushes by snapshot-on-push #21 + validate #17) were dropped. Active platform bug PRISM-641 dropped per user instruction.

Skipping code-reviewer per docs-only carve-out in the always-apply rule. All cross-references manually verified; anchor slugs match GitHub heading-to-slug conventions.
diff --git a/AGENTS.md b/AGENTS.md
@@ -32,6 +32,7 @@ This project manages **Vapi voice agent configurations** as code. All resources
 | Voicemail detection / VM vs human classification | `docs/learnings/voicemail-detection.md` |
 | Enforcing call time limits / graceful call ending | `docs/learnings/call-duration.md` |
 | Voice provider field cheat-sheet (Cartesia vs 11labs vs OpenAI etc.) | `docs/learnings/voice-providers.md` |
+| YAML authoring conventions, .vapi-ignore lifecycle | `docs/learnings/yaml-conventions.md` |
 
 ---
 
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -25,6 +25,7 @@ When both files exist, follow both. If guidance overlaps, treat `AGENTS.md` as t
    - Multilingual agents → `docs/learnings/multilingual.md`
    - WebSocket transport → `docs/learnings/websocket.md`
    - Call time limits / graceful ending → `docs/learnings/call-duration.md`
+   - YAML authoring conventions, .vapi-ignore lifecycle → `docs/learnings/yaml-conventions.md`
 
 ## Improvements log
 
diff --git a/docs/learnings/assistants.md b/docs/learnings/assistants.md
@@ -127,8 +127,27 @@ voice:
 
 ---
 
+## Choosing the right pronunciation layer
+
+Pronunciation problems live in two unrelated layers — picking the wrong one wastes a debugging cycle. Reproduce the failure first, then map symptom to layer.
+
+| Symptom | Fix on | How |
+|---|---|---|
+| Word **misheard** by the agent (e.g. STT decodes "VAT" as "that") | Transcriber (input side) | `customVocabulary` (Soniox), `keyterm` (Deepgram). See [Transcriber Configuration](#transcriber-configuration) for syntax. |
+| Word **mispronounced** by the agent (e.g. TTS reads "VAT" as "vee-ay-tee") | Voice / TTS (output side) | `pronunciationDictId` (Cartesia), `pronunciationDictionaryLocators` (ElevenLabs). See [Pronunciation dictionaries (TTS-level)](#pronunciation-dictionaries-tts-level) for the per-provider config. |
+
+**Diagnostic question:** Did the transcript record what the user actually said?
+- **No** — the STT got it wrong. Fix on the transcriber.
+- **Yes, but the agent then said it wrong** — the TTS is mispronouncing. Fix on the voice.
+
+Don't try both layers at once. They shape independent halves of the call and the wrong layer adds config noise without addressing the failure. For per-provider voice-side field shapes (Cartesia vs ElevenLabs vs Vapi), see [voice-providers.md → Pronunciation dictionary support](voice-providers.md#pronunciation-dictionary-support-per-provider-field-shapes).
+
+---
+
 ## Transcriber Configuration
 
+> **If a word is being misheard by the agent**, this is the right layer to fix it (input side). If a word is being mispronounced by the agent, fix the voice/TTS layer instead — see [Choosing the right pronunciation layer](#choosing-the-right-pronunciation-layer).
+
 ### Provider recommendations by language
 
 | Language | Recommended Provider |
@@ -282,12 +301,14 @@ startSpeakingPlan:
 
 ### Pronunciation dictionaries (TTS-level)
 
+> **If a word is being mispronounced by the agent**, this is the right layer to fix it (output side). If a word is being misheard, fix the transcriber instead — see [Choosing the right pronunciation layer](#choosing-the-right-pronunciation-layer). For per-provider voice-side field shapes, see [voice-providers.md → Pronunciation dictionary support](voice-providers.md#pronunciation-dictionary-support-per-provider-field-shapes).
+
 Pronunciation dictionaries control how TTS voices say specific words. They are **provider-specific**:
 
 | Provider | Support | Config field | Model requirement |
 |----------|---------|-------------|-------------------|
 | **Cartesia** | Full IPA + sounds-like across all languages | `pronunciationDictId` on voice config | `sonic-3` only |
-| **ElevenLabs** | Phoneme rules (IPA/CMU, English only) + alias rules (all languages) | `pronunciationDictionaryLocators` on voice config | Phoneme: `eleven_turbo_v2`, `eleven_flash_v2`. Alias: all models |
+| **ElevenLabs** | Phoneme rules (IPA/CMU, English only) + alias rules (all languages) | `pronunciationDictionaryLocators` on voice config | Alias: all models. Phoneme: model-dependent and silently no-op'd on most current models — see [voice-providers.md → ElevenLabs phoneme rule model compatibility](voice-providers.md#elevenlabs-phoneme-rule-model-compatibility). |
 | **Vapi built-in** | None | N/A | N/A |
 
 **Pronunciation dictionaries** are created via the Vapi API, then referenced by ID in the voice config. This is the same pattern as `credentialId` — the provider resource lives outside gitops, the reference is gitops-managed.
@@ -425,6 +446,19 @@ If a hook references a `toolId` that doesn't exist, Vapi logs a warning and cont
 
 `customer.speech.timeout` (hook) and `silenceTimeoutSeconds` (assistant) are separate mechanisms. The hook fires an action; the timeout ends the call. Configure them independently.
 
+### Assistant top-level `name` is limited to 1-40 characters
+
+The Vapi API enforces a hard 40-character maximum on the top-level `name` field of an assistant resource. Push-time error:
+
+```
+PATCH /assistant/<id> → 400
+name must be shorter than or equal to 40 characters
+```
+
+This is **a separate field from `structuredOutput.name`** — both share the 40-char cap, but the enforcement sites are independent (see [structured-outputs.md](structured-outputs.md#structuredoutputname-is-limited-to-1-40-characters)). The constraint is not surfaced in the public schema reference; it's only enforced server-side at PATCH/POST time.
+
+**Recommendation:** when generating descriptive assistant names from templates ("Triage Classifier — Multilingual Classic Variant" = 48 chars, over the cap), trim before push or use shorter abbreviations. Put descriptive nuance in a comment in the YAML or in the system prompt body, not the `name` field.
+
 ### `silenceTimeoutSeconds` minimum is 10
 
 The Vapi API enforces a hard minimum of **10 seconds** on `silenceTimeoutSeconds`. Setting this field to anything less than 10 (e.g., `5` or `8`) will fail at push time with:
@@ -448,6 +482,30 @@ The minimum is not documented in the gitops engine README and is only surfaced w
 
 ---
 
+## PATCH /assistant/:id semantics: shallow replacement at the top-level field
+
+`PATCH /assistant/:id` is partial-update at the **top level only** — fields not in the request body stay untouched. But within each field you DO send, replacement is **wholesale, NOT deep-merged**. `PATCH { hooks: [oneNewHook] }` leaves the assistant with exactly one hook even if it had three before.
+
+The same shallow-replace rule applies to `model` (and with it `model.messages`), `analysisPlan`, `voice`, `transcriber`, `messagePlan`, `serverMessages`, and any other top-level object or array field. Whatever subtree you send overwrites the entire subtree on the resource.
+
+**Safe-append pattern** — GET → mutate the returned array/object → PATCH the full structure back:
+
+```yaml
+# 1. GET /assistant/:id, capture existing.hooks
+# 2. Append your new hook locally
+# 3. PATCH with the full hooks array (existing + new)
+hooks:
+  - { ...existing hook 1 }
+  - { ...existing hook 2 }
+  - { ...new hook you wanted to add }
+```
+
+**Important distinction:** this is the REST API PATCH semantic. It is **different** from `assistantOverrides` in squad configs, which **deep-merges** partial nested objects per [multilingual.md → What Can Be Overridden](multilingual.md#what-can-be-overridden). When working through `assistantOverrides`, partial subtrees compose with the base assistant's config; when working through PATCH, partial subtrees replace.
+
+See also: [fallbacks.md](fallbacks.md#phone-number-fallback-hook) for the same gotcha applied to phone-number hooks.
+
+---
+
 ## Idle Messages (messagePlan)
 
 ### Defaults
diff --git a/docs/learnings/voice-providers.md b/docs/learnings/voice-providers.md
@@ -95,3 +95,75 @@ If a customer changes the provider on the dashboard and your local YAML still ha
 ## Adding a new provider
 
 If you find yourself reaching for a provider not in the table above, append a row here in the same PR. The cheat-sheet only stays useful if it grows with the platform.
+
+---
+
+## Pronunciation dictionary support: per-provider field shapes
+
+Pronunciation dictionaries do not share a field shape across voice providers. Same conceptual feature, three different surfaces.
+
+> **Public-docs note:** As of 2026-05-08 the public Vapi docs state pronunciation dictionaries are "exclusive to ElevenLabs voices." This is out of date: Cartesia support has been confirmed in production deployments, and Vapi voices accept dictionary configs at the schema level, with the dashboard UI in active rollout (PRISM-474). Treat this wiki as the more current source.
+
+### Cartesia
+
+- **Field**: `voice.pronunciationDictId` — single string ID on the voice config.
+- **Model requirement**: `model: sonic-3` only. Other Cartesia models silently ignore the field.
+- **Upstream resource shape**: the Cartesia dictionary resource exposes a `name` field.
+- **Full config example**: see [assistants.md → Pronunciation dictionaries (TTS-level)](assistants.md#pronunciation-dictionaries-tts-level).
+
+### ElevenLabs
+
+- **Field**: `voice.pronunciationDictionaryLocators` — array of `{ pronunciationDictionaryId, versionId? }`.
+- **Model requirement**: alias rules work on all ElevenLabs models. **Phoneme rules are silently no-op'd** on `eleven_turbo_v2_5` (Vapi's default), `eleven_flash_v2_5`, `eleven_multilingual_v2`, and `eleven_v3`. See [ElevenLabs phoneme rule model compatibility](#elevenlabs-phoneme-rule-model-compatibility) below for the full breakdown.
+- **Upstream resource shape**: the ElevenLabs dictionary resource exposes a `dictionaryName` field — **NOT `name`**. This trips up wrappers that fetch dictionaries via API and surface them in tools that also handle Cartesia.
+
+### Vapi voices
+
+- **Schema-level**: accepts pronunciation dictionary configs at the API.
+- **Dashboard UI surface**: in active rollout (PRISM-474, Q2 2026). Schema acceptance does **not** guarantee that the runtime TTS engine honors the dictionary.
+- **Recommendation**: verify runtime behavior with a call test before depending on it for production Vapi-voice deployments.
+
+### Field shape gotcha
+
+The three provider families do NOT use the same field name on the upstream pronunciation-dictionary resource:
+
+| Provider | Upstream display-name field |
+|---|---|
+| Cartesia | `name` |
+| ElevenLabs | `dictionaryName` |
+| Vapi voices | shape pending finalization |
+
+If you're authoring a wrapper or migration tool that handles all three, gracefully handle the divergence. A single `name`-only path will silently render ElevenLabs dictionaries with empty labels.
+
+### ElevenLabs phoneme rule model compatibility
+
+ElevenLabs splits pronunciation rules into two types:
+
+- **Alias rules** — word substitution ("MyBrand" → "my-brand"). **Work universally** on all ElevenLabs models.
+- **Phoneme rules** — exact pronunciation via IPA / CMU Arpabet. **Model-dependent.**
+
+**Confirmed unsupported (silent no-op):**
+- `eleven_turbo_v2_5` — Vapi's default ElevenLabs model
+- `eleven_flash_v2_5`
+- `eleven_multilingual_v2`
+- `eleven_v3`
+
+**Supported:**
+- `eleven_flash_v2` (confirmed)
+- `eleven_monolingual_v1` (likely but unconfirmed: ElevenLabs docs disagree across pages on the exact set, so verify before depending on it)
+
+**Silent-skip behavior:** when a phoneme rule is sent to an unsupported model, ElevenLabs does NOT error. It bypasses the rule and uses standard pronunciation. **Customer impact:** attaching a phoneme-only dict to the default voice gets zero benefit with no signal — the call sounds exactly like the no-dict baseline.
+
+**Workarounds:**
+1. **Author the dictionary as alias rules**: they work everywhere. Trade phoneme precision for portability.
+2. **Pin to `eleven_flash_v2`** — explicit model lock if phoneme accuracy matters more than the latency profile of `eleven_turbo_v2_5` / `eleven_flash_v2_5`.
+
+```yaml
+# Phoneme-rule-dependent — pin the model
+voice:
+  provider: 11labs
+  model: eleven_flash_v2
+  voiceId: <your-voice-id>
+  pronunciationDictionaryLocators:
+    - pronunciationDictionaryId: <your-dict-id>
+```
diff --git a/docs/learnings/yaml-conventions.md b/docs/learnings/yaml-conventions.md
@@ -183,6 +183,29 @@ The blank line after `---` is conventional; the strict requirement is just that
 
 ---
 
+## Working with `.vapi-ignore`
+
+`.vapi-ignore` lives at `resources/<org>/.vapi-ignore` and excludes specific resources from pull and push so the dashboard stays the source of truth for them. See `AGENTS.md` (line 13) for the basic gitignore-style syntax.
+
+The recovery flow when a sync surfaces "drift" you didn't expect — typically prompted by "was that not in the .vapi-ignore?":
+
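+1. **Inspect first**, don't edit. Diff the file against `main` to see whether the ignore pattern was changed locally; empty output means the file matches `main`, so read it directly to check whether the path is covered: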
+   ```bash
+   git diff origin/main -- resources/<org>/.vapi-ignore
+   ```
+2. **If a dashboard-only asset is genuinely missing from `.vapi-ignore`**, add the pattern. Otherwise stop here — the asset belongs in yaml.
+3. **Dry-run before applying** to confirm only the intended assets will change:
+   ```bash
+   npm run push -- <org> --dry-run
+   ```
+4. **Apply** once the dry-run is clean: `npm run push -- <org>`.
+
+**Cardinal rule:** don't edit `.vapi-ignore` without explicit user direction. The file encodes intentional dashboard-vs-yaml ownership splits the user (or an earlier customer-engagement decision) knows about. Removing a pattern silently re-claims an asset for gitops control, which can blow away dashboard-only edits on the next push.
+
+**Anti-pattern:** editing `.vapi-ignore` because a sync surfaced an unexpected diff is *removing the protection*, not fixing the cause. The cause is usually upstream: the asset was edited in both places, or a new asset that should be dashboard-owned was created via gitops. Resolve at the source, then leave `.vapi-ignore` alone.
+
+---
+
 ## Cross-references
 
 - `docs/learnings/assistants.md` — assistant-specific frontmatter authoring