Commit 129637a

docs: multispeaker is only for s2-pro (#64)
1 parent e3cc904 commit 129637a

2 files changed

Lines changed: 3 additions & 3 deletions

api-reference/openapi.json

Lines changed: 2 additions & 2 deletions
@@ -2185,7 +2185,7 @@
 "type": "null"
 }
 ],
-"description": "Inline voice references for zero-shot cloning. Requires MessagePack (not JSON). For single speaker, provide an array of ReferenceAudio objects. For multiple speakers, provide an array of arrays where each inner array contains references for one speaker. The speaker index corresponds to the index in reference_id array. Example for multi-speaker: [[{audio, text}], [{audio, text}, {audio, text}]] for 2 speakers where speaker 1 has 2 reference samples.",
+"description": "Inline voice references for zero-shot cloning. Requires MessagePack (not JSON). For single speaker, provide an array of ReferenceAudio objects. For multiple speakers, provide an array of arrays where each inner array contains references for one speaker. **Multi-speaker is only available with the S2-Pro model.** The speaker index corresponds to the index in reference_id array. Example for multi-speaker: [[{audio, text}], [{audio, text}, {audio, text}]] for 2 speakers where speaker 1 has 2 reference samples.",
 "title": "References"
 },
 "reference_id": {
@@ -2206,7 +2206,7 @@
 }
 ],
 "default": null,
-"description": "Voice model ID(s) from Fish Audio library or your custom models. For single speaker synthesis, provide a string. For multi-speaker synthesis (e.g., dialogue), provide an array of model IDs. When using multiple speakers, use speaker tags in your text like [0] and [1] to indicate which speaker should speak each part. Example: '[0]Hello![1]Hi there![0]How are you?' with reference_id: ['speaker-a-id', 'speaker-b-id']",
+"description": "Voice model ID(s) from Fish Audio library or your custom models. For single speaker synthesis, provide a string. For multi-speaker synthesis (e.g., dialogue), provide an array of model IDs. **Multi-speaker is only available with the S2-Pro model.** When using multiple speakers, use speaker tags in your text like [0] and [1] to indicate which speaker should speak each part. Example: '[0]Hello![1]Hi there![0]How are you?' with reference_id: ['speaker-a-id', 'speaker-b-id']",
 "title": "Reference Id"
 },
 "prosody": {

developer-guide/models-pricing/models-overview.mdx

Lines changed: 1 addition & 1 deletion
@@ -52,7 +52,7 @@ Fish Audio offers state-of-the-art text-to-speech models optimized for different
 <Card title="s2-pro" icon="star">
 **Fish Audio S2-Pro** - Our next-generation TTS model with best-in-class performance
 - Natural language control with `[bracket]` syntax — not limited to a fixed set (e.g., `[whispers sweetly]`, `[laughing nervously]`)
-- Multi-speaker dialogue support
+- Multi-speaker dialogue support **(S2-Pro exclusive)**
 - 80+ languages
 - 100ms time-to-first-audio
 - Full SGLang-based serving stack
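
Reviewer note: the `reference_id` description changed in `api-reference/openapi.json` implies a client-side check that callers may want to run before sending a request. The following is a minimal sketch of that check, not the Fish Audio SDK. The helper name `build_tts_payload`, the `text` field name, and the model identifier string `"s2-pro"` are assumptions for illustration. How the model is actually selected in a request is not shown in this diff, so the sketch takes it as a plain argument.

```python
import re

# Assumption: the model identifier string matches the card title in the diff.
MULTI_SPEAKER_MODEL = "s2-pro"


def build_tts_payload(text, reference_id, model):
    """Hypothetical helper: validate and assemble the fields described above."""
    if isinstance(reference_id, str):
        # Single speaker: reference_id is a plain string.
        return {"text": text, "reference_id": reference_id}

    # Multi-speaker: reference_id is an array of model IDs.
    if model != MULTI_SPEAKER_MODEL:
        raise ValueError("multi-speaker synthesis is only available with s2-pro")

    # Speaker tags such as [0] and [1] index into the reference_id array.
    used = {int(tag) for tag in re.findall(r"\[(\d+)\]", text)}
    unknown = sorted(i for i in used if i >= len(reference_id))
    if unknown:
        raise ValueError(f"speaker tags {unknown} have no matching reference_id entry")

    return {"text": text, "reference_id": list(reference_id)}
```

With the example from the description, `build_tts_payload("[0]Hello![1]Hi there![0]How are you?", ["speaker-a-id", "speaker-b-id"], "s2-pro")` passes both checks, while the same call with any other model value raises `ValueError`. Note that a purely numeric bracket tag is treated as a speaker index here, which is a simplification.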
