input | string | yes | — | Text to synthesize (≤4000 chars). Audio tags supported: [whispers], [laughs], [excited], [sigh], [shouting]. |
mode | enum | no | "single" | single: one voice. multi: two-speaker dialog (use [Speaker1]: / [Speaker2]: prefixes in input). · Allowed: single, multi |
language | string | no | "en-US" | BCP-47 language code. 24 languages are generally available (GA) and 50+ are in preview. Common: en-US, en-IN, ja-JP, ko-KR, fr-FR, de-DE, … |
voice | enum | no | "Charon" | Voice for single mode, and the first speaker's voice in multi mode. 30 distinct Gemini TTS voices, each with a unique timbre. · Allowed: Zephyr, Puck, Charon, Kore, Fenrir, Leda, Orus, Aoede, Callirrhoe, Autonoe, Enceladus, Iapetus, Umbriel, Algieba, Despina, Erinome, Algenib, Rasalgethi, Laomedeia, Achernar, Alnilam, Schedar, Gacrux, Pulcherrima, Achird, Zubenelgenubi, Vindemiatrix, Sadachbia, Sadaltager, Sulafat |
voice2 | enum | no | "Kore" | Second voice for multi mode. · Allowed: Zephyr, Puck, Charon, Kore, Fenrir, Leda, Orus, Aoede, Callirrhoe, Autonoe, Enceladus, Iapetus, Umbriel, Algieba, Despina, Erinome, Algenib, Rasalgethi, Laomedeia, Achernar, Alnilam, Schedar, Gacrux, Pulcherrima, Achird, Zubenelgenubi, Vindemiatrix, Sadachbia, Sadaltager, Sulafat |
speaker1_name | string | no | "Speaker1" | Multi mode label for the first speaker (alphanumeric). |
speaker2_name | string | no | "Speaker2" | Multi mode label for the second speaker (alphanumeric). |
output_format | enum | no | "WAV" | Output audio encoding. · Allowed: WAV, MP3, OGG, ALAW, MULAW |
speed | number | no | 1 | Speaking rate multiplier (step 0.25). · Range: 0.25 – 2 |
volume_gain | number | no | 0 | Output gain in dB. · Range: -96 – 16 |
sample_rate | enum | no | "24000" | Output sample rate in Hz. · Allowed: 8000, 16000, 22050, 24000, 44100, 48000 |
style_prompt | string | no | — | Free-form style guidance, prepended to input (≤4000 chars). Examples: ‘Read with a calm, professional tone’ or ‘Speak excitedly’. |
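
The constraints in the table above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the documented API: `build_tts_params` and `format_dialog` are hypothetical helper names, and the sketch assumes that `sample_rate` is passed as a string enum value, that the 4000-character limit on `style_prompt` applies to that field alone, that multi-mode prefixes use the configured speaker labels, and that dialog turns are separated by newlines. Voice names are not validated here.

```python
# Hypothetical client-side helpers; only the limits and defaults come from the table.
ALLOWED_MODES = {"single", "multi"}
ALLOWED_FORMATS = {"WAV", "MP3", "OGG", "ALAW", "MULAW"}
ALLOWED_SAMPLE_RATES = {"8000", "16000", "22050", "24000", "44100", "48000"}
MAX_CHARS = 4000

DEFAULTS = {
    "mode": "single",
    "language": "en-US",
    "voice": "Charon",
    "voice2": "Kore",
    "speaker1_name": "Speaker1",
    "speaker2_name": "Speaker2",
    "output_format": "WAV",
    "speed": 1,
    "volume_gain": 0,
    "sample_rate": "24000",
}


def format_dialog(turns, speaker1="Speaker1", speaker2="Speaker2"):
    """Join (speaker_number, text) turns into prefixed multi-mode input lines.

    Assumption: one turn per line, prefixed with "[<label>]: ".
    """
    names = {1: speaker1, 2: speaker2}
    return "\n".join(f"[{names[who]}]: {text}" for who, text in turns)


def build_tts_params(input_text, **overrides):
    """Merge overrides into the table defaults and validate the result."""
    unknown = set(overrides) - set(DEFAULTS) - {"style_prompt"}
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")

    params = {"input": input_text, **DEFAULTS, **overrides}
    # Assumption: the enum is sent as a string, so accept ints and normalize.
    params["sample_rate"] = str(params["sample_rate"])

    if not input_text or len(input_text) > MAX_CHARS:
        raise ValueError("input is required and must be at most 4000 chars")
    if len(params.get("style_prompt", "")) > MAX_CHARS:
        raise ValueError("style_prompt must be at most 4000 chars")
    if params["mode"] not in ALLOWED_MODES:
        raise ValueError("mode must be 'single' or 'multi'")
    if params["output_format"] not in ALLOWED_FORMATS:
        raise ValueError("unsupported output_format")
    if params["sample_rate"] not in ALLOWED_SAMPLE_RATES:
        raise ValueError("unsupported sample_rate")

    speed = params["speed"]
    # Range 0.25 to 2, in steps of 0.25 (so speed * 4 must be a whole number).
    if not 0.25 <= speed <= 2 or not float(speed * 4).is_integer():
        raise ValueError("speed must be 0.25 to 2 in steps of 0.25")
    if not -96 <= params["volume_gain"] <= 16:
        raise ValueError("volume_gain must be between -96 and 16 dB")

    for key in ("speaker1_name", "speaker2_name"):
        if not params[key].isalnum():
            raise ValueError(f"{key} must be alphanumeric")
    return params
```

A multi-mode request could then be assembled with `build_tts_params(format_dialog([(1, "Hi"), (2, "Hello")]), mode="multi")`. Rejecting unknown keys and out-of-range values locally is a design choice of this sketch; the service may report such errors differently.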