Models from Google
Gemini-2.5-Flash-TTS
Low-latency text-to-speech with single- and multi-speaker voices and controllable style, accent, and expressive tone for production apps.
Gemini-2.5-Pro-TTS
High-quality TTS preview for podcasts, audiobooks, and customer support, with expressive multi-speaker voices across 23+ languages.
Gemini-3.1-Flash-TTS
Highly controllable TTS with new Audio Tags for precise style, tone, pace, and delivery across narration, assistants, and voice apps.
Gemma-3-27B
Open-weight vision-language model with a 128K-token context window, support for 140+ languages, improved math and reasoning, structured outputs, and function calling.
