Generate audio (TTS, music, podcast)

Text-to-speech, music generation, and multi-speaker podcast TTS share this endpoint. Returns a hosted URL by default; pass response_format: "b64_json" for inline audio bytes.

Authentication

Authorization: Bearer

Pass your EmpirioLabs API key as a bearer token. The Anthropic-style x-api-key header is also accepted on every endpoint.
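The two header styles described above could be built with a small helper like the one below. This is a minimal sketch: the function name `auth_headers` and its `style` argument are hypothetical, and the `Content-Type` header is an assumption (the endpoint expects an object, presumably sent as JSON).

```python
def auth_headers(api_key: str, style: str = "bearer") -> dict:
    """Build request headers for either accepted authentication style."""
    if not api_key:
        raise ValueError("api_key must be a non-empty string")

    if style == "bearer":
        # Standard bearer token in the Authorization header.
        headers = {"Authorization": f"Bearer {api_key}"}
    elif style == "x-api-key":
        # Anthropic-style header, accepted as an alternative.
        headers = {"x-api-key": api_key}
    else:
        raise ValueError(f"unknown auth style: {style!r}")

    # Assumption: the request body is JSON.
    headers["Content-Type"] = "application/json"
    return headers
```

Either dictionary can be passed as the headers of an HTTP request made with the client of your choice.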

Request

This endpoint expects an object.
model (string, required)
input (string, optional)

Script / lyrics. Use [S1] / [S2] tags for multi-speaker models.

prompt (string, optional)

Music generation models use prompt instead of input.

voice (string, optional)
output_format (string, optional)
duration (integer, optional)

Music generation only — output length in seconds.
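As an illustration of how the request fields combine, the sketch below builds one request body for speech and one for music. The helper names are hypothetical, as are the model IDs, the voice name, and the "mp3" format value used in the usage lines; substitute values your account actually supports. Whether the fields can be mixed in other ways is not stated here, so the helpers only cover the two documented cases.

```python
import json


def build_speech_payload(model, text, voice=None, output_format=None,
                         response_format=None):
    """Request body for TTS or multi-speaker models, which read `input`."""
    payload = {"model": model, "input": text}
    optional = {
        "voice": voice,
        "output_format": output_format,
        "response_format": response_format,  # e.g. "b64_json" for inline bytes
    }
    # Send only the optional fields the caller actually set.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


def build_music_payload(model, prompt, duration=None):
    """Request body for music models, which read `prompt` instead of `input`."""
    payload = {"model": model, "prompt": prompt}
    if duration is not None:
        if not isinstance(duration, int) or duration <= 0:
            raise ValueError("duration must be a positive integer (seconds)")
        payload["duration"] = duration
    return payload


# Hypothetical model IDs and values, for illustration only.
podcast = build_speech_payload(
    "example-podcast-model",
    "[S1] Welcome back to the show. [S2] Glad to be here.",
    output_format="mp3",
)
music = build_music_payload("example-music-model", "calm piano, slow tempo",
                            duration=30)
print(json.dumps(podcast, indent=2))
print(json.dumps(music, indent=2))
```

Keeping the two builders separate makes it harder to send `input` to a music model or `duration` to a speech model by accident.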

Response

Audio response (URL by default, or inline bytes).

data (list of objects)
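The shape of each object in `data` is not spelled out here. The sketch below assumes each item carries either a `url` key (the default) or a `b64_json` key holding base64-encoded audio when `response_format: "b64_json"` was requested. Both key names are assumptions, so check an actual response before relying on them.

```python
import base64


def extract_audio(item: dict):
    """Return ("url", str) or ("bytes", bytes) for one item in `data`.

    Assumes the item uses a `url` or `b64_json` key; neither name is
    confirmed by the reference above.
    """
    if "url" in item:
        return ("url", item["url"])
    if "b64_json" in item:
        # Decode the inline base64 payload into raw audio bytes.
        return ("bytes", base64.b64decode(item["b64_json"]))
    raise KeyError("item has neither 'url' nor 'b64_json'")


def extract_all(response: dict):
    """Apply extract_audio to every entry of the response's `data` list."""
    return [extract_audio(item) for item in response.get("data", [])]
```

A caller would typically download the audio when it receives a URL, or write the decoded bytes straight to a file when it receives inline data.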