Stable Audio 2.0

Stability AI · Audio Generation
POST /v1/audio/generations
Generates up to 3 minutes of audio from a text prompt. Supports text-to-audio and audio-to-audio, with adjustable duration, steps, and CFG scale.
Notes
Generates up to 3 minutes of audio from text or via audio-to-audio transformation.
Audio-to-audio mode
- Requires BOTH a prompt and an uploaded audio file
- Recommended CFG scale: 7-15
- Recommended steps: 6-8
- Typical strength: 0.3-0.7
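The audio-to-audio requirements above can be sketched as a small helper that assembles the request fields and flags values outside the recommended ranges. This is a minimal sketch, not the documented request format: the field names (`prompt`, `audio`, `cfg_scale`, `steps`, `strength`), the function name, and the combination of the schema host with the endpoint path are all assumptions made for illustration. The machine-readable schema is the authority on the real parameter names.

```python
from pathlib import Path

# Assumption: the endpoint path from this page served on the host used by the schema URL.
ENDPOINT = "https://api.empiriolabs.ai/v1/audio/generations"

# Recommended ranges quoted from the audio-to-audio notes.
RECOMMENDED = {
    "cfg_scale": (7, 15),
    "steps": (6, 8),
    "strength": (0.3, 0.7),
}


def build_audio_to_audio_request(prompt, audio_path, cfg_scale=7, steps=8, strength=0.5):
    """Return (fields, warnings) for a hypothetical audio-to-audio call.

    Field names are illustrative assumptions, not confirmed by the API.
    """
    # Audio-to-audio needs both inputs, so fail early if either is missing.
    if not prompt or not prompt.strip():
        raise ValueError("audio-to-audio requires a prompt")
    if not audio_path:
        raise ValueError("audio-to-audio requires an uploaded audio file")

    fields = {
        "prompt": prompt.strip(),
        "audio": str(Path(audio_path)),
        "cfg_scale": cfg_scale,
        "steps": steps,
        "strength": strength,
    }

    # The ranges are recommendations, so out-of-range values produce warnings, not errors.
    warnings = []
    for name, (low, high) in RECOMMENDED.items():
        if not low <= fields[name] <= high:
            warnings.append(f"{name}={fields[name]} is outside the recommended {low}-{high}")
    return fields, warnings


if __name__ == "__main__":
    fields, warnings = build_audio_to_audio_request("warm lo-fi piano", "input.wav", steps=20)
    print(fields)
    print(warnings)
```

Treating the ranges as warnings rather than hard limits reflects the wording of the notes, which call them recommended and typical values rather than enforced bounds.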
Machine-readable schema: GET https://api.empiriolabs.ai/v1/models/stable-audio-2-0.
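As a rough illustration of reading that schema programmatically, the sketch below builds the GET request with Python's standard library. The helper names are hypothetical, and the `Authorization: Bearer` header is an assumption: this page does not state whether the schema endpoint needs authentication or which scheme it uses.

```python
import json
import urllib.request

SCHEMA_URL = "https://api.empiriolabs.ai/v1/models/stable-audio-2-0"


def build_schema_request(api_key=None):
    """Build the GET request for the model schema without sending it."""
    headers = {"Accept": "application/json"}
    if api_key:
        # Assumed auth scheme; adjust to whatever the API actually requires.
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(SCHEMA_URL, headers=headers, method="GET")


def fetch_schema(api_key=None, timeout=30):
    """Send the request and decode the JSON body (performs a network call)."""
    with urllib.request.urlopen(build_schema_request(api_key), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `fetch_schema()` and inspecting the returned dictionary is one way to confirm the actual parameter names and limits before building generation requests.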
