Seedance 2.0 Pro

ByteDance · Video Generation
POST /v1/videos/generations

Multimodal video model for cinematic output from text, image, audio, or video inputs, with stable motion and consistent characters.

At a glance

Field                Value
Model id             seedance-2-0-pro
Input modalities     Text, Image, Video, Audio
Output modalities    Video
Context window       -
Weight precision     -
Region               Malaysia
Features             audio_sync, camera_control, character_consistency
Native inference     No
New                  No
Supported endpoints  POST /v1/videos/generations

Pricing

Charge             Spec        Rate
T2V/I2V 480P       per second  $0.139
T2V/I2V 720P       per second  $0.300
T2V/I2V 1080P      per second  $0.749
Video Input 480P   per second  $0.342
Video Input 720P   per second  $0.736
Video Input 1080P  per second  $1.841
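
As a worked example of the rates above, a 6-second 720p text-to-video clip would come to 6 × $0.300 = $1.80, assuming billing is linear in seconds. The sketch below encodes the pricing table; the function name `estimate_cost` and the `video_input` flag are illustrative and not part of the API. This page does not say whether the Video Input rate is charged on input or output seconds, so the sketch simply multiplies the selected rate by a caller-supplied number of seconds.

```python
from decimal import Decimal

# Per-second rates in USD, copied from the pricing table.
RATES = {
    ("t2v_i2v", "480p"): Decimal("0.139"),
    ("t2v_i2v", "720p"): Decimal("0.300"),
    ("t2v_i2v", "1080p"): Decimal("0.749"),
    ("video_input", "480p"): Decimal("0.342"),
    ("video_input", "720p"): Decimal("0.736"),
    ("video_input", "1080p"): Decimal("1.841"),
}


def estimate_cost(seconds, resolution="720p", video_input=False):
    """Estimate cost in USD, assuming linear per-second billing (an assumption)."""
    kind = "video_input" if video_input else "t2v_i2v"
    try:
        rate = RATES[(kind, resolution.lower())]
    except KeyError:
        raise ValueError(f"unsupported resolution: {resolution!r}") from None
    # Convert via str so float inputs such as 4.5 do not carry binary noise.
    return rate * Decimal(str(seconds))


print(estimate_cost(6, "720p"))                       # T2V/I2V, 6 s at 720p
print(estimate_cost(10, "1080p", video_input=True))   # video input, 10 s at 1080p
```

Decimal arithmetic is used here so that the estimate matches the table's three-decimal rates exactly rather than accumulating floating-point error.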

Example request

curl https://api.empiriolabs.ai/v1/videos/generations \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{"model": "seedance-2-0-pro", "prompt": "sunrise over the ocean", "duration": 6}'
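
The same request can be prepared from Python. The following is a minimal sketch using only the standard library; the helper name `build_generation_request` is illustrative. Because this page does not document the response body, the sketch stops at building the request and leaves sending and response handling to the caller.

```python
import json
import os
import urllib.request

API_URL = "https://api.empiriolabs.ai/v1/videos/generations"


def build_generation_request(prompt, api_key, **params):
    """Build (but do not send) a POST request mirroring the curl example."""
    payload = {"model": "seedance-2-0-pro", "prompt": prompt, **params}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_generation_request(
    "sunrise over the ocean",
    os.environ.get("EMPIRIOLABS_API_KEY", ""),
    duration=6,
)
# To send: urllib.request.urlopen(req) returns the raw HTTP response.
# Its JSON structure is not documented on this page, so parse it defensively.
```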

Parameters

Parameter        Type     Required  Default     Description
prompt           string   yes       -           Scene description.
mode             enum     no        "auto"      auto: detect from inputs. t2v: text-to-video. i2v_first: animate first frame. i2v_both: morph between start (image) and end (image_end). reference: use image as visual reference. edit: edit attached video. extend: extend attached video. · Allowed: auto, t2v, i2v_first, i2v_both, reference, edit, extend
resolution       enum     no        "720p"      Output resolution. Larger = higher fidelity but slower / more expensive. · Allowed: 480p, 720p, 1080p
aspect_ratio     enum     no        "adaptive"  adaptive: derive from input image. · Allowed: adaptive, 16:9, 9:16, 1:1, 4:3, 3:4, 21:9
custom_duration  boolean  no        true        If false, the model decides clip length. If true, use the duration field.
duration         number   no        5           Clip length in seconds. Only used when custom_duration=true. · Range: 4 – 15
generate_audio   boolean  no        true        Generate native audio with the video.
image            string   no        -           Reference image URL.
image_end        string   no        -           End-frame image URL for i2v_both.
video            string   no        -           Reference video URL for edit / extend.
negative_prompt  string   no        ""          What to avoid.
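
The constraints in the table above can be checked client-side before a request is sent. The sketch below is one possible way to do that; `build_payload` is a hypothetical helper, not part of the API. The allowed values and the duration range come from the table. The rule that certain modes need particular media fields is an assumption inferred from the mode descriptions, and the server may behave differently.

```python
ALLOWED = {
    "mode": {"auto", "t2v", "i2v_first", "i2v_both", "reference", "edit", "extend"},
    "resolution": {"480p", "720p", "1080p"},
    "aspect_ratio": {"adaptive", "16:9", "9:16", "1:1", "4:3", "3:4", "21:9"},
}

# Assumption: media fields each mode expects, inferred from the descriptions.
MODE_NEEDS = {
    "i2v_first": ("image",),
    "i2v_both": ("image", "image_end"),
    "reference": ("image",),
    "edit": ("video",),
    "extend": ("video",),
}


def build_payload(prompt, **params):
    """Return a request body after client-side checks based on the table."""
    if not isinstance(prompt, str) or not prompt:
        raise ValueError("prompt is required")
    for name, allowed in ALLOWED.items():
        if name in params and params[name] not in allowed:
            raise ValueError(f"{name} must be one of {sorted(allowed)}")
    if "duration" in params and not 4 <= params["duration"] <= 15:
        raise ValueError("duration must be between 4 and 15 seconds")
    mode = params.get("mode", "auto")
    missing = [f for f in MODE_NEEDS.get(mode, ()) if not params.get(f)]
    if missing:
        raise ValueError(f"mode {mode!r} expects {', '.join(missing)}")
    return {"model": "seedance-2-0-pro", "prompt": prompt, **params}


payload = build_payload(
    "a fox crossing a snowy field",
    mode="i2v_both",
    image="https://example.com/start.png",    # placeholder URL
    image_end="https://example.com/end.png",  # placeholder URL
    duration=8,
)
```

Omitted parameters are left out of the body so that the server-side defaults listed in the table apply.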

Notes

The model generates video from text, image, audio, and video inputs, with native audio-video sync, strong motion stability, and consistent character handling.

Tip

  • For lifelike, consistent faces across multiple inputs, generate the reference image with Seedream 5.0 Lite first, then use it as the reference image here.

Uploaded media preprocessing

  • Video inputs are capped at 15 seconds for reference, edit, and extend workflows.
  • Uploaded video inputs are normalized to provider-compatible MP4 when needed.

Machine-readable schema: GET https://api.empiriolabs.ai/v1/models/seedance-2-0-pro.
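
A minimal sketch of preparing that schema request from Python follows. The helper name `schema_request` is illustrative. Two points are assumptions, since this page does not state them: that the schema is served as JSON (hence the `Accept` header), and that an API key may be needed (hence the optional `Authorization` header).

```python
import urllib.request


def schema_request(model_id="seedance-2-0-pro", api_key=None):
    """Build (but do not send) a GET request for a model's schema."""
    headers = {"Accept": "application/json"}  # assumption: JSON response
    if api_key:  # assumption: auth may be required; not stated on this page
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        f"https://api.empiriolabs.ai/v1/models/{model_id}",
        headers=headers,
        method="GET",
    )


req = schema_request()
# To send: urllib.request.urlopen(req); the response layout is not documented here.
```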