Kling O3

Kling AI · Video Generation
POST /v1/videos/generations

Video generation model offered in Standard, Pro, and 4K tiers, supporting text-to-video, image-to-video, reference-to-video, video editing, native sound, and multi-scene transitions.

At a glance

Field                Value
Model id             kling-o3
Input modalities     Text, Image, Video, Audio
Output modalities    Video
Context window       -
Weight precision     -
Features             audio, editing
Native inference     No
New                  No
Supported endpoints  POST /v1/videos/generations

Pricing

Charge                  Spec        Rate
Standard T2V/I2V        per second  $0.168
Standard T2V/I2V Sound  per second  $0.224
Standard Video Input    per second  $0.252
Pro T2V/I2V             per second  $0.224
Pro T2V/I2V Sound       per second  $0.280
Pro Video Input         per second  $0.336
4K T2V/I2V/Ref          per second  $0.525
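
Because every charge is billed per second, a rough cost estimate is the rate multiplied by the billed duration. The sketch below encodes the rates from the table. How a request maps to a row is an assumption, not something this page states: here a video input selects the "Video Input" rate regardless of sound, otherwise the sound setting selects the "Sound" rate, and the 4K rate applies uniformly. The helper name estimate_cost is hypothetical.

```python
from decimal import Decimal

# Per-second rates in USD, copied from the pricing table.
RATES = {
    ("standard", "base"): Decimal("0.168"),
    ("standard", "sound"): Decimal("0.224"),
    ("standard", "video_input"): Decimal("0.252"),
    ("pro", "base"): Decimal("0.224"),
    ("pro", "sound"): Decimal("0.280"),
    ("pro", "video_input"): Decimal("0.336"),
}
RATE_4K = Decimal("0.525")  # single listed rate for 4K T2V/I2V/Ref


def estimate_cost(tier, seconds, sound=True, video_input=False):
    """Estimate the charge in USD for `seconds` of billed video.

    ASSUMPTION: the row selection below (video input first, then sound)
    is a guess at how billing works; confirm against actual invoices.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    if tier == "4k":
        rate = RATE_4K
    elif tier in ("standard", "pro"):
        kind = "video_input" if video_input else ("sound" if sound else "base")
        rate = RATES[(tier, kind)]
    else:
        raise ValueError(f"unknown tier: {tier!r}")
    # Convert via str so float inputs such as 7.5 do not add binary noise.
    return rate * Decimal(str(seconds))
```

Under these assumptions, a 6-second clip with the default settings (pro tier, sound enabled) would come to about $1.68 via estimate_cost("pro", 6). Decimal is used instead of float so the multiplication stays exact to the listed precision.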

Example request

curl https://api.empiriolabs.ai/v1/videos/generations \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{"model": "kling-o3", "prompt": "sunrise over the ocean", "duration": 6}'
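
The same request can be sketched in Python using only the standard library. The endpoint, headers, and body mirror the curl example. The helper name build_request is hypothetical, and because this page does not document the response format, the sketch simply prints the raw response body.

```python
import json
import os
import urllib.request

API_URL = "https://api.empiriolabs.ai/v1/videos/generations"


def build_request(payload, api_key):
    """Build (but do not send) the POST request for a generation job."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    api_key = os.environ.get("EMPIRIOLABS_API_KEY")
    if not api_key:
        print("Set EMPIRIOLABS_API_KEY to send the request.")
    else:
        req = build_request(
            {"model": "kling-o3", "prompt": "sunrise over the ocean", "duration": 6},
            api_key,
        )
        # ASSUMPTION: the response schema is not documented on this page,
        # so the body is printed as-is rather than parsed.
        with urllib.request.urlopen(req) as resp:
            print(resp.status, resp.read().decode("utf-8"))
```

Separating request construction from sending makes it possible to inspect the payload and headers before any billable call is made.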

Parameters

Parameter            Type     Required  Default  Description
prompt               string   yes       -        Multi-scene: pipe (|) or newline-separated prompts, optionally prefixed with duration like '5s: scene text'. Up to 6 scenes.
model_tier           enum     no        "pro"    standard: cheapest. pro: balanced quality. 4k: highest fidelity, longest render. · Allowed: standard, pro, 4k
workflow             enum     no        "auto"   auto: detect from inputs. t2v: text-to-video. i2v: image-to-video. video_edit: edit attached video. reference: use reference_images or reference_videos. · Allowed: auto, t2v, i2v, video_edit, reference
aspect_ratio         enum     no        "16:9"   Kling O3 supports landscape, square, and portrait only. · Allowed: 16:9, 1:1, 9:16
duration             number   no        5        Per-scene duration in seconds. · Range: 3 – 15
sound                boolean  no        true     Generate native audio with the video.
keep_original_sound  boolean  no        true     video_edit only. Keep audio from the source video.
image                string   no        -        Reference image URL for i2v.
image_end            string   no        -        Optional last-frame image URL for image-to-video.
video                string   no        -        Source video URL for video_edit.
reference_images     string   no        -        Comma-separated image URLs for reference workflow.
reference_videos     string   no        -        Comma-separated video URLs for reference workflow.
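
The multi-scene prompt format described for the prompt parameter can be assembled programmatically. The sketch below joins up to 6 scenes with a pipe and adds the optional "5s: " style duration prefix. The helper name build_prompt is hypothetical, and three details are assumptions rather than documented behaviour: that spaces around the pipe are tolerated, that prefix durations are whole seconds, and that they follow the same 3 to 15 second range as the duration parameter.

```python
MAX_SCENES = 6
MIN_SECONDS, MAX_SECONDS = 3, 15


def build_prompt(scenes, separator=" | "):
    """Join scenes into one multi-scene prompt string.

    `scenes` is a list of (text, seconds) pairs; seconds may be None to
    omit the duration prefix for that scene.
    """
    if not 1 <= len(scenes) <= MAX_SCENES:
        raise ValueError(f"expected 1 to {MAX_SCENES} scenes")
    parts = []
    for text, seconds in scenes:
        text = text.strip()
        # Pipes and newlines are scene separators, so they cannot appear
        # inside a single scene's text.
        if not text or "|" in text or "\n" in text:
            raise ValueError("scene text must be non-empty and contain no separators")
        if seconds is None:
            parts.append(text)
            continue
        # ASSUMPTION: prefix durations share the 3-15 s range of `duration`.
        if not MIN_SECONDS <= seconds <= MAX_SECONDS:
            raise ValueError(f"duration must be {MIN_SECONDS}-{MAX_SECONDS} seconds")
        parts.append(f"{seconds}s: {text}")
    return separator.join(parts)
```

For example, build_prompt([("sunrise over the ocean", 5), ("waves crash on rocks", None)]) produces "5s: sunrise over the ocean | waves crash on rocks". Passing separator="\n" yields the newline-separated form instead.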

Notes

Uploaded media preprocessing

  • Video inputs are capped to 10 seconds for video-edit and video-reference workflows.
  • Uploaded video inputs are normalized to provider-compatible MP4 when needed.
  • The 4K tier supports text-to-video, image-to-video, and image-only reference workflows. Use Standard or Pro for any workflow that takes a video input.
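
Some of these constraints can be checked on the client before a request is sent. The sketch below covers only rules stated on this page: the required prompt, the allowed enum values, the duration range, and the 4K restriction on video inputs. The helper name check_payload is hypothetical, and the API may enforce further rules not listed here. The 10-second cap on video inputs is left out because checking it would require inspecting the media file itself.

```python
ALLOWED = {
    "model_tier": {"standard", "pro", "4k"},
    "workflow": {"auto", "t2v", "i2v", "video_edit", "reference"},
    "aspect_ratio": {"16:9", "1:1", "9:16"},
}


def check_payload(payload):
    """Return a list of human-readable problems; empty means none found."""
    problems = []
    if not payload.get("prompt"):
        problems.append("prompt is required")
    for field, allowed in ALLOWED.items():
        value = payload.get(field)
        if value is not None and value not in allowed:
            problems.append(f"{field}={value!r} is not one of {sorted(allowed)}")
    duration = payload.get("duration")
    if duration is not None and not 3 <= duration <= 15:
        problems.append("duration must be between 3 and 15 seconds")
    # Documented rule: the 4K tier does not accept video inputs.
    uses_video = (
        bool(payload.get("video"))
        or bool(payload.get("reference_videos"))
        or payload.get("workflow") == "video_edit"
    )
    if payload.get("model_tier") == "4k" and uses_video:
        problems.append("4k tier does not accept video inputs; use standard or pro")
    return problems
```

Returning a list rather than raising on the first failure lets a caller report every problem at once, which is useful when a rejected request would otherwise cost a round trip.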

Machine-readable schema: GET https://api.empiriolabs.ai/v1/models/kling-o3.