HappyHorse 1.1 | EmpirioLabs AI Docs

Alibaba Cloud · Video Generation

POST /v1/videos/generations

Text-to-video, image-to-video, and reference-to-video in one model. Cinematic motion, character consistency across up to 9 reference images, and synchronized native audio.

At a glance

Field	Value
Model id	`happyhorse-1-1`
Model release date	2026-06-22
Input modalities	Text, Image
Output modalities	Video
Context window	-
Weight precision	-
Region	Singapore
Features	video_generation, image_to_video, reference_to_video, audio_sync
Native inference	No
New	Yes
Supported endpoints	`POST /v1/videos/generations`

Pricing

Charge	Spec	Rate
720p	per second	$0.14
1080p	per second	$0.18
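
As a rough illustration of the rates above, the sketch below estimates the charge for one clip. It assumes billing is simply rate times duration with no minimum charge or rounding, which this page does not confirm; `estimate_cost` is a hypothetical helper, not part of the API.

```python
from decimal import Decimal

# Per-second rates copied from the pricing table above.
RATE_PER_SECOND = {"720p": Decimal("0.14"), "1080p": Decimal("0.18")}


def estimate_cost(resolution: str, duration_seconds: int) -> Decimal:
    """Hypothetical helper: estimated USD charge for one clip.

    Assumes charge = rate * duration (not confirmed by the docs).
    """
    if resolution not in RATE_PER_SECOND:
        raise ValueError(f"unknown resolution: {resolution!r}")
    return RATE_PER_SECOND[resolution] * duration_seconds


print(estimate_cost("720p", 6))    # 0.84
print(estimate_cost("1080p", 15))  # 2.70
```

`Decimal` is used instead of `float` so that currency amounts such as 0.14 × 6 stay exact.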

Example request

$ curl https://api.empiriolabs.ai/v1/videos/generations \
>   -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
>   -H 'Content-Type: application/json' \
>   -d '{"model": "happyhorse-1-1", "prompt": "sunrise over the ocean", "duration": 6}'
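
The same request can be sketched in Python with only the standard library. The URL, headers, and body mirror the curl example; `build_request` is a hypothetical helper name, and the sketch only constructs the request without sending it.

```python
import json
import urllib.request

API_URL = "https://api.empiriolabs.ai/v1/videos/generations"


def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Hypothetical helper: build (but do not send) the POST request."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_request(
    "YOUR_API_KEY",  # placeholder; read the real key from your environment
    {"model": "happyhorse-1-1", "prompt": "sunrise over the ocean", "duration": 6},
)
print(request.get_method(), request.full_url)
```

To send it, pass `request` to `urllib.request.urlopen`. The response format is not described on this page, so treat the returned body as opaque until you have checked the machine-readable schema.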

Parameters

Parameter	Type	Required	Default	Description
`prompt`	string	yes	-	Describe the scene to generate. Up to 2,500 characters.
`mode`	enum	no	`"auto"`	Generation mode. `auto` picks the mode from the attached images. `t2v`: text-to-video. `i2v`: animate one image. `r2v`: reference-to-video from up to 9 images. · Allowed: `auto`, `t2v`, `i2v`, `r2v`
`image`	string	no	-	Image URL or upload. One image for image-to-video, up to 9 for reference-to-video. Refer to reference images in the prompt as character1, character2, and so on.
`aspect_ratio`	enum	no	`"16:9"`	Aspect ratio of the output. Used by text-to-video and reference-to-video; image-to-video follows the source image. · Allowed: `16:9`, `9:16`, `1:1`, `4:3`, `3:4`
`resolution`	enum	no	`"720p"`	Output resolution. 720p renders faster; 1080p is higher fidelity. · Allowed: `720p`, `1080p`
`duration`	number	no	`5`	Clip length in seconds. · Range: 3 – 15
`seed`	number	no	-	Reproducibility seed. Reuse the same seed for a repeatable result. · Range: 0 – 2147483647
`watermark`	boolean	no	`false`	Add a watermark to the generated video. Off by default.
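
The limits in the table can be checked on the client before a request is sent. The sketch below is one possible pre-flight check, not the server's validation logic: `validate_params` is a hypothetical helper, it assumes `seed` must be an integer, and it assumes the 2,500-character limit matches Python's `len`.

```python
# Allowed enum values copied from the parameter table above.
ALLOWED = {
    "mode": ("auto", "t2v", "i2v", "r2v"),
    "aspect_ratio": ("16:9", "9:16", "1:1", "4:3", "3:4"),
    "resolution": ("720p", "1080p"),
}


def validate_params(params: dict) -> list[str]:
    """Hypothetical helper: return a list of problems (empty means OK)."""
    errors = []

    prompt = params.get("prompt")
    if not isinstance(prompt, str) or not prompt:
        errors.append("prompt is required")
    elif len(prompt) > 2500:
        errors.append("prompt exceeds 2,500 characters")

    for name, allowed in ALLOWED.items():
        if name in params and params[name] not in allowed:
            errors.append(f"{name} must be one of {list(allowed)}")

    duration = params.get("duration", 5)
    is_number = isinstance(duration, (int, float)) and not isinstance(duration, bool)
    if not is_number or not 3 <= duration <= 15:
        errors.append("duration must be a number from 3 to 15")

    seed = params.get("seed")
    if seed is not None:
        # Assumption: seeds are integers, although the table lists "number".
        is_int = isinstance(seed, int) and not isinstance(seed, bool)
        if not is_int or not 0 <= seed <= 2147483647:
            errors.append("seed must be an integer from 0 to 2147483647")

    if not isinstance(params.get("watermark", False), bool):
        errors.append("watermark must be a boolean")

    return errors


print(validate_params({"prompt": "sunrise over the ocean", "duration": 6}))
print(validate_params({"prompt": "sunrise over the ocean", "duration": 20}))
```

Returning a list of messages rather than raising on the first problem lets a caller report every invalid field at once.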

Notes

Modes (in one model)

Text-to-video
Image-to-video
Reference-to-video (up to 9 images)
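
A client that prefers to set `mode` explicitly rather than rely on `auto` could choose it from the number of attached images. The mapping below is an assumption for illustration only: this page does not document how `auto` resolves, and `pick_mode` is a hypothetical helper.

```python
def pick_mode(image_count: int) -> str:
    """Hypothetical client-side rule for choosing an explicit mode.

    Assumed mapping (not documented): 0 images -> t2v, 1 -> i2v, 2 to 9 -> r2v.
    """
    if image_count < 0 or image_count > 9:
        raise ValueError("image_count must be between 0 and 9")
    if image_count == 0:
        return "t2v"
    if image_count == 1:
        return "i2v"
    return "r2v"


for count in (0, 1, 4):
    print(count, pick_mode(count))
```

A single image could also be intended as a character reference; in that case pass `r2v` directly instead of using this rule.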

Constraints

720p / 1080p, 3 to 15 seconds per generation
Prompt up to 2,500 characters

Aspect ratios

Text and reference modes: 16:9, 9:16, 1:1, 4:3, 3:4
Image-to-video follows the source image

Native audio is generated automatically.

Machine-readable schema: GET https://api.empiriolabs.ai/v1/models/happyhorse-1-1.