Tongyi Embedding Vision Plus

Alibaba Cloud · Embeddings
POST /v1/embeddings
Multimodal embedding producing independent vectors for text, image, and video inputs.
At a glance
Output
- One fixed 1,152-dimensional vector per input; inputs of different modalities are not fused into a single vector
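Because every input yields its own vector of the same fixed size, two returned vectors can be compared directly. The following is a minimal sketch of such a comparison. The helper name `cosine_similarity` is hypothetical, and the page does not state whether vectors from different modalities are meant to be compared with each other, so treat cross-modal use as an assumption.

```python
import math

EMBEDDING_DIM = 1152  # fixed output size stated on this page


def cosine_similarity(a, b):
    """Cosine similarity between two vectors of the documented size.

    Hypothetical helper, not part of the API. Raises ValueError if either
    vector has the wrong length or is all zeros.
    """
    if len(a) != EMBEDDING_DIM or len(b) != EMBEDDING_DIM:
        raise ValueError(f"expected {EMBEDDING_DIM}-dimensional vectors")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("a zero vector has no direction")
    return dot / (norm_a * norm_b)
```

The length check is there to catch vectors that came from a different model or were truncated before they reach a similarity search.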
Per-input limits
- Text: up to 1,024 tokens
- Image: up to 8 per request, 3 MB each (JPG, PNG, BMP)
- Video: up to 10 MB per file (MP4, MPEG, MOV, MPG, WEBM, AVI, FLV, MKV)
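The media limits above can be checked on the client before a request is sent. The sketch below is one way to do that; the function name `check_media` and its input shape are hypothetical. It assumes 1 MB means 1,048,576 bytes, which the page does not specify, and it does not check the text token limit because that would need the model's tokenizer.

```python
import os

# Limits copied from the list above; the byte interpretation of "MB" is an assumption.
MAX_IMAGES_PER_REQUEST = 8
MAX_IMAGE_BYTES = 3 * 1024 * 1024
MAX_VIDEO_BYTES = 10 * 1024 * 1024
# Only the formats listed on the page; ".jpeg" is not listed, so it is rejected here.
IMAGE_EXTS = {".jpg", ".png", ".bmp"}
VIDEO_EXTS = {".mp4", ".mpeg", ".mov", ".mpg", ".webm", ".avi", ".flv", ".mkv"}


def check_media(images, videos):
    """Return a list of problems found; an empty list means all checks passed.

    images and videos are lists of (filename, size_in_bytes) tuples.
    """
    problems = []
    if len(images) > MAX_IMAGES_PER_REQUEST:
        problems.append(f"too many images: {len(images)} > {MAX_IMAGES_PER_REQUEST}")
    for kind, items, exts, limit in (
        ("image", images, IMAGE_EXTS, MAX_IMAGE_BYTES),
        ("video", videos, VIDEO_EXTS, MAX_VIDEO_BYTES),
    ):
        for name, size in items:
            ext = os.path.splitext(name)[1].lower()
            if ext not in exts:
                problems.append(f"{kind} {name}: unsupported format {ext or '(none)'}")
            if size > limit:
                problems.append(f"{kind} {name}: {size} bytes exceeds {limit}")
    return problems
```

For example, `check_media([("cat.jpg", 200_000)], [("clip.mp4", 5_000_000)])` returns an empty list, since both files are within the listed formats and sizes.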
Languages
- Chinese, English
Machine-readable schema: GET https://api.empiriolabs.ai/v1/models/tongyi-embedding-vision-plus.
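As an illustration, the schema could be retrieved with the Python standard library along these lines. This is a sketch under assumptions: the page does not say whether the endpoint needs authentication or what the response contains, so the optional Bearer header and the JSON parsing are guesses, and both function names are hypothetical.

```python
import json
import urllib.request

SCHEMA_URL = "https://api.empiriolabs.ai/v1/models/tongyi-embedding-vision-plus"


def build_schema_request(api_key=None):
    """Build (but do not send) the GET request for the model schema."""
    headers = {"Accept": "application/json"}
    if api_key:
        # Assumption: Bearer-token auth. The page does not document auth.
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(SCHEMA_URL, headers=headers, method="GET")


def fetch_schema(api_key=None, timeout=10.0):
    """Send the request and parse the body as JSON (assumed response format)."""
    request = build_schema_request(api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Keeping request construction separate from the network call makes the headers easy to inspect or adjust once the real authentication scheme is known.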
