Tongyi-Embedding-Vision-Plus

Alibaba Cloud · Embedding
POST /v1/chat/completions

Multimodal embedding model that produces independent vectors for text, image, and video inputs. Use this when each content element needs its own embedding (e.g. matching a caption against a set of images).
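The caption-against-images use case can be sketched as a similarity ranking over the returned vectors. This is a minimal illustration, not the provider's method: it assumes the vectors have already been obtained from the API, and it assumes cosine similarity is a suitable comparison metric, which the source does not state. The helper names `cosine_similarity` and `rank_images` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_images(caption_vec, image_vecs):
    """Rank image vectors by similarity to one caption vector.

    Returns (image_index, score) pairs, best match first.
    Assumption: cosine similarity is an appropriate metric for this model.
    """
    scores = [
        (index, cosine_similarity(caption_vec, vec))
        for index, vec in enumerate(image_vecs)
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

Because each input gets its own vector, the caption and every image can be embedded once and compared in any combination afterwards.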
Notes
Embedding dimension: fixed at 1152.

Per-input limits:

- Text: up to 1,024 tokens
- Images: up to 8 per request, max 3 MB each (JPG, PNG, BMP)
- Video: up to 10 MB per file (MP4, MPEG, MOV, MPG, WEBM, AVI, FLV, MKV)

Output: independent vector per input element (no fusion).

Languages: Chinese and English.
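The image and video limits above can be checked on the client before a request is sent. The following is a rough sketch under stated assumptions: 1 MB is taken as 1024 * 1024 bytes, the format is inferred from the file extension only, and the helper names `check_images` and `check_video` are hypothetical. The text limit is not checked because counting tokens depends on the model's tokenizer, which the source does not describe.

```python
MAX_IMAGES_PER_REQUEST = 8
MAX_IMAGE_BYTES = 3 * 1024 * 1024   # assumption: 1 MB = 1024 * 1024 bytes
MAX_VIDEO_BYTES = 10 * 1024 * 1024  # same assumption

# Only the formats listed in the notes; whether ".jpeg" is also accepted
# is not stated in the source, so it is left out here.
IMAGE_EXTENSIONS = {"jpg", "png", "bmp"}
VIDEO_EXTENSIONS = {"mp4", "mpeg", "mov", "mpg", "webm", "avi", "flv", "mkv"}


def _extension(filename):
    """Return the lowercase extension without the dot, or '' if none."""
    return filename.rsplit(".", 1)[-1].lower() if "." in filename else ""


def check_images(images):
    """Validate a list of (filename, size_in_bytes) pairs.

    Returns a list of human-readable problems; an empty list means OK.
    """
    problems = []
    if len(images) > MAX_IMAGES_PER_REQUEST:
        problems.append(
            f"{len(images)} images exceeds the limit of {MAX_IMAGES_PER_REQUEST}"
        )
    for name, size in images:
        if _extension(name) not in IMAGE_EXTENSIONS:
            problems.append(f"{name}: unsupported image format")
        if size > MAX_IMAGE_BYTES:
            problems.append(f"{name}: larger than 3 MB")
    return problems


def check_video(name, size):
    """Validate one video given its filename and size in bytes."""
    problems = []
    if _extension(name) not in VIDEO_EXTENSIONS:
        problems.append(f"{name}: unsupported video format")
    if size > MAX_VIDEO_BYTES:
        problems.append(f"{name}: larger than 10 MB")
    return problems
```

Returning a list of problems instead of raising on the first one lets a caller report every issue in a batch at once.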
Machine-readable schema: GET https://api.empiriolabs.ai/v1/models/tongyi-embedding-vision-plus.
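A minimal sketch of retrieving that schema with the Python standard library follows. It assumes the endpoint returns a JSON body and can be read without credentials; neither is confirmed by the source, so an authorization header may be needed in practice. The function name `fetch_model_schema` and its `opener` parameter are hypothetical, the latter included only so the call can be exercised without network access.

```python
import json
import urllib.request

SCHEMA_URL = "https://api.empiriolabs.ai/v1/models/tongyi-embedding-vision-plus"


def fetch_model_schema(url=SCHEMA_URL, opener=urllib.request.urlopen, timeout=10):
    """GET the model schema and parse it as JSON.

    Assumptions: the response body is UTF-8 JSON and no auth header
    is required. Add headers here if the API demands them.
    """
    request = urllib.request.Request(
        url, method="GET", headers={"Accept": "application/json"}
    )
    with opener(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_model_schema()` with no arguments issues the GET request against the URL given above.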
