MiMo V2.5 Pro UltraSpeed | EmpirioLabs AI Docs

Xiaomi · Text Generation

POST /v1/chat/completions

Speed-tuned flagship that streams up to 1,000 tokens/second for real-time chat, live edits, and high-concurrency agentic workflows, with a 1M-token context window.

At a glance

Field	Value
Model id	`mimo-v2-5-pro-ultraspeed`
Input modalities	Text
Output modalities	Text
Context window	1M
Weight precision	-
Max output tokens	128,000
Features	reasoning, agentic, fast
Native inference	No
New	Yes
Supported endpoints	`POST /v1/chat/completions`, `POST /v1/responses`, `POST /v1/messages`

Pricing

Charge	Spec	Rate
Input	per 1M prompt tokens	$2.175
Output	per 1M generated tokens	$4.35
Implicit cache read	per 1M cached input tokens	$0.018
Web Search (Linkup)	per call when invoked	$0.013
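
The rates above can be combined into a rough per-request estimate. The sketch below is illustrative only: `estimate_cost` is a hypothetical helper, not part of any EmpirioLabs SDK, and it assumes that cached tokens are a subset of the prompt tokens and are billed at the cache-read rate instead of the input rate. Check an actual invoice before relying on it.

```python
# Rates copied from the pricing table above (USD).
INPUT_PER_M = 2.175          # per 1M prompt tokens
OUTPUT_PER_M = 4.35          # per 1M generated tokens
CACHE_READ_PER_M = 0.018     # per 1M cached input tokens
WEB_SEARCH_PER_CALL = 0.013  # per Linkup web search call


def estimate_cost(prompt_tokens, completion_tokens, cached_tokens=0, web_searches=0):
    """Estimate the USD cost of one request (hypothetical helper).

    Assumption: cached_tokens is a subset of prompt_tokens and is billed
    at the cache-read rate instead of the regular input rate.
    """
    if not 0 <= cached_tokens <= prompt_tokens:
        raise ValueError("cached_tokens must be between 0 and prompt_tokens")
    uncached_tokens = prompt_tokens - cached_tokens
    return (
        uncached_tokens * INPUT_PER_M / 1_000_000
        + cached_tokens * CACHE_READ_PER_M / 1_000_000
        + completion_tokens * OUTPUT_PER_M / 1_000_000
        + web_searches * WEB_SEARCH_PER_CALL
    )
```

For example, `estimate_cost(200_000, 4_000, cached_tokens=150_000)` prices a long prompt in which three quarters of the input was served from the cache.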

Example request

curl https://api.empiriolabs.ai/v1/chat/completions \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{"model": "mimo-v2-5-pro-ultraspeed", "messages": [{"role":"user","content":"Hello"}]}'

Parameters

Parameter	Type	Required	Default	Description
`enable_thinking`	boolean	no	`true`	Enable step-by-step reasoning before answering.
`temperature`	number	no	`0.7`	Range: 0 – 2
`top_p`	number	no	`0.9`	Range: 0 – 1
`max_tokens`	number	no	`4096`	Range: 1 – 131072
`stop`	string	no	-	-
`disable_formatting`	boolean	no	`false`	-
`web_search_linkup`	boolean	no	`false`	-
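
As a minimal sketch of how these parameters might be used from Python, the example below builds a request body and checks each value against the types and ranges in the table before sending. `build_payload` and `send` are hypothetical helper names. The sketch also assumes that the parameters are passed as top-level JSON fields next to `model` and `messages`, and that the endpoint returns a JSON body; neither is confirmed by this page.

```python
import json
import os
import urllib.request

API_URL = "https://api.empiriolabs.ai/v1/chat/completions"
MODEL_ID = "mimo-v2-5-pro-ultraspeed"

# Numeric ranges and boolean flags taken from the parameter table.
RANGES = {"temperature": (0, 2), "top_p": (0, 1), "max_tokens": (1, 131072)}
BOOLEAN_PARAMS = {"enable_thinking", "disable_formatting", "web_search_linkup"}


def build_payload(prompt, **params):
    """Build a request body, rejecting values outside the documented ranges."""
    for name, value in params.items():
        if name in RANGES:
            low, high = RANGES[name]
            is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
            if not is_number or not low <= value <= high:
                raise ValueError(f"{name} must be a number between {low} and {high}")
        elif name in BOOLEAN_PARAMS:
            if not isinstance(value, bool):
                raise ValueError(f"{name} must be a boolean")
        elif name == "stop":
            if not isinstance(value, str):
                raise ValueError("stop must be a string")
        else:
            raise ValueError(f"unknown parameter: {name}")
    # Assumption: parameters are sent as top-level fields of the JSON body.
    return {"model": MODEL_ID, "messages": [{"role": "user", "content": prompt}], **params}


def send(payload):
    """POST the payload and return the decoded JSON response (needs network access)."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {os.environ['EMPIRIOLABS_API_KEY']}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A call such as `send(build_payload("Hello", temperature=0.2, max_tokens=256, enable_thinking=False))` would then issue the request. Validating locally keeps obviously out-of-range values from costing a round trip.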

Notes

Optimized for ultra-fast streaming on a 1M-token context. Cached input tokens are billed at $0.018 per 1M, compared with $2.175 per 1M for regular input. Web search (Linkup) adds $0.013 per call when invoked.

Machine-readable schema: GET https://api.empiriolabs.ai/v1/models/mimo-v2-5-pro-ultraspeed.
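
The schema endpoint can be queried in the same way. The sketch below is hedged: `build_schema_request` and `fetch_schema` are hypothetical names, and it assumes the endpoint returns JSON. This page does not say whether the endpoint requires authentication, so the API key is optional here.

```python
import json
import urllib.request

MODELS_URL = "https://api.empiriolabs.ai/v1/models/"


def build_schema_request(model_id, api_key=None):
    """Prepare a GET request for the model's machine-readable schema."""
    request = urllib.request.Request(MODELS_URL + model_id, method="GET")
    if api_key:
        # Assumption: if auth is required, it uses the same bearer token
        # as the chat completions endpoint.
        request.add_header("Authorization", f"Bearer {api_key}")
    return request


def fetch_schema(model_id, api_key=None):
    """Send the request and decode the response as JSON (needs network access)."""
    with urllib.request.urlopen(build_schema_request(model_id, api_key)) as response:
        return json.load(response)
```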