MiMo V2.5 Pro UltraSpeed

Xiaomi · Text Generation
POST /v1/chat/completions

Speed-tuned flagship model that streams up to 1,000 tokens/second for real-time chat, live edits, and high-concurrency agentic workflows, with a 1M-token context window.

At a glance

| Field | Value |
| --- | --- |
| Model id | mimo-v2-5-pro-ultraspeed |
| Input modalities | Text |
| Output modalities | Text |
| Context window | 1M |
| Weight precision | - |
| Max output tokens | 128,000 |
| Features | reasoning, agentic, fast |
| Native inference | No |
| New | Yes |
| Supported endpoints | POST /v1/chat/completions, POST /v1/responses, POST /v1/messages |

Pricing

| Charge | Spec | Rate |
| --- | --- | --- |
| Input | per 1M prompt tokens | $2.175 |
| Output | per 1M generated tokens | $4.35 |
| Implicit cache read | per 1M cached input tokens | $0.018 |
| Web Search (Linkup) | per call when invoked | $0.013 |
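
The rates above combine into a per-request cost as a weighted sum. The sketch below is illustrative only: the function name `estimate_cost_usd` is hypothetical, and it assumes that cached input tokens are billed at the cache-read rate in place of the regular input rate (not in addition to it), which this page does not state explicitly.

```python
from decimal import Decimal

# Rates copied from the pricing table (USD).
INPUT_PER_M = Decimal("2.175")          # per 1M prompt tokens
OUTPUT_PER_M = Decimal("4.35")          # per 1M generated tokens
CACHE_READ_PER_M = Decimal("0.018")     # per 1M cached input tokens
WEB_SEARCH_PER_CALL = Decimal("0.013")  # per Linkup call

MILLION = Decimal(1_000_000)


def estimate_cost_usd(prompt_tokens, output_tokens, cached_tokens=0, web_searches=0):
    """Estimate the cost of one request in USD.

    Assumption (not confirmed by the pricing table): `cached_tokens` is the
    part of `prompt_tokens` served from cache, and it is billed at the
    cache-read rate instead of the regular input rate.
    """
    if not 0 <= cached_tokens <= prompt_tokens:
        raise ValueError("cached_tokens must be between 0 and prompt_tokens")
    if output_tokens < 0 or web_searches < 0:
        raise ValueError("output_tokens and web_searches must be non-negative")

    uncached_tokens = prompt_tokens - cached_tokens
    return (
        uncached_tokens * INPUT_PER_M / MILLION
        + cached_tokens * CACHE_READ_PER_M / MILLION
        + output_tokens * OUTPUT_PER_M / MILLION
        + web_searches * WEB_SEARCH_PER_CALL
    )
```

Under these assumptions, a request with 200,000 prompt tokens (150,000 of them cached), 2,000 output tokens, and one web search comes to `estimate_cost_usd(200_000, 2_000, cached_tokens=150_000, web_searches=1)`, which is $0.13315.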

Example request

curl https://api.empiriolabs.ai/v1/chat/completions \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{"model": "mimo-v2-5-pro-ultraspeed", "messages": [{"role":"user","content":"Hello"}]}'

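For callers who prefer Python, the same request can be sketched with the standard library alone. The helper names `build_request` and `send` are hypothetical, and because this page does not describe the response schema, the sketch returns the decoded JSON without interpreting it.

```python
import json
import urllib.request

API_URL = "https://api.empiriolabs.ai/v1/chat/completions"
MODEL_ID = "mimo-v2-5-pro-ultraspeed"


def build_request(prompt, api_key):
    """Build (but do not send) the same POST request as the curl example."""
    body = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and return the decoded JSON response.

    The response schema is not documented on this page, so the decoded
    value is returned as-is without further parsing.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (performs a real, billable API call):
#   import os
#   reply = send(build_request("Hello", os.environ["EMPIRIOLABS_API_KEY"]))
```

Separating request construction from sending keeps the payload easy to inspect or log before any tokens are billed.
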
Parameters

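| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| enable_thinking | boolean | no | true | Enable step-by-step reasoning before answering. |
| temperature | number | no | 0.7 | Range: 0 – 2 |
| top_p | number | no | 0.9 | Range: 0 – 1 |
| max_tokens | number | no | 4096 | Range: 1 – 131072 |
| stop | string | no | - | - |
| disable_formatting | boolean | no | false | - |
| web_search_linkup | boolean | no | false | - |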
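
A client can check these parameters before sending a request. The following is a minimal sketch, not an official client: `build_payload` is a hypothetical helper, the ranges are assumed to be inclusive, and the parameters are assumed to be top-level fields of the JSON body alongside `model` and `messages`.

```python
# Numeric ranges copied from the parameter table (assumed inclusive).
RANGES = {
    "temperature": (0, 2),
    "top_p": (0, 1),
    "max_tokens": (1, 131072),
}
BOOLEAN_PARAMS = {"enable_thinking", "disable_formatting", "web_search_linkup"}


def build_payload(messages, **params):
    """Return a request body, rejecting values outside the documented ranges.

    Assumption: parameters are sent as top-level JSON fields. Omitted
    parameters are left out so the server applies its documented defaults.
    """
    payload = {"model": "mimo-v2-5-pro-ultraspeed", "messages": list(messages)}
    for name, value in params.items():
        if name in RANGES:
            low, high = RANGES[name]
            # bool is a subclass of int in Python, so exclude it explicitly.
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                raise TypeError(f"{name} must be a number")
            if not low <= value <= high:
                raise ValueError(f"{name} must be between {low} and {high}")
        elif name in BOOLEAN_PARAMS:
            if not isinstance(value, bool):
                raise TypeError(f"{name} must be a boolean")
        elif name == "stop":
            if not isinstance(value, str):
                raise TypeError("stop must be a string")
        else:
            raise KeyError(f"unknown parameter: {name}")
        payload[name] = value
    return payload
```

For instance, `build_payload([{"role": "user", "content": "Hi"}], temperature=0.2, web_search_linkup=True)` yields a body with only those two optional fields set, while `temperature=3` raises `ValueError` before any request is made.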

Notes

Optimized for ultra-fast streaming on a 1M-token context. Cached input tokens are billed at $0.018 per 1M, versus $2.175 per 1M for regular input tokens (under 1% of the standard rate). Web search (Linkup) adds $0.013 per call when enabled.


Machine-readable schema: GET https://api.empiriolabs.ai/v1/models/mimo-v2-5-pro-ultraspeed.