MiMo V2 Flash

POST /v1/chat/completionsLightweight, high-speed reasoning model with hybrid attention and multi-token prediction for low-cost inference and strong benchmark scores.
At a glance
Pricing
Example request
Parameters
Notes
Lightweight 256K-context tier. Web search ($0.015/call) is charged only when invoked. Cached input tokens are billed at a steep discount.
Per-tool billing (usage.tool_usage)
When this model invokes tools (web search, code interpreter, etc.) inside a single request, the response carries a normalized usage.tool_usage map alongside the token counts. The example below shows the shape — exact field names, units, and which tools appear can vary slightly per provider:
The tool counts are already factored into cost_usd — they are surfaced for transparency so you can audit per-tool billing. The field is omitted when no tools were invoked.
Machine-readable schema: GET https://api.empiriolabs.ai/v1/models/mimo-v2-flash.
