Mistral Small 4

Mistral AI · Text Generation

POST /v1/chat/completions

Hybrid model that unifies the Instruct, Reasoning (Magistral), and Devstral families, with 40% lower completion time and 3x the throughput of Small 3.

At a glance

Field	Value
Model id	`mistral-small-4`
Model release date	2026-03-16
Input modalities	Text, Image
Output modalities	Text
Context window	256K
Weight precision	-
Max output tokens	65,536
Features	vision, web_search, function_calling
Native inference	No
New	No
Structured output	JSON Mode
Batch API	35% off list price
Supported endpoints	`POST /v1/chat/completions`, `POST /v1/responses`, `POST /v1/messages`, `POST /v1beta/models/mistral-small-4:generateContent`
Alternate model ids	`mistralai/mistral-small-4`

Pricing

Charge	Spec	Rate
Input	per 1M prompt tokens	$0.15
Output	per 1M generated tokens	$0.60
Standard Web Search	per call	$0.084
Premium Web Search	per call	$0.140
Code Interpreter	per call	$0.084
Image Generation	per image	$0.280
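The rates above combine into a per-request total. The following is a minimal sketch of that arithmetic, not an official billing formula: the function name `estimate_cost_usd` and the tool keys in `TOOL_RATES` are hypothetical labels chosen for illustration, and only the rates themselves come from the table.

```python
# Rates copied from the pricing table above (USD).
INPUT_PER_MTOK = 0.15    # per 1M prompt tokens
OUTPUT_PER_MTOK = 0.60   # per 1M generated tokens

# Hypothetical key names; only the per-call rates come from the table.
TOOL_RATES = {
    "web_search_standard": 0.084,
    "web_search_premium": 0.140,
    "code_interpreter": 0.084,
    "image_generation": 0.280,
}


def estimate_cost_usd(prompt_tokens, completion_tokens, tool_calls=None):
    """Estimate the list-price cost of one request in USD."""
    token_cost = (prompt_tokens * INPUT_PER_MTOK
                  + completion_tokens * OUTPUT_PER_MTOK) / 1_000_000
    # tool_calls maps a TOOL_RATES key to the number of invocations.
    tool_cost = sum(TOOL_RATES[name] * count
                    for name, count in (tool_calls or {}).items())
    return token_cost + tool_cost


# 10,000 prompt tokens, 2,000 generated tokens, two standard web searches.
estimate = estimate_cost_usd(10_000, 2_000, {"web_search_standard": 2})
print(f"{estimate:.4f}")  # prints 0.1707
```

In this example the two web-search calls ($0.168) dominate the token charges ($0.0027), which is typical when tools are invoked on short prompts.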

Example request

$ curl https://api.empiriolabs.ai/v1/chat/completions \
>   -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
>   -H 'Content-Type: application/json' \
>   -d '{"model": "mistral-small-4", "messages": [{"role":"user","content":"Hello"}]}'

Parameters

Parameter	Type	Required	Default	Description
`reasoning_enabled`	boolean	no	true	Enable extended reasoning (maps to reasoning: high|none)
`tool_web_search`	boolean	no	true	Enable web_search tool
`web_search_tier`	enum	no	`"standard"`	Standard or Premium web-search tier. Premium uses higher-quality sources. · Allowed: `standard`, `premium`
`tool_code_interpreter`	boolean	no	true	Allow the model to execute Python code in a sandbox to compute / analyze data.
`tool_image_generation`	boolean	no	true	Allow the model to generate images inline via the platform image-gen tool.
`temperature`	number	no	`0.7`	Sampling temperature. 0 = deterministic; higher values increase randomness. · Range: 0 – 1.5
`max_tokens`	number	no	`4096`	Maximum tokens in the response. · Range: 1 – 32768
`disable_formatting`	boolean	no	false	Skip the EmpirioLabs Markdown formatting (citation [N] rewriting + References block when the web_search tool was used). The raw upstream answer with plain [N] citations is returned.
`response_format`	enum	no	-	Return the output as a valid JSON object (JSON mode). Describe the fields you want in your prompt.
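A request body using these parameters might be assembled as in the sketch below. It assumes the platform parameters are sent as top-level JSON fields next to `model` and `messages`, which this page does not state explicitly. The helper name `build_payload` is hypothetical, and the range checks simply mirror the table.

```python
import json

ALLOWED_TIERS = {"standard", "premium"}


def build_payload(prompt, *, temperature=0.7, max_tokens=4096,
                  reasoning_enabled=True, tool_web_search=True,
                  web_search_tier="standard"):
    """Validate the documented ranges, then return a request body dict."""
    if not 0 <= temperature <= 1.5:
        raise ValueError("temperature must be within 0 to 1.5")
    if not 1 <= max_tokens <= 32768:
        raise ValueError("max_tokens must be within 1 to 32768")
    if web_search_tier not in ALLOWED_TIERS:
        raise ValueError("web_search_tier must be 'standard' or 'premium'")
    return {
        "model": "mistral-small-4",
        "messages": [{"role": "user", "content": prompt}],
        # Assumption: platform parameters sit at the top level of the body.
        "temperature": temperature,
        "max_tokens": max_tokens,
        "reasoning_enabled": reasoning_enabled,
        "tool_web_search": tool_web_search,
        "web_search_tier": web_search_tier,
    }


# Low-temperature request with the built-in web search switched off.
body = build_payload("Hello", temperature=0.2, tool_web_search=False)
print(json.dumps(body, indent=2))
```

Validating on the client side is optional; it only surfaces out-of-range values before the request is sent.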

Notes

Tools (web search, code interpreter, image generation) are billed only when actually invoked. Requests that include your own function tools use standard function calling, and the built-in tools are unavailable on those requests: a request cannot combine both at once.
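The either-or rule above can be expressed as a small client-side check. This is a sketch under assumptions: it supposes that caller-supplied function tools arrive in a `tools` list, as in OpenAI-style chat completion requests, which this page does not confirm. The helper name `tool_mode` is hypothetical.

```python
# Built-in tool flags from the parameter table; each defaults to true.
BUILT_IN_FLAGS = (
    "tool_web_search",
    "tool_code_interpreter",
    "tool_image_generation",
)


def tool_mode(payload):
    """Return which kind of tooling a request body would use."""
    if payload.get("tools"):
        # Caller-supplied function tools: built-in tools are unavailable.
        return "function_calling"
    # A missing flag counts as enabled because the defaults are true.
    if any(payload.get(flag, True) for flag in BUILT_IN_FLAGS):
        return "built_in"
    return "none"


# Hypothetical function tool definition, used only to trigger the first branch.
custom = {"tools": [{"type": "function", "function": {"name": "get_weather"}}]}
print(tool_mode(custom))  # prints function_calling
print(tool_mode({}))      # prints built_in
```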

Per-tool billing (usage.tool_usage)

When this model invokes built-in tools (web search, code interpreter, etc.) inside a single request, the response carries a normalized usage.tool_usage map alongside the token counts. The example below shows the shape — exact field names, units, and which tools appear can vary slightly per provider:

"usage": {
  "prompt_tokens": 123,
  "completion_tokens": 456,
  "cost_usd": 0.0042,
  "tool_usage": {"web_search": 3, "code_interpreter": 1}
}

The tool counts are already factored into cost_usd — they are surfaced for transparency so you can audit per-tool billing. The field is omitted when no tools were invoked.
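Because the field is optional, client code should not assume it is present. The sketch below reads the counts defensively. The helper names are hypothetical, and it assumes the values are integer call counts as in the example, which may not hold for every provider.

```python
def tool_call_counts(usage):
    """Return per-tool call counts, or an empty dict if none were invoked."""
    # tool_usage is omitted entirely when no tools ran, so fall back to {}.
    return dict(usage.get("tool_usage") or {})


def total_tool_calls(usage):
    """Sum the invocations across all built-in tools."""
    return sum(tool_call_counts(usage).values())


# The usage object from the example above.
sample_usage = {
    "prompt_tokens": 123,
    "completion_tokens": 456,
    "cost_usd": 0.0042,
    "tool_usage": {"web_search": 3, "code_interpreter": 1},
}

print(total_tool_calls(sample_usage))          # prints 4
print(tool_call_counts({"prompt_tokens": 5}))  # prints {}
```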

Machine-readable schema: GET https://api.empiriolabs.ai/v1/models/mistral-small-4.