MiniMax M3 | EmpirioLabs AI Docs

MiniMax · Text Generation

POST /v1/chat/completions

MiniMax M3 is a multimodal reasoning model for coding, agents, and long-context analysis with text, image, and video input.

At a glance

Field	Value
Model id	`minimax-m3`
Model release date	2026-06-01
Input modalities	Text, Image, Video
Output modalities	Text
Context window	1M
Weight precision	-
Max output tokens	524,288
Region	Singapore
Features	reasoning, vision, video, web_search, function_calling, cache, long_context
Native inference	No
New	Yes
Structured output	JSON Mode
Supported endpoints	`POST /v1/chat/completions`, `POST /v1/responses`, `POST /v1/messages`, `POST /v1beta/models/minimax-m3:generateContent`
Alternate model ids	`minimax-m3-standard`, `minimax/m3`, `minimax/m3-standard`

Pricing

Charge	Spec	Rate
Input	per 1M prompt tokens	<=512K $0.225 (was $0.30); >512K $0.45 (was $0.60)
Output	per 1M generated tokens	<=512K $0.90 (was $1.20); >512K $1.80 (was $2.40)
Implicit cache read	per 1M cached input tokens	<=512K $0.045 (was $0.06); >512K $0.09 (was $0.12)
Web Search (Linkup)	per call when invoked	$0.013

Example request

$ curl https://api.empiriolabs.ai/v1/chat/completions \
>   -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
>   -H 'Content-Type: application/json' \
>   -d '{"model": "minimax-m3", "messages": [{"role":"user","content":"Hello"}]}'

Parameters

Parameter	Type	Required	Default	Description
`temperature`	number	no	`1`	Controls randomness. Lower values are more deterministic; higher values are more exploratory. · Range: 0 – 2
`top_p`	number	no	`0.95`	Controls nucleus sampling by limiting generation to the most likely token mass. · Range: 0 – 1
`max_completion_tokens`	integer	no	`4096`	Maximum generated tokens, including reasoning tokens when thinking is enabled. · Range: 1 – 524288
`stop`	array	no	-	Optional stop sequence or list of stop sequences.
`enable_thinking`	boolean	no	`true`	Enables adaptive model thinking before answering. Set to `false` to request a direct answer without a reasoning phase.
`web_search_linkup`	boolean	no	`false`	Optional web search powered by Linkup. When enabled, recent web sources are retrieved using your latest user message as the query and passed to the model as additional context. Adds $0.013 per call when invoked, on top of the model’s normal token cost.
`tools`	array	no	-	OpenAI-compatible tool definitions for function calling.
`tool_choice`	object	no	-	Optional OpenAI-compatible tool_choice value.
`response_format`	enum	no	-	Return the output as a valid JSON object (JSON mode). Describe the fields you want in your prompt.
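As a minimal sketch of how the parameters above might be combined, the following Python builds a request body and checks values against the documented ranges before anything is sent. The helper names `build_payload` and `build_request` are hypothetical, and the sketch assumes each parameter is a top-level field of the JSON body next to `model` and `messages`, as in the curl example; the table does not state this explicitly.

```python
import json
import urllib.request

API_URL = "https://api.empiriolabs.ai/v1/chat/completions"
MAX_OUTPUT_TOKENS = 524_288  # upper bound for max_completion_tokens in the table


def build_payload(messages, *, model="minimax-m3", temperature=1.0, top_p=0.95,
                  max_completion_tokens=4096, enable_thinking=True,
                  web_search_linkup=False, stop=None):
    """Check values against the documented ranges and return a request body.

    Assumption: every parameter is a top-level field of the JSON body.
    """
    if not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    if not 0 <= top_p <= 1:
        raise ValueError("top_p must be between 0 and 1")
    if not 1 <= max_completion_tokens <= MAX_OUTPUT_TOKENS:
        raise ValueError(
            f"max_completion_tokens must be between 1 and {MAX_OUTPUT_TOKENS}"
        )

    payload = {
        "model": model,
        "messages": messages,
        "temperature": temperature,
        "top_p": top_p,
        "max_completion_tokens": max_completion_tokens,
        "enable_thinking": enable_thinking,
        "web_search_linkup": web_search_linkup,
    }
    if stop is not None:  # only send stop sequences when the caller sets them
        payload["stop"] = stop
    return payload


def build_request(payload, api_key):
    """Build (without sending) a POST request that mirrors the curl example."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


payload = build_payload(
    [{"role": "user", "content": "Summarize today's AI news."}],
    temperature=0.2,
    max_completion_tokens=2048,
    enable_thinking=False,
    web_search_linkup=True,
)
print(json.dumps(payload, indent=2))
```

Passing the result of `build_request(payload, api_key)` to `urllib.request.urlopen` would issue the call. Because `max_completion_tokens` counts reasoning tokens when thinking is enabled, a caller that leaves `enable_thinking` on may want a larger budget than the visible answer alone would need.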

Notes

Accepts text, image, and video input and returns text. Thinking is adaptive by default and can be disabled by setting `enable_thinking` to false. The pricing tier is chosen by input token count, including cache hits: requests with <=512K input tokens use the discounted standard tier, while requests with >512K use the high-context tier. Supports up to 1M input tokens. Linkup web search is enabled through `web_search_linkup` and adds $0.013 per successful search.
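The tier rule can be illustrated with a rough cost estimator. This is a sketch under stated assumptions, not the billing implementation: it assumes "512K" means 524,288 tokens, that the tier chosen by total input (uncached plus cached) applies to every token charge in the request, and that cached tokens are billed at the cache-read rate instead of the input rate. The function name `estimate_cost` is hypothetical; the rates are copied from the standard pricing table.

```python
from decimal import Decimal

TIER_THRESHOLD = 512 * 1024  # assumption: "512K" means 524,288 tokens

# USD per 1M tokens, copied from the pricing table for `minimax-m3`.
RATES = {
    "standard": {
        "input": Decimal("0.225"),
        "output": Decimal("0.90"),
        "cache_read": Decimal("0.045"),
    },
    "high_context": {
        "input": Decimal("0.45"),
        "output": Decimal("1.80"),
        "cache_read": Decimal("0.09"),
    },
}
WEB_SEARCH_PER_CALL = Decimal("0.013")
MILLION = Decimal(1_000_000)


def estimate_cost(uncached_input, cached_input, output, web_searches=0):
    """Return (tier name, estimated USD cost) for one request."""
    # The tier depends on all input tokens, cache hits included.
    total_input = uncached_input + cached_input
    tier = "standard" if total_input <= TIER_THRESHOLD else "high_context"
    rates = RATES[tier]
    token_cost = (
        uncached_input * rates["input"]
        + cached_input * rates["cache_read"]
        + output * rates["output"]
    ) / MILLION
    return tier, token_cost + web_searches * WEB_SEARCH_PER_CALL


tier, cost = estimate_cost(100_000, 0, 2_000, web_searches=1)
print(tier, cost)
```

`Decimal` is used so that small per-token amounts do not pick up binary floating-point error. If billing turns out to be marginal (only tokens above the threshold charged at the higher rate), the tier selection line is the part that would change.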

Variants

`:priority`

Field	Value
Model id	`minimax-m3:priority`
Model release date	2026-06-01
Region	Singapore
Context window	1M
Weight precision	-
Max output tokens	524,288
Features	reasoning, vision, video, web_search, function_calling, cache, long_context
Native inference	No
Structured output	JSON Mode
Supported endpoints	`POST /v1/chat/completions`, `POST /v1/responses`, `POST /v1/messages`, `POST /v1beta/models/minimax-m3:priority:generateContent`

Pricing

Charge	Spec	Rate
Input	per 1M prompt tokens	<=512K $0.3375 (was $0.45); >512K $0.675 (was $0.90)
Output	per 1M generated tokens	<=512K $1.35 (was $1.80); >512K $2.70 (was $3.60)
Implicit cache read	per 1M cached input tokens	<=512K $0.0675 (was $0.09); >512K $0.135 (was $0.18)
Web Search (Linkup)	per call when invoked	$0.013

Parameters

Parameter	Type	Required	Default	Description
`temperature`	number	no	`1`	Controls randomness. Lower values are more deterministic; higher values are more exploratory. · Range: 0 – 2
`top_p`	number	no	`0.95`	Controls nucleus sampling by limiting generation to the most likely token mass. · Range: 0 – 1
`max_completion_tokens`	integer	no	`4096`	Maximum generated tokens, including reasoning tokens when thinking is enabled. · Range: 1 – 524288
`stop`	array	no	-	Optional stop sequence or list of stop sequences.
`enable_thinking`	boolean	no	`true`	Enables adaptive model thinking before answering. Set to `false` to request a direct answer without a reasoning phase.
`web_search_linkup`	boolean	no	`false`	Optional web search powered by Linkup. When enabled, recent web sources are retrieved using your latest user message as the query and passed to the model as additional context. Adds $0.013 per call when invoked, on top of the model’s normal token cost.
`tools`	array	no	-	OpenAI-compatible tool definitions for function calling.
`tool_choice`	object	no	-	Optional OpenAI-compatible tool_choice value.
`response_format`	enum	no	-	Return the output as a valid JSON object (JSON mode). Describe the fields you want in your prompt.

Notes

This is the priority service tier: requests receive priority admission for faster response times and improved reliability, and are priced above the standard MiniMax M3 route. Accepts text, image, and video input and returns text. Thinking is adaptive by default and can be disabled by setting `enable_thinking` to false. The pricing tier is chosen by input token count, including cache hits: requests with <=512K input tokens are billed at lower rates than requests with >512K. Supports up to 1M input tokens. Linkup web search is enabled through `web_search_linkup` and adds $0.013 per successful search.
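To get a feel for what the priority route costs relative to the standard one, the sketch below compares the <=512K rates from the two pricing tables. The helper names `token_cost` and `priority_premium` are hypothetical, and the comparison covers token charges only (the Linkup fee is the same $0.013 on both routes). The 1.5x ratio it checks is an observation derived from the listed rates, not a documented guarantee.

```python
from decimal import Decimal

# USD per 1M tokens for the <=512K tier, copied from the two pricing tables.
STANDARD = {
    "input": Decimal("0.225"),
    "output": Decimal("0.90"),
    "cache_read": Decimal("0.045"),
}
PRIORITY = {
    "input": Decimal("0.3375"),
    "output": Decimal("1.35"),
    "cache_read": Decimal("0.0675"),
}
MILLION = Decimal(1_000_000)


def token_cost(rates, uncached_input, cached_input, output):
    """Token charges in USD for one request at the given rates."""
    return (
        uncached_input * rates["input"]
        + cached_input * rates["cache_read"]
        + output * rates["output"]
    ) / MILLION


def priority_premium(uncached_input, cached_input, output):
    """Extra USD paid for `minimax-m3:priority` over `minimax-m3`."""
    return (
        token_cost(PRIORITY, uncached_input, cached_input, output)
        - token_cost(STANDARD, uncached_input, cached_input, output)
    )


# Each listed priority rate is 1.5x the matching standard rate.
ratios = {name: PRIORITY[name] / STANDARD[name] for name in STANDARD}
print(ratios)
print(priority_premium(100_000, 0, 2_000))
```

Switching routes should only require changing the model id from `minimax-m3` to `minimax-m3:priority`, since both variants list the same parameters.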

Machine-readable schema: `GET https://api.empiriolabs.ai/v1/models/minimax-m3`