Qwen3-Max-Thinking | EmpirioLabs AI Docs

Provider: Alibaba Cloud
Category: Text Generation
Endpoint: POST /v1/chat/completions
Context window: 256K
Served from: Singapore

Reasoning model with adaptive tool use (search, memory, code interpreter) and test-time scaling for higher accuracy on complex tasks.

At a glance

Field	Value
Model id	`qwen3-max-thinking`
Input modalities	text
Output modalities	text
Context window	256K
Region	Singapore
Features	reasoning, code_interpreter, web_search, thinking
New	No
Native inference	No

Pricing

Charge	Spec	Rate
Input	≤32K, per 1M tokens	$1.08 (was $1.20)
Input	32K-128K, per 1M tokens	$2.16 (was $2.40)
Input	128K-256K, per 1M tokens	$2.70 (was $3.00)
Output	≤32K, per 1M tokens	$5.52 (was $6.00)
Output	32K-128K, per 1M tokens	$11.04 (was $12.00)
Output	128K-256K, per 1M tokens	$13.80 (was $15.00)
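
To make the tiered rates concrete, here is a minimal Python sketch that estimates the cost of a single request from the table above. It rests on two assumptions this page does not confirm: that "K" means 1,024 tokens, and that the tier is selected by the request's input token count and then applies to both the input and the output rate. The name `estimate_cost` is hypothetical and not part of any EmpirioLabs SDK.

```python
from decimal import Decimal

# ASSUMPTION: "32K" means 32 * 1024 tokens; the page does not define K.
K = 1024

# (tier upper bound in tokens, input USD per 1M, output USD per 1M),
# copied from the current rates in the pricing table.
TIERS = [
    (32 * K, Decimal("1.08"), Decimal("5.52")),
    (128 * K, Decimal("2.16"), Decimal("11.04")),
    (256 * K, Decimal("2.70"), Decimal("13.80")),
]


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request.

    ASSUMPTION: the tier is chosen by the input token count alone and
    sets both the input and the output rate for that request.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    for upper_bound, input_rate, output_rate in TIERS:
        if input_tokens <= upper_bound:
            total = input_tokens * input_rate + output_tokens * output_rate
            return total / Decimal(1_000_000)
    raise ValueError("input exceeds the 256K context window")
```

Under these assumptions, `estimate_cost(10_000, 2_000)` falls in the first tier and `estimate_cost(100_000, 1_000)` in the second. Billing may be computed differently in practice, so treat the result as a rough estimate only.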

Example request

curl https://api.empiriolabs.ai/v1/chat/completions \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{"model": "qwen3-max-thinking", "messages": [{"role":"user","content":"Hello"}]}'
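
The same request can be sketched in Python with only the standard library. The endpoint, headers, model id, and message come from the curl example. The function names `build_chat_request` and `send` are hypothetical helpers. Because this page does not describe the response body, `send` returns the decoded JSON as-is, which assumes the response is JSON.

```python
import json
import urllib.request

API_URL = "https://api.empiriolabs.ai/v1/chat/completions"


def build_chat_request(api_key: str, prompt: str, **params) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example.

    Extra keyword arguments are merged into the JSON body, e.g. max_tokens=256.
    """
    payload = {
        "model": "qwen3-max-thinking",
        "messages": [{"role": "user", "content": prompt}],
        **params,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 120.0):
    """Send the request and decode the body.

    ASSUMPTION: the response is JSON; its shape is not documented here.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A typical call would be `send(build_chat_request(os.environ["EMPIRIOLABS_API_KEY"], "Hello"))` after `import os`. Keeping request construction separate from sending makes the payload easy to inspect before any network traffic occurs.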

Parameters

Parameter	Type	Required	Default	Description
`temperature`	number	no	`0.7`	Sampling temperature · Range: 0 – 2
`top_p`	number	no	`1`	Nucleus sampling · Range: 0 – 1
`max_tokens`	number	no	`4096`	Max output tokens · Range: 1 – 65536
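`frequency_penalty`	number	no	`0`	Penalty on tokens based on how often they have already appeared · Range: -2 – 2
`presence_penalty`	number	no	`0`	Penalty on tokens that have already appeared at least once · Range: -2 – 2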
`stream`	boolean	no	`false`	Server-Sent Events streaming
`stop`	string	no	—	Comma-separated stop sequences
`disable_formatting`	boolean	no	`false`	Return raw upstream response with no formatting wrappers
`enable_thinking`	boolean	no	`true`	Reason step-by-step before answering
`thinking_budget`	number	no	`32768`	Tokens reserved for thinking · Range: 1 – 393216
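
The documented types and ranges can be checked on the client before a request is sent. The sketch below is a hypothetical helper, `validate_params`, not part of the API. It assumes the ranges in the table are inclusive and treats any parameter absent from the table as an error, although the server may accept more.

```python
# Inclusive (min, max) bounds copied from the parameter table.
# ASSUMPTION: the documented ranges include both endpoints.
NUMERIC_RANGES = {
    "temperature": (0, 2),
    "top_p": (0, 1),
    "max_tokens": (1, 65536),
    "frequency_penalty": (-2, 2),
    "presence_penalty": (-2, 2),
    "thinking_budget": (1, 393216),
}
BOOLEAN_PARAMS = {"stream", "disable_formatting", "enable_thinking"}


def validate_params(params: dict) -> list[str]:
    """Return a list of problems found; an empty list means all checks passed."""
    errors = []
    for name, value in params.items():
        if name in NUMERIC_RANGES:
            low, high = NUMERIC_RANGES[name]
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                errors.append(f"{name}: expected a number")
            elif not low <= value <= high:
                errors.append(f"{name}: {value} is outside {low} to {high}")
        elif name in BOOLEAN_PARAMS:
            if not isinstance(value, bool):
                errors.append(f"{name}: expected a boolean")
        elif name == "stop":
            # The table documents stop as one comma-separated string.
            if not isinstance(value, str):
                errors.append("stop: expected a comma-separated string")
        else:
            errors.append(f"{name}: not listed in the parameter table")
    return errors
```

For example, `validate_params({"temperature": 3})` reports that the value is out of range, while `validate_params({"stream": True, "stop": "END,STOP"})` returns an empty list.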

Live machine-readable schema is also available at GET https://api.empiriolabs.ai/v1/models/qwen3-max-thinking.
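
A minimal sketch of fetching that schema from Python follows. Whether the endpoint requires an API key and what the response contains are not stated on this page, so the key is optional and the body is returned undecoded beyond JSON parsing. Both function names are hypothetical.

```python
import json
import urllib.request

SCHEMA_URL = "https://api.empiriolabs.ai/v1/models/qwen3-max-thinking"


def build_schema_request(api_key=None) -> urllib.request.Request:
    """Build the GET request for the model schema.

    ASSUMPTION: authentication may or may not be required, so the
    Authorization header is added only when a key is supplied.
    """
    headers = {"Accept": "application/json"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(SCHEMA_URL, headers=headers, method="GET")


def fetch_schema(api_key=None, timeout: float = 30.0):
    """Fetch and decode the schema.

    ASSUMPTION: "machine-readable" means the response body is JSON.
    """
    with urllib.request.urlopen(build_schema_request(api_key), timeout=timeout) as response:
        return json.load(response)
```

Printing `json.dumps(fetch_schema(), indent=2)` is a quick way to compare the live defaults and ranges against the parameter table.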