Qwen3.7 Max | EmpirioLabs AI Docs

POST /v1/chat/completions

Qwen3.7 Max is a flagship text model for coding, productivity, long-running agents, deep thinking, and tool use, with a 1M-token context window.

At a glance

Field	Value
Model id	`qwen3-7-max`
Model release date	2026-05-21
Input modalities	Text
Output modalities	Text
Context window	1M tokens
Weight precision	-
Max output tokens	65,536
Region	Singapore
Features	reasoning, web_search, code_interpreter, function_calling, agentic_coding
Native inference	No
New	Yes
Structured output	JSON Mode
Supported endpoints	`POST /v1/chat/completions`, `POST /v1/responses`, `POST /v1/messages`, `POST /v1beta/models/qwen3-7-max:generateContent`

Pricing

Charge	Spec	Rate
Input	per 1M prompt tokens	$2.50
Output	per 1M generated tokens	$7.50
Web search	per call when invoked	$0.02
Web extractor	per call when invoked	$0.02
Code interpreter	per call when invoked	$0.02
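To make the arithmetic behind the pricing table concrete, here is a minimal sketch of a per-request cost estimate. The helper name `estimate_cost_usd` is hypothetical and not part of the API; only the rates come from the table above.

```python
from decimal import Decimal

# Rates copied from the pricing table above (USD).
INPUT_PER_MILLION = Decimal("2.50")
OUTPUT_PER_MILLION = Decimal("7.50")
TOOL_CALL = Decimal("0.02")  # web search, web extractor, or code interpreter


def estimate_cost_usd(prompt_tokens: int, output_tokens: int, tool_calls: int = 0) -> Decimal:
    """Estimate the cost of one request (hypothetical helper, not an API call).

    output_tokens should include thinking tokens, which are billed as output.
    """
    million = Decimal(1_000_000)
    token_cost = (
        prompt_tokens * INPUT_PER_MILLION + output_tokens * OUTPUT_PER_MILLION
    ) / million
    return token_cost + tool_calls * TOOL_CALL


# 12,000 prompt tokens, 3,000 output tokens, 2 built-in tool calls:
# 0.03 + 0.0225 + 0.04 = 0.0925 USD
print(estimate_cost_usd(12_000, 3_000, tool_calls=2))
```

`Decimal` is used instead of `float` so that small per-call charges add up without binary rounding noise.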

Example request

$ curl https://api.empiriolabs.ai/v1/chat/completions \
>   -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
>   -H 'Content-Type: application/json' \
>   -d '{"model": "qwen3-7-max", "messages": [{"role":"user","content":"Hello"}]}'
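The same request can be sketched in Python using only the standard library. The helper `build_chat_request` is a hypothetical name introduced for illustration; it builds the request without sending it, and the response body is printed raw because its schema is not described in this section.

```python
import json
import urllib.request

API_URL = "https://api.empiriolabs.ai/v1/chat/completions"


def build_chat_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the same POST as the curl example."""
    body = {
        "model": "qwen3-7-max",
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# To send it (requires a valid key in the environment):
#   import os
#   req = build_chat_request("Hello", os.environ["EMPIRIOLABS_API_KEY"])
#   with urllib.request.urlopen(req) as resp:
#       print(resp.read().decode("utf-8"))
```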

Parameters

Parameter	Type	Required	Default	Description
`temperature`	number	no	`0.7`	Sampling temperature. 0 is deterministic and 2 is maximum randomness. · Range: 0 – 2
`top_p`	number	no	`0.9`	Nucleus sampling probability mass. Lower values make outputs more focused. · Range: 0 – 1
`max_tokens`	number	no	`4096`	Maximum output tokens. · Range: 1 – 65536
`stop`	string	no	-	Up to 4 strings where the model will stop generating further tokens.
`enable_thinking`	boolean	no	`true`	Enable reasoning before answering.
`reasoning_effort`	enum	no	`"medium"`	Reasoning effort level. `none` disables thinking; `low`, `medium`, `high`, and `max` set bounded thinking budgets sized to the selected model. · Allowed: `none`, `low`, `medium`, `high`, `max`
`thinking_budget`	number	no	`32768`	Maximum tokens reserved for reasoning when thinking is enabled. · Range: 1 – 64000
`tool_web_search`	boolean	no	`false`	Search the web for real-time information. Adds $0.02 to the request cost for each invoked web search call.
`tool_web_extractor`	boolean	no	`false`	Extract and read content from URLs. Requires Web Search and Thinking. Adds $0.02 to the request cost for each invoked web extractor call.
`tool_code_interpreter`	boolean	no	`false`	Run Python code in a sandbox. Requires Thinking. Adds $0.02 to the request cost for each invoked code interpreter call.
`disable_formatting`	boolean	no	`false`	Return raw provider-style output without EmpirioLabs source formatting where supported.
`response_format`	enum	no	-	Return the output as a valid JSON object (JSON mode). Describe the fields you want in your prompt.
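The ranges and tool dependencies in the table can be checked on the client before a request is sent. The following is a minimal sketch under one stated assumption: that these parameters are passed as top-level JSON fields next to `model` and `messages`, which this page does not show explicitly. `build_payload` is a hypothetical helper.

```python
def build_payload(prompt: str, **params) -> dict:
    """Validate options against the parameter table, then build a request body.

    Assumption: parameters are top-level JSON fields (not confirmed by the docs).
    """
    # Numeric ranges as listed in the table.
    ranges = {
        "temperature": (0, 2),
        "top_p": (0, 1),
        "max_tokens": (1, 65536),
        "thinking_budget": (1, 64000),
    }
    for name, (low, high) in ranges.items():
        if name in params and not low <= params[name] <= high:
            raise ValueError(f"{name} must be between {low} and {high}")

    efforts = {"none", "low", "medium", "high", "max"}
    if params.get("reasoning_effort", "medium") not in efforts:
        raise ValueError(f"reasoning_effort must be one of {sorted(efforts)}")

    # Thinking defaults to on; reasoning_effort "none" disables it.
    thinking = (
        params.get("enable_thinking", True)
        and params.get("reasoning_effort") != "none"
    )
    if params.get("tool_web_extractor") and not (
        params.get("tool_web_search") and thinking
    ):
        raise ValueError("tool_web_extractor requires tool_web_search and thinking")
    if params.get("tool_code_interpreter") and not thinking:
        raise ValueError("tool_code_interpreter requires thinking")

    return {
        "model": "qwen3-7-max",
        "messages": [{"role": "user", "content": prompt}],
        **params,
    }
```

For example, `build_payload("Summarize this page", tool_web_search=True, tool_web_extractor=True)` passes the checks, while the same call with `enable_thinking=False` raises `ValueError`.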

Notes

Text input only. Web search, web extractor, and code interpreter are optional built-in tools exposed through tool_* parameters. Each built-in tool call adds $0.02 when invoked. Thinking tokens are billed as output tokens.

Per-tool billing (usage.tool_usage)

When this model invokes built-in tools inside a single request, the response carries a normalized usage.tool_usage map alongside the token counts. Tool counts are already factored into cost_usd and are surfaced for transparency.
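As a rough illustration of reading that map, the sketch below assumes `usage.tool_usage` maps a tool name to an integer call count and that `cost_usd` sits inside `usage`. Both are assumptions, as are the sample field names and values; the exact response schema is not given here.

```python
from decimal import Decimal

TOOL_CALL_RATE = Decimal("0.02")  # per-call rate from the pricing table

# Hypothetical response fragment; field layout is assumed, not documented here.
sample_response = {
    "usage": {
        "prompt_tokens": 1200,
        "completion_tokens": 800,
        "cost_usd": 0.049,
        "tool_usage": {"web_search": 2},
    }
}


def tool_charges(response: dict) -> dict:
    """Return the per-tool share of the bill, for display only.

    These amounts are already included in cost_usd, so they must not be
    added to it again.
    """
    tool_usage = response.get("usage", {}).get("tool_usage") or {}
    return {tool: count * TOOL_CALL_RATE for tool, count in tool_usage.items()}


print(tool_charges(sample_response))
```

A response with no built-in tool calls is handled by returning an empty dict, so callers do not need to special-case a missing map.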

Variants

`:variant1`

Field	Value
Model id	`qwen3-7-max:variant1`
Model release date	2026-05-21
Region	China
Context window	1M tokens
Weight precision	-
Max output tokens	65,536
Features	reasoning, web_search, code_interpreter, function_calling, agentic_coding
Native inference	No
Structured output	JSON Mode
Batch API	35% off list price
Supported endpoints	`POST /v1/chat/completions`, `POST /v1/responses`, `POST /v1/messages`, `POST /v1beta/models/qwen3-7-max:variant1:generateContent`

Pricing

Charge	Spec	Rate
Input	per 1M prompt tokens	$1.65 (was $2.50)
Output	per 1M generated tokens	$4.951 (was $7.50)
Web search	per call when invoked	$0.01
Web extractor	per call when invoked	$0.01
Code interpreter	per call when invoked	$0.01
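To compare the variant's rates with the base model's for a given workload, a small sketch like the following may help. The `RATES` table copies the listed prices from both pricing tables; `cost_usd` is a hypothetical helper, and the Batch API discount is not applied.

```python
from decimal import Decimal

# Listed rates in USD: per 1M tokens for input/output, per call for tools.
RATES = {
    "qwen3-7-max": {
        "input": Decimal("2.50"),
        "output": Decimal("7.50"),
        "tool": Decimal("0.02"),
    },
    "qwen3-7-max:variant1": {
        "input": Decimal("1.65"),
        "output": Decimal("4.951"),
        "tool": Decimal("0.01"),
    },
}


def cost_usd(model: str, prompt_tokens: int, output_tokens: int, tool_calls: int = 0) -> Decimal:
    """Estimate one request's cost for the given model id (hypothetical helper)."""
    rate = RATES[model]
    tokens = (
        prompt_tokens * rate["input"] + output_tokens * rate["output"]
    ) / Decimal(1_000_000)
    return tokens + tool_calls * rate["tool"]


# Same workload on both model ids: 200k prompt tokens, 50k output tokens, 4 tool calls.
for model in RATES:
    print(model, cost_usd(model, 200_000, 50_000, tool_calls=4))
```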

Parameters

Parameter	Type	Required	Default	Description
`temperature`	number	no	`0.7`	Sampling temperature. 0 is deterministic and 2 is maximum randomness. · Range: 0 – 2
`top_p`	number	no	`0.9`	Nucleus sampling probability mass. Lower values make outputs more focused. · Range: 0 – 1
`max_tokens`	number	no	`4096`	Maximum output tokens. · Range: 1 – 65536
`stop`	string	no	-	Up to 4 strings where the model will stop generating further tokens.
`enable_thinking`	boolean	no	`true`	Enable reasoning before answering.
`reasoning_effort`	enum	no	`"medium"`	Reasoning effort level. `none` disables thinking; `low`, `medium`, `high`, and `max` set bounded thinking budgets sized to the selected model. · Allowed: `none`, `low`, `medium`, `high`, `max`
`thinking_budget`	number	no	`32768`	Maximum tokens reserved for reasoning when thinking is enabled. · Range: 1 – 64000
`tool_web_search`	boolean	no	`false`	Search the web for real-time information. Adds $0.01 to the request cost for each invoked web search call.
`tool_web_extractor`	boolean	no	`false`	Extract and read content from URLs. Requires Web Search and Thinking. Adds $0.01 to the request cost for each invoked web extractor call.
`tool_code_interpreter`	boolean	no	`false`	Run Python code in a sandbox. Requires Thinking. Adds $0.01 to the request cost for each invoked code interpreter call.
`disable_formatting`	boolean	no	`false`	Return raw provider-style output without EmpirioLabs source formatting where supported.
`response_format`	enum	no	-	Return the output as a valid JSON object (JSON mode). Describe the fields you want in your prompt.

Notes

Text input only. Web search, web extractor, and code interpreter are optional built-in tools exposed through tool_* parameters. Each built-in tool call adds $0.01 when invoked. Thinking tokens are billed as output tokens.

Per-tool billing (usage.tool_usage)

When this model invokes built-in tools inside a single request, the response carries a normalized usage.tool_usage map alongside the token counts. Tool counts are already factored into cost_usd and are surfaced for transparency.

Machine-readable schema: GET https://api.empiriolabs.ai/v1/models/qwen3-7-max.