Qwen3.7 Plus | EmpirioLabs AI Docs

POST /v1/chat/completions

Cost-effective Qwen3.7 vision-language model for text, image, video, coding, tool use, GUI understanding, and 1M-context workflows.

At a glance

Field	Value
Model id	`qwen3-7-plus`
Model release date	2026-06-01
Input modalities	Text, Image, Video
Output modalities	Text
Context window	1M
Weight precision	-
Max output tokens	65,536
Region	Singapore
Features	reasoning, vision, video, web_search, code_interpreter, function_calling, prefix_continuation, fine_tuning, agentic_coding
Native inference	No
New	Yes
Structured output	JSON Mode
Supported endpoints	`POST /v1/chat/completions`, `POST /v1/responses`, `POST /v1/messages`, `POST /v1beta/models/qwen3-7-plus:generateContent`
Alternate model ids	`qwen3.7-plus`

Pricing

Charge	Spec	Rate
Input	per 1M prompt tokens	<=256K $0.40; 256K-1M $1.20
Output	per 1M generated tokens	<=256K $1.60; 256K-1M $4.80
Web search	per invoked call	$0.03
Image Search	per call	$0.03
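
The tiered token rates above can be turned into a quick estimate. The following is a minimal sketch, not an official billing formula: it assumes that "256K" means 256,000 tokens and that the prompt length selects the tier for the whole request, neither of which this page states. The names `token_cost`, `TIERS`, and `TIER_BOUNDARY` are hypothetical.

```python
from decimal import Decimal

# Rates in USD per 1M tokens, copied from the pricing table above.
TIERS = {
    "standard": {"input": Decimal("0.40"), "output": Decimal("1.60")},  # <=256K
    "long": {"input": Decimal("1.20"), "output": Decimal("4.80")},      # 256K-1M
}
TIER_BOUNDARY = 256_000  # assumption: "256K" means 256,000 tokens


def token_cost(prompt_tokens: int, output_tokens: int) -> Decimal:
    """Estimate token cost in USD, assuming the prompt size picks the tier."""
    tier = "standard" if prompt_tokens <= TIER_BOUNDARY else "long"
    rates = TIERS[tier]
    total = prompt_tokens * rates["input"] + output_tokens * rates["output"]
    return total / Decimal(1_000_000)
```

Under these assumptions, `token_cost(100_000, 2_000)` comes to $0.0432 and `token_cost(300_000, 2_000)` to $0.3696. Tool charges are not included here.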

Example request

$ curl https://api.empiriolabs.ai/v1/chat/completions \
>   -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
>   -H 'Content-Type: application/json' \
>   -d '{"model": "qwen3-7-plus", "messages": [{"role":"user","content":"Hello"}]}'
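
The same request can be built from Python using only the standard library. This is a sketch that mirrors the curl call above. The helper names `build_chat_request` and `send` are hypothetical, and the response is assumed to be a JSON document.

```python
import json
import urllib.request

API_URL = "https://api.empiriolabs.ai/v1/chat/completions"


def build_chat_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    body = {
        "model": "qwen3-7-plus",
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and parse the body, assuming it is JSON."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

To make a live call, read the key from the `EMPIRIOLABS_API_KEY` environment variable and pass the result of `build_chat_request("Hello", key)` to `send`.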

Parameters

Parameter	Type	Required	Default	Description
`temperature`	number	no	`0.7`	Sampling temperature. 0 is deterministic and 2 is maximum randomness. · Range: 0 – 2
`top_p`	number	no	`0.9`	Nucleus sampling probability mass. Lower values make outputs more focused. · Range: 0 – 1
`max_tokens`	number	no	`4096`	Maximum output tokens. · Range: 1 – 65536
`stop`	string	no	-	Up to 4 strings where the model will stop generating further tokens.
`enable_thinking`	boolean	no	true	Enable reasoning before answering.
`reasoning_effort`	enum	no	`"medium"`	Reasoning effort level. none disables thinking. low, medium, high, and max set bounded thinking budgets sized to the selected model. · Allowed: `none`, `low`, `medium`, `high`, `max`
`thinking_budget`	number	no	`32768`	Maximum tokens reserved for reasoning when thinking is enabled. · Range: 1 – 256000
`vl_high_resolution_images`	boolean	no	true	Use higher resolution processing for image inputs.
`max_pixels`	number	no	`2621440`	Maximum pixel count per image when high resolution processing is disabled. · Range: 4096 – 16777216
`video_fps`	number	no	`2`	Frames per second to sample from video inputs. · Range: 0.1 – 10
`treat_images_as_video`	boolean	no	false	Treat a sequence of images as video frames.
`tool_web_search`	boolean	no	true	Search the web for real-time information. Adds $0.03 to the request cost for each invoked call.
`tool_web_extractor`	boolean	no	true	Extract and read content from URLs. Requires Web Search and Thinking.
`tool_code_interpreter`	boolean	no	true	Run Python code in a sandbox. Requires Thinking.
`tool_web_search_image`	boolean	no	true	Search the web for images from text descriptions. Adds $0.03 to the request cost for each invoked call.
`tool_image_search`	boolean	no	true	Find similar images from an uploaded image. Adds $0.03 to the request cost for each invoked call.
`response_format`	enum	no	-	Return the output as a valid JSON object (JSON mode). Describe the fields you want in your prompt.
`disable_formatting`	boolean	no	false	Return raw provider-style output without EmpirioLabs source formatting where supported.
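
The ranges and dependencies in the table can be checked on the client before a request is sent. The sketch below is illustrative only: `check_params` is a hypothetical helper, and it assumes these parameters are passed as top-level fields of the request body, which this page does not state. It flags only explicit conflicts and makes no claim about how the API itself responds to them.

```python
# Numeric ranges copied from the parameter table above.
RANGES = {
    "temperature": (0, 2),
    "top_p": (0, 1),
    "max_tokens": (1, 65536),
    "thinking_budget": (1, 256000),
    "max_pixels": (4096, 16777216),
    "video_fps": (0.1, 10),
}
REASONING_EFFORTS = {"none", "low", "medium", "high", "max"}


def check_params(params: dict) -> list[str]:
    """Return human-readable problems; an empty list means nothing was flagged."""
    problems = []
    for name, (low, high) in RANGES.items():
        if name in params and not low <= params[name] <= high:
            problems.append(f"{name}={params[name]} is outside {low} to {high}")
    effort = params.get("reasoning_effort")
    if effort is not None and effort not in REASONING_EFFORTS:
        problems.append(f"reasoning_effort={effort!r} is not an allowed value")
    # The table says the code interpreter and web extractor require thinking.
    thinking_off = params.get("enable_thinking") is False or effort == "none"
    for tool in ("tool_code_interpreter", "tool_web_extractor"):
        if thinking_off and params.get(tool) is True:
            problems.append(f"{tool} requires thinking to be enabled")
    # The web extractor also requires web search.
    if params.get("tool_web_extractor") is True and params.get("tool_web_search") is False:
        problems.append("tool_web_extractor requires tool_web_search")
    return problems
```

For example, `check_params({"temperature": 3})` reports one out-of-range value, while a dictionary that uses only the documented defaults returns an empty list.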

Notes

Input and output rates triple above 256K tokens ($0.40 to $1.20 and $1.60 to $4.80 per 1M tokens). Web Search, Text-to-Image Search, and Image-to-Image Search are billed only when invoked.

Text-to-Image Search and Image-to-Image Search use the Image Search pricing row. Thinking tokens are billed as output tokens.
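
As a worked example of these notes, the sketch below adds thinking tokens to the billable output count and prices tool calls at the flat Singapore rate. It assumes each invoked call is charged once at $0.03, as the parameter descriptions suggest. The function names are hypothetical.

```python
from decimal import Decimal

TOOL_FEE = Decimal("0.03")  # Web search and Image Search rows, Singapore pricing


def billable_output_tokens(answer_tokens: int, thinking_tokens: int) -> int:
    """Thinking tokens are billed as output tokens, so the two are summed."""
    return answer_tokens + thinking_tokens


def tool_fees(web_search_calls: int, image_search_calls: int) -> Decimal:
    """Text-to-image and image-to-image searches both use the Image Search rate."""
    return TOOL_FEE * (web_search_calls + image_search_calls)
```

A request that produces 500 answer tokens after 1,500 thinking tokens is billed for 2,000 output tokens, and two web searches plus one image search add $0.09.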

Per-tool billing (usage.tool_usage)

When this model invokes built-in tools inside a single request, the response carries a normalized usage.tool_usage map alongside the token counts. Tool charges are already included in cost_usd; the per-tool counts are surfaced for transparency.
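
Reading the map from a parsed response might look like the sketch below. The exact response shape is not documented on this page, so this assumes that `cost_usd` sits under `usage` next to `tool_usage` and that the map goes from tool name to call count. The key `web_search` and every value in `sample` are hypothetical.

```python
def tool_usage_report(response: dict) -> dict:
    """Extract the cost and per-tool counts, tolerating missing fields."""
    usage = response.get("usage") or {}
    return {
        "cost_usd": usage.get("cost_usd"),
        "tool_usage": dict(usage.get("tool_usage") or {}),
    }


# Hypothetical response fragment, for illustration only.
sample = {"usage": {"cost_usd": 0.06096, "tool_usage": {"web_search": 2}}}
report = tool_usage_report(sample)
```

Because the tool charges are already part of `cost_usd`, the counts should be used for reporting rather than added to the total again.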

Variants

`:variant1`

Field	Value
Model id	`qwen3-7-plus:variant1`
Model release date	2026-06-01
Region	China
Context window	1M
Weight precision	-
Max output tokens	65,536
Features	qwen3.7, reasoning, vision, video, web_search, code_interpreter, function_calling, prefix_continuation, cache, fine_tuning, agentic_coding
Native inference	No
Structured output	JSON Mode
Supported endpoints	`POST /v1/chat/completions`, `POST /v1/responses`, `POST /v1/messages`, `POST /v1beta/models/qwen3-7-plus:variant1:generateContent`

Pricing

Charge	Spec	Rate
Input	per 1M prompt tokens	<=256K $0.276 (was $0.40); 256K-1M $0.826 (was $1.20)
Output	per 1M generated tokens	<=256K $1.101 (was $1.60); 256K-1M $3.301 (was $4.80)
Implicit cache read	per 1M cached prompt tokens	<=256K $0.056 (was $0.08); 256K-1M $0.166 (was $0.24)
Web search	per invoked call	$0.01
Image Search	per call	$0.01
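
The cached-token row changes how the prompt portion of a request is priced. The sketch below uses the discounted rates from the table and assumes that cached tokens are a subset of the prompt tokens, that 256K means 256,000 tokens, and that the prompt length selects the tier. None of this is confirmed on this page, and `variant_token_cost` is a hypothetical name.

```python
from decimal import Decimal

# Discounted USD rates per 1M tokens for qwen3-7-plus:variant1, from the table above.
RATES = {
    "standard": {"input": Decimal("0.276"), "cached": Decimal("0.056"), "output": Decimal("1.101")},
    "long": {"input": Decimal("0.826"), "cached": Decimal("0.166"), "output": Decimal("3.301")},
}
TIER_BOUNDARY = 256_000  # assumption: "256K" means 256,000 tokens


def variant_token_cost(prompt_tokens: int, cached_tokens: int, output_tokens: int) -> Decimal:
    """Estimate token cost in USD with implicit cache reads priced separately."""
    if not 0 <= cached_tokens <= prompt_tokens:
        raise ValueError("cached_tokens must be between 0 and prompt_tokens")
    rates = RATES["standard" if prompt_tokens <= TIER_BOUNDARY else "long"]
    uncached_tokens = prompt_tokens - cached_tokens
    total = (
        uncached_tokens * rates["input"]
        + cached_tokens * rates["cached"]
        + output_tokens * rates["output"]
    )
    return total / Decimal(1_000_000)
```

With a 100,000-token prompt of which 60,000 tokens are cache reads, plus 1,000 output tokens, the estimate is $0.015501, compared with $0.028701 when nothing is cached.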

Parameters

Parameter	Type	Required	Default	Description
`temperature`	number	no	`0.7`	Sampling temperature. 0 is deterministic and 2 is maximum randomness. · Range: 0 – 2
`top_p`	number	no	`0.9`	Nucleus sampling probability mass. Lower values make outputs more focused. · Range: 0 – 1
`max_tokens`	number	no	`4096`	Maximum output tokens. · Range: 1 – 65536
`stop`	string	no	-	Up to 4 strings where the model will stop generating further tokens.
`enable_thinking`	boolean	no	true	Enable reasoning before answering.
`reasoning_effort`	enum	no	`"medium"`	Reasoning effort level. none disables thinking. low, medium, high, and max set bounded thinking budgets sized to the selected model. · Allowed: `none`, `low`, `medium`, `high`, `max`
`thinking_budget`	number	no	`32768`	Maximum tokens reserved for reasoning when thinking is enabled. · Range: 1 – 256000
`vl_high_resolution_images`	boolean	no	true	Use higher resolution processing for image inputs.
`max_pixels`	number	no	`2621440`	Maximum pixel count per image when high resolution processing is disabled. · Range: 4096 – 16777216
`video_fps`	number	no	`2`	Frames per second to sample from video inputs. · Range: 0.1 – 10
`treat_images_as_video`	boolean	no	false	Treat a sequence of images as video frames.
`tool_web_search`	boolean	no	true	Search the web for real-time information. Adds $0.01 to the request cost for each invoked call.
`tool_web_extractor`	boolean	no	true	Extract and read content from URLs. Requires Web Search and Thinking.
`tool_code_interpreter`	boolean	no	true	Run Python code in a sandbox. Requires Thinking.
`tool_web_search_image`	boolean	no	true	Search the web for images from text descriptions. Adds $0.01 to the request cost for each invoked call.
`tool_image_search`	boolean	no	true	Find similar images from an uploaded image. Adds $0.01 to the request cost for each invoked call.
`response_format`	enum	no	-	Return the output as a valid JSON object (JSON mode). Describe the fields you want in your prompt.
`disable_formatting`	boolean	no	false	Return raw provider-style output without EmpirioLabs source formatting where supported.

Notes

China token pricing is discounted by roughly 30% versus Singapore. Prompt tokens served from the implicit cache are billed at the Implicit cache read rate. Web Search, Text-to-Image Search, and Image-to-Image Search are billed only when invoked.

Text-to-Image Search and Image-to-Image Search use the Image Search pricing row. China paid tool calls are $0.01 each. Thinking tokens are billed as output tokens.

Per-tool billing (usage.tool_usage)

Machine-readable schema: GET https://api.empiriolabs.ai/v1/models/qwen3-7-plus.