Qwen3.5 122B-A10B | EmpirioLabs AI Docs

POST /v1/chat/completions

Qwen3.5 122B-A10B is a multimodal reasoning model with 256K context, efficient sparse MoE inference, and text, image, and video input.

At a glance

Field	Value
Model id	`qwen3-5-122b-a10b`
Model release date	2026-02-24
Input modalities	Text, Image, Video
Output modalities	Text
Context window	256K
Weight precision	-
Max output tokens	64,000
Region	China
Features	reasoning, vision, web_search, function_calling, multimodal
Native inference	No
New	Yes
Structured output	JSON Mode
Supported endpoints	`POST /v1/chat/completions`, `POST /v1/responses`, `POST /v1/messages`, `POST /v1beta/models/qwen3-5-122b-a10b:generateContent`

Pricing

Charge	Spec	Rate
Input	per 1M prompt tokens	<=128K $0.115 (was $0.40); 128K-256K $0.287 (was $0.40)
Output	per 1M generated tokens	<=128K $0.917 (was $3.20); 128K-256K $2.294 (was $3.20)
Web search	per request when enabled	$0.01
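
To make the tiered rates concrete, here is a rough cost estimator. It is a sketch built on assumptions the table does not confirm: that the tier is chosen by the prompt token count, that the chosen tier's rates apply to every token in the request, and that "128K" means 128,000 tokens. The function and constant names are illustrative only, not part of the API.

```python
# Rates in USD per 1M tokens, copied from the pricing table.
INPUT_RATES = {"le_128k": 0.115, "128k_256k": 0.287}
OUTPUT_RATES = {"le_128k": 0.917, "128k_256k": 2.294}
WEB_SEARCH_FEE = 0.01      # USD per request when web search is enabled
TIER_BOUNDARY = 128_000    # assumption: "128K" means 128,000 prompt tokens


def estimate_cost(prompt_tokens, output_tokens, web_search=False):
    """Estimate the USD cost of one request.

    Assumption: the pricing tier is selected by prompt_tokens and applies
    to both the input and the output rate. output_tokens should include
    thinking tokens, which are billed as output tokens.
    """
    if prompt_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    tier = "le_128k" if prompt_tokens <= TIER_BOUNDARY else "128k_256k"
    cost = prompt_tokens / 1_000_000 * INPUT_RATES[tier]
    cost += output_tokens / 1_000_000 * OUTPUT_RATES[tier]
    if web_search:
        cost += WEB_SEARCH_FEE
    return cost
```

Under these assumptions, a 100,000-token prompt with 2,000 output tokens comes to about $0.0133, or about $0.0233 with web search enabled.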

Example request

$ curl https://api.empiriolabs.ai/v1/chat/completions \
>   -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
>   -H 'Content-Type: application/json' \
>   -d '{"model": "qwen3-5-122b-a10b", "messages": [{"role":"user","content":"Hello"}]}'
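
The same request can be sketched in Python using only the standard library. The helper names `build_chat_request` and `send` are hypothetical, and the assumption that the response body is JSON is not stated on this page.

```python
import json
import urllib.request

API_URL = "https://api.empiriolabs.ai/v1/chat/completions"
MODEL_ID = "qwen3-5-122b-a10b"


def build_chat_request(prompt, api_key):
    """Build (but do not send) the POST request shown in the curl example."""
    body = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and decode the body, assuming it is JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (performs a real network call, so it is left commented out):
# import os
# reply = send(build_chat_request("Hello", os.environ["EMPIRIOLABS_API_KEY"]))
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any tokens are billed.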

Parameters

Parameter	Type	Required	Default	Description
`temperature`	number	no	`0.7`	Sampling temperature. 0 is deterministic and 2 is maximum randomness. · Range: 0 – 2
`top_p`	number	no	`0.9`	Nucleus sampling probability mass. Lower values make outputs more focused. · Range: 0 – 1
`max_tokens`	number	no	`4096`	Maximum output tokens. · Range: 1 – 64000
`stop`	string	no	-	Up to 4 strings where the model will stop generating further tokens.
`enable_thinking`	boolean	no	`true`	Enable reasoning before answering.
`thinking_budget`	number	no	`32768`	Maximum tokens reserved for reasoning when thinking is enabled. · Range: 1 – 80000
`reasoning_effort`	enum	no	`"medium"`	Reasoning effort level, sent as an OpenAI-style `reasoning_effort` field and translated into `enable_thinking` and `thinking_budget` for the model service. `none` disables thinking; `low`, `medium`, `high`, and `max` set bounded thinking budgets sized to the selected model. · Allowed: `none`, `low`, `medium`, `high`, `max`
`vl_high_resolution_images`	boolean	no	`true`	Use higher resolution processing for image inputs.
`max_pixels`	number	no	`2621440`	Maximum pixel count per image when high resolution processing is disabled. · Range: 4096 – 16777216
`video_fps`	number	no	`2`	Frames per second to sample from video inputs. · Range: 0.1 – 10
`tool_web_search`	boolean	no	`false`	Search the web for real-time information. Adds $0.01 to the request cost when enabled.
`response_format`	enum	no	-	Return the output as a valid JSON object (JSON mode). Describe the fields you want in your prompt.
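
The ranges in the table can be checked on the client before a request is sent. The sketch below assumes these parameters are top-level fields of the JSON request body, which this page implies but does not state. `build_payload` is a hypothetical helper. It accepts `stop` as either one string or a list of up to 4 strings, because the table lists the type as string but describes several. `response_format` is left out because its value shape is not given here.

```python
# Inclusive numeric ranges copied from the parameters table.
NUMERIC_RANGES = {
    "temperature": (0, 2),
    "top_p": (0, 1),
    "max_tokens": (1, 64000),
    "thinking_budget": (1, 80000),
    "max_pixels": (4096, 16777216),
    "video_fps": (0.1, 10),
}
BOOLEAN_PARAMS = {"enable_thinking", "vl_high_resolution_images", "tool_web_search"}
REASONING_EFFORTS = {"none", "low", "medium", "high", "max"}


def build_payload(messages, **params):
    """Return a request body, raising ValueError for out-of-range parameters.

    Assumption: each parameter is sent as a top-level JSON field.
    """
    payload = {"model": "qwen3-5-122b-a10b", "messages": messages}
    for name, value in params.items():
        if name in NUMERIC_RANGES:
            low, high = NUMERIC_RANGES[name]
            is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
            if not is_number or not low <= value <= high:
                raise ValueError(f"{name} must be a number in [{low}, {high}]")
        elif name in BOOLEAN_PARAMS:
            if not isinstance(value, bool):
                raise ValueError(f"{name} must be a boolean")
        elif name == "reasoning_effort":
            if value not in REASONING_EFFORTS:
                raise ValueError(f"{name} must be one of {sorted(REASONING_EFFORTS)}")
        elif name == "stop":
            stops = [value] if isinstance(value, str) else list(value)
            if len(stops) > 4 or not all(isinstance(s, str) for s in stops):
                raise ValueError("stop accepts at most 4 strings")
        else:
            raise ValueError(f"unknown parameter: {name}")
        payload[name] = value
    return payload
```

For example, `build_payload(messages, temperature=0.2, max_tokens=512)` returns a body with both fields set, while `temperature=3` raises `ValueError` before anything is sent.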

Notes

Supports text, image, and video input. Web search is available through `tool_web_search` and adds $0.01 per request when enabled. Thinking tokens are billed as output tokens.

Machine-readable schema: GET https://api.empiriolabs.ai/v1/models/qwen3-5-122b-a10b.
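
A request for the schema can be built in the same way. This is a minimal sketch: `build_schema_request` is a hypothetical helper, and this page does not say whether the endpoint requires an API key, so the `Authorization` header is optional here.

```python
import urllib.parse
import urllib.request

MODELS_BASE_URL = "https://api.empiriolabs.ai/v1/models/"


def build_schema_request(model_id, api_key=None):
    """Build (but do not send) a GET request for a model's schema."""
    url = MODELS_BASE_URL + urllib.parse.quote(model_id, safe="")
    headers = {"Accept": "application/json"}
    if api_key:
        # Assumption: the same bearer token as the chat endpoint is accepted.
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(url, headers=headers, method="GET")
```

Passing the result to `urllib.request.urlopen` performs the actual fetch.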