Qwen3.5 122B-A10B

Alibaba Cloud · Text Generation
POST /v1/chat/completions

Qwen3.5 122B-A10B is a multimodal reasoning model with 256K context, efficient sparse MoE inference, and text, image, and video input.

At a glance

| Field | Value |
| --- | --- |
| Model id | qwen3-5-122b-a10b |
| Input modalities | Text, Image, Video |
| Output modalities | Text |
| Context window | 256K |
| Weight precision | - |
| Max output tokens | 64,000 |
| Region | China |
| Features | reasoning, vision, web_search, function_calling, structured_output, multimodal |
| Native inference | No |
| New | Yes |
| Supported endpoints | POST /v1/chat/completions, POST /v1/responses, POST /v1/messages |

Pricing

| Charge | Spec | Rate |
| --- | --- | --- |
| Input | per 1M prompt tokens | <=128K: $0.115 (was $0.40); 128K-256K: $0.287 (was $0.40) |
| Output | per 1M generated tokens | <=128K: $0.917 (was $3.20); 128K-256K: $2.294 (was $3.20) |
| Web search | per request when enabled | $0.01 |
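
The tiered rates above can be turned into a rough per-request estimate. The sketch below makes several assumptions the page does not confirm: that the tier is selected by the prompt token count of the request, that the same tier applies to both the input and the output rate, and that 128K and 256K mean 128,000 and 256,000 tokens. `estimate_cost` is a hypothetical helper, not part of the API. Thinking tokens are counted as output tokens, as stated in the Notes.

```python
from decimal import Decimal

# Discounted rates in USD per 1M tokens, copied from the pricing table.
# Assumption: the tier is chosen by the prompt length of the request,
# and 128K / 256K are read as 128,000 / 256,000 tokens.
TIERS = [
    # (max prompt tokens, input rate, output rate)
    (128_000, Decimal("0.115"), Decimal("0.917")),
    (256_000, Decimal("0.287"), Decimal("2.294")),
]
WEB_SEARCH_FEE = Decimal("0.01")  # per request when tool_web_search is enabled
MILLION = Decimal(1_000_000)


def estimate_cost(prompt_tokens, output_tokens, web_search=False):
    """Return an estimated request cost in USD (hypothetical helper).

    output_tokens should include thinking tokens, which are billed as output.
    """
    for limit, input_rate, output_rate in TIERS:
        if prompt_tokens <= limit:
            cost = (prompt_tokens * input_rate + output_tokens * output_rate) / MILLION
            if web_search:
                cost += WEB_SEARCH_FEE
            return cost
    raise ValueError("prompt exceeds the 256K context window")
```

Under these assumptions, `estimate_cost(100_000, 10_000)` comes to $0.02067, and enabling web search raises it to $0.03067.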

Example request

curl https://api.empiriolabs.ai/v1/chat/completions \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{"model": "qwen3-5-122b-a10b", "messages": [{"role":"user","content":"Hello"}]}'
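
The same request can be built from Python using only the standard library. This is a minimal sketch: `build_chat_request` and `send` are hypothetical helpers, not part of any official client. The shape of the response body is not documented on this page, so `send` returns the decoded JSON as-is rather than picking out fields.

```python
import json
import urllib.request

API_URL = "https://api.empiriolabs.ai/v1/chat/completions"


def build_chat_request(api_key, prompt, model="qwen3-5-122b-a10b"):
    """Build (but do not send) a POST request mirroring the curl example."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request):
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

To perform the call, read the key from the environment and pass the built request to `send`, for example `send(build_chat_request(os.environ["EMPIRIOLABS_API_KEY"], "Hello"))` after `import os`.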

Parameters

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| temperature | number | no | 0.7 | Sampling temperature. 0 is deterministic and 2 is maximum randomness. · Range: 0 – 2 |
| top_p | number | no | 0.9 | Nucleus sampling probability mass. Lower values make outputs more focused. · Range: 0 – 1 |
| max_tokens | number | no | 4096 | Maximum output tokens. · Range: 1 – 64000 |
| stop | string | no | - | Up to 4 strings where the model will stop generating further tokens. |
| enable_thinking | boolean | no | true | Enable reasoning before answering. |
| thinking_budget | number | no | 32768 | Maximum tokens reserved for reasoning when thinking is enabled. · Range: 1 – 80000 |
| reasoning_effort | enum | no | "medium" | Reasoning effort level. none disables thinking. low, medium, high, and max set bounded thinking budgets sized to the selected model. Sent as an OpenAI-style reasoning_effort field, translated into enable_thinking and thinking_budget for the model service. · Allowed: none, low, medium, high, max |
| vl_high_resolution_images | boolean | no | true | Use higher resolution processing for image inputs. |
| max_pixels | number | no | 2621440 | Maximum pixel count per image when high resolution processing is disabled. · Range: 4096 – 16777216 |
| video_fps | number | no | 2 | Frames per second to sample from video inputs. · Range: 0.1 – 10 |
| tool_web_search | boolean | no | false | Search the web for real-time information. Adds $0.01 to the request cost when enabled. |
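
The documented ranges can be checked on the client before a request is sent. The sketch below is one way to do that. `RANGES`, `REASONING_EFFORTS`, and `build_payload` are hypothetical names, the bounds are copied from the table above, and whether the service itself rejects out-of-range values is not stated on this page. The sketch also assumes these parameters are passed as top-level fields of the JSON request body.

```python
# Inclusive numeric ranges copied from the parameter table.
RANGES = {
    "temperature": (0, 2),
    "top_p": (0, 1),
    "max_tokens": (1, 64000),
    "thinking_budget": (1, 80000),
    "max_pixels": (4096, 16777216),
    "video_fps": (0.1, 10),
}
REASONING_EFFORTS = {"none", "low", "medium", "high", "max"}


def build_payload(messages, **params):
    """Return a request body after checking params against documented limits.

    Assumption: parameters are sent as top-level fields of the JSON body.
    """
    for name, value in params.items():
        if name in RANGES:
            low, high = RANGES[name]
            if not low <= value <= high:
                raise ValueError(f"{name}={value} is outside {low} to {high}")
        elif name == "reasoning_effort" and value not in REASONING_EFFORTS:
            raise ValueError(f"unknown reasoning_effort: {value!r}")
        elif name == "stop" and isinstance(value, list) and len(value) > 4:
            raise ValueError("stop accepts at most 4 strings")
    return {"model": "qwen3-5-122b-a10b", "messages": messages, **params}
```

For example, `build_payload([{"role": "user", "content": "Hello"}], reasoning_effort="low", tool_web_search=True)` returns a body with both fields set, while `max_tokens=70000` raises `ValueError` because it exceeds the documented maximum of 64000.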

Notes

Supports text, image, and video input. Web search is available through tool_web_search and adds $0.01 per request when enabled. Thinking tokens are billed as output tokens.


Machine-readable schema: GET https://api.empiriolabs.ai/v1/models/qwen3-5-122b-a10b