Qwen3.5-Flash

Provider: Alibaba Cloud
Category: Text Generation
Endpoint: POST /v1/chat/completions
Context window: 1M
Served from: Singapore

Vision-language model that combines hybrid linear attention with a sparse mixture-of-experts (MoE) architecture. It offers a 1M-token context window and fast inference over text, image, and video inputs.

At a glance

| Field | Value |
| --- | --- |
| Model id | qwen3-5-flash |
| Input modalities | text, image, video |
| Output modalities | text |
| Context window | 1M |
| Region | Singapore |
| Features | vision, web_search, code_interpreter, function_calling |
| New | No |
| Native inference | No |

Pricing

| Charge | Spec | Rate |
| --- | --- | --- |
| Input | per 1M tokens | $0.090 (was $0.10) |
| Output | per 1M tokens | $0.368 (was $0.40) |
| Web Search | per call | $0.015 |
| Image Search | per call | $0.012 |
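
The rates above can be turned into a rough per-request estimate. The sketch below is illustrative only: it reads the input and output rates as $0.090 and $0.368 per 1M tokens, assumes the figures are in USD, and assumes charges scale linearly with no minimums or rounding. The function name `estimate_cost` is hypothetical and not part of the API.

```python
from decimal import Decimal

# Rates copied from the pricing table; USD is an assumption.
INPUT_PER_MILLION = Decimal("0.090")
OUTPUT_PER_MILLION = Decimal("0.368")
WEB_SEARCH_PER_CALL = Decimal("0.015")
IMAGE_SEARCH_PER_CALL = Decimal("0.012")
MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens, output_tokens, web_searches=0, image_searches=0):
    """Return an estimated charge, assuming simple linear billing."""
    token_cost = (
        Decimal(input_tokens) * INPUT_PER_MILLION
        + Decimal(output_tokens) * OUTPUT_PER_MILLION
    ) / MILLION
    call_cost = (
        web_searches * WEB_SEARCH_PER_CALL
        + image_searches * IMAGE_SEARCH_PER_CALL
    )
    return token_cost + call_cost
```

For example, `estimate_cost(200_000, 5_000, web_searches=2)` combines 200k input tokens, 5k output tokens, and two web searches into a single figure. Actual invoices may differ if the provider rounds or bills in a different way.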

Example request

curl https://api.empiriolabs.ai/v1/chat/completions \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{"model": "qwen3-5-flash", "messages": [{"role":"user","content":"Hello"}]}'
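
The same request can be sketched in Python using only the standard library. This mirrors the curl call above: the URL, model id, headers, and body come from this page, while the helper names `build_request` and `send` are hypothetical. The response format is not described here, so `send` simply returns the parsed JSON without assuming any particular fields.

```python
import json
import os
import urllib.request

API_URL = "https://api.empiriolabs.ai/v1/chat/completions"


def build_request(prompt, api_key, model="qwen3-5-flash"):
    """Build (but do not send) a chat completion request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and return the decoded JSON body as-is."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A typical call would be `send(build_request("Hello", os.environ["EMPIRIOLABS_API_KEY"]))`. Keeping request construction separate from sending makes the payload easy to inspect before any network traffic occurs.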

Parameters

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| temperature | number | no | 0.7 | Sampling temperature · Range: 0 – 2 |
| top_p | number | no | 1 | Nucleus sampling · Range: 0 – 1 |
| max_tokens | number | no | 4096 | Max output tokens · Range: 1 – 65536 |
| frequency_penalty | number | no | 0 | Range: -2 – 2 |
| presence_penalty | number | no | 0 | Range: -2 – 2 |
| stream | boolean | no | false | Server-Sent Events streaming |
| stop | string | no | | Comma-separated stop sequences |
| disable_formatting | boolean | no | false | Return raw upstream response with no formatting wrappers |
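
A client can check optional parameters against the documented ranges before sending a request. The following is a minimal sketch, not the server's validation logic: it assumes the ranges are inclusive, and the function name `check_params` is hypothetical.

```python
# Numeric ranges copied from the parameter table; inclusivity is assumed.
RANGES = {
    "temperature": (0, 2),
    "top_p": (0, 1),
    "max_tokens": (1, 65536),
    "frequency_penalty": (-2, 2),
    "presence_penalty": (-2, 2),
}
BOOLEAN_PARAMS = ("stream", "disable_formatting")


def check_params(params):
    """Return a list of problems found in params (empty if none)."""
    errors = []
    for name, value in params.items():
        if name in RANGES:
            low, high = RANGES[name]
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                errors.append(f"{name}: expected a number")
            elif not low <= value <= high:
                errors.append(f"{name}: {value} is outside {low} to {high}")
        elif name in BOOLEAN_PARAMS:
            if not isinstance(value, bool):
                errors.append(f"{name}: expected a boolean")
        elif name == "stop":
            # The table documents stop as one comma-separated string.
            if not isinstance(value, str):
                errors.append("stop: expected a comma-separated string")
        else:
            errors.append(f"{name}: not a documented parameter")
    return errors
```

For instance, `check_params({"temperature": 3})` reports one problem because 3 exceeds the documented maximum of 2. Parameters that pass this check can be merged into the request body alongside `model` and `messages`.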

A live, machine-readable schema is also available via GET https://api.empiriolabs.ai/v1/models/qwen3-5-flash.