Qwen3.6 Max Preview

Alibaba Cloud · Text Generation
POST /v1/chat/completions

Largest preview variant in the 3.6 series (text-only): improved coding agent execution, stronger front-end skills, and broader long-tail knowledge.

This model is deprecated and will be retired on 2026-09-08. After that date, requests to this model will fail. Migrate to a successor model before then.

At a glance

| Field | Value |
| --- | --- |
| Model id | qwen3-6-max-preview |
| Input modalities | Text |
| Output modalities | Text |
| Context window | 256K |
| Weight precision | - |
| Max output tokens | 65,536 |
| Region | Singapore |
| Features | reasoning, agentic_coding, web_search |
| Native inference | No |
| New | Yes |
| Supported endpoints | POST /v1/chat/completions, POST /v1/responses, POST /v1/messages |
| Deprecation date | 2026-09-08 |

Pricing

| Charge | Spec | Rate |
| --- | --- | --- |
| Input | per 1M prompt tokens | <=128K $1.31; 128K-256K $1.97 |
| Output | per 1M generated tokens | <=128K $7.88; 128K-256K $11.82 |
| Web Search | per call | $0.020 |
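
To make the tiered rates concrete, here is a minimal sketch of a per-request cost estimate. The rates come from the table above; everything else is an assumption for illustration. In particular, the table does not say how the tier is chosen, so the sketch assumes it is selected by prompt length and then applied to every token in the request, and that "128K" means 128,000 tokens. The function name `estimate_cost_usd` is hypothetical and not part of the API.

```python
from decimal import Decimal

# USD per 1M tokens, copied from the pricing table.
RATES = {
    "base": {"input": Decimal("1.31"), "output": Decimal("7.88")},
    "long": {"input": Decimal("1.97"), "output": Decimal("11.82")},
}
WEB_SEARCH_PER_CALL = Decimal("0.020")
TIER_LIMIT = 128_000  # assumption: "128K" is read as 128,000 tokens


def estimate_cost_usd(prompt_tokens: int, completion_tokens: int,
                      web_search_calls: int = 0) -> Decimal:
    """Estimate the charge for one request.

    Assumption: the tier is picked by prompt length and applies to all
    input and output tokens of the request (not confirmed by the docs).
    """
    tier = "base" if prompt_tokens <= TIER_LIMIT else "long"
    token_cost = (
        prompt_tokens * RATES[tier]["input"]
        + completion_tokens * RATES[tier]["output"]
    ) / Decimal(1_000_000)
    return token_cost + web_search_calls * WEB_SEARCH_PER_CALL
```

Under these assumptions, a 100,000-token prompt with a 10,000-token reply is billed at the base tier, while a 200,000-token prompt moves the whole request to the higher tier. Treat the `cost_usd` field returned by the API as authoritative; this estimate is only a planning aid.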

Example request

curl https://api.empiriolabs.ai/v1/chat/completions \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{"model": "qwen3-6-max-preview", "messages": [{"role":"user","content":"Hello"}]}'
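
The same request can be sketched in Python using only the standard library. The URL, headers, and body mirror the curl example; the helper names `build_chat_request` and `send` are hypothetical, and the shape of the response is not described on this page, so `send` simply returns the decoded JSON.

```python
import json
import urllib.request

API_URL = "https://api.empiriolabs.ai/v1/chat/completions"


def build_chat_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST request from the curl example."""
    body = {
        "model": "qwen3-6-max-preview",
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and return the parsed JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A typical call would be `send(build_chat_request("Hello", os.environ["EMPIRIOLABS_API_KEY"]))` after `import os`. Keeping request construction separate from sending makes the payload easy to inspect without touching the network.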

Parameters

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| temperature | number | no | 0.7 | Sampling temperature · Range: 0 – 2 |
| top_p | number | no | 1.0 | Nucleus sampling · Range: 0 – 1 |
| max_tokens | number | no | 4096 | Max output tokens · Range: 1 – 65536 |
| frequency_penalty | number | no | 0 | Penalty for repeated tokens. >0 reduces repetition, <0 encourages it. · Range: -2 – 2 |
| presence_penalty | number | no | 0 | Penalty for new vs. seen tokens. >0 encourages new topics, <0 encourages staying on topic. · Range: -2 – 2 |
| reasoning_effort | enum | no | "medium" | Reasoning effort level. none disables thinking. low, medium, high, and max set bounded thinking budgets sized to the selected model. Sent as an OpenAI-style reasoning_effort field, translated into enable_thinking and thinking_budget for the model service. · Allowed: none, low, medium, high, max |
| stop | string | no | - | Comma-separated stop sequences |
| enable_thinking | boolean | no | true | Reason step-by-step before answering |
| thinking_budget | number | no | 32768 | Tokens reserved for thinking · Range: 1 – 393216 |
| tool_web_search | boolean | no | false | Search the web for real-time information. |
| disable_formatting | boolean | no | false | Skip the EmpirioLabs Markdown formatting (citation [N] rewriting + References block when web search / tools were used). The raw upstream answer with plain [N] citations is returned. |
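
The ranges and allowed values above can be checked on the client before a request is sent. The following is a minimal sketch, not an official client: `build_payload` is a hypothetical helper, and it assumes the parameters are passed as top-level fields of the JSON body next to `model` and `messages`.

```python
# Numeric ranges and allowed values, copied from the parameter table.
RANGES = {
    "temperature": (0, 2),
    "top_p": (0, 1),
    "max_tokens": (1, 65536),
    "frequency_penalty": (-2, 2),
    "presence_penalty": (-2, 2),
    "thinking_budget": (1, 393216),
}
REASONING_EFFORTS = {"none", "low", "medium", "high", "max"}
BOOLEAN_FLAGS = {"enable_thinking", "tool_web_search", "disable_formatting"}


def build_payload(messages: list, **params) -> dict:
    """Return a request body, rejecting values outside the documented ranges.

    Assumption: parameters are top-level fields of the JSON body.
    """
    payload = {"model": "qwen3-6-max-preview", "messages": messages}
    for name, value in params.items():
        if name in RANGES:
            # bool is a subclass of int in Python, so exclude it explicitly.
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                raise TypeError(f"{name} must be a number")
            low, high = RANGES[name]
            if not low <= value <= high:
                raise ValueError(f"{name} must be between {low} and {high}")
        elif name == "reasoning_effort":
            if value not in REASONING_EFFORTS:
                raise ValueError(f"reasoning_effort must be one of {sorted(REASONING_EFFORTS)}")
        elif name in BOOLEAN_FLAGS:
            if not isinstance(value, bool):
                raise TypeError(f"{name} must be a boolean")
        elif name == "stop":
            if not isinstance(value, str):
                raise TypeError("stop must be a comma-separated string")
        else:
            raise KeyError(f"unknown parameter: {name}")
        payload[name] = value
    return payload
```

For example, `build_payload(messages, temperature=0.2, reasoning_effort="low", tool_web_search=True)` returns a body with those three fields added, while `temperature=3` raises `ValueError` before any request is made.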

Notes

Above 128K tokens, input and output rates are roughly 1.5x the base rates (see Pricing). This preview is text-only; multimodal input is not yet enabled.

Per-tool billing (usage.tool_usage)

When this model invokes tools (web search, code interpreter, etc.) inside a single request, the response carries a normalized usage.tool_usage map alongside the token counts. The example below shows the shape — exact field names, units, and which tools appear can vary slightly per provider:

"usage": {
  "prompt_tokens": 123,
  "completion_tokens": 456,
  "cost_usd": 0.0042,
  "tool_usage": {"web_search": 3, "code_interpreter": 1}
}

The tool counts are already factored into cost_usd — they are surfaced for transparency so you can audit per-tool billing. The field is omitted when no tools were invoked.
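
Because `tool_usage` is omitted when no tools ran, client code should treat it as optional. The sketch below shows one way to read the usage block defensively. `summarize_usage` is a hypothetical helper, and it assumes the `tool_usage` values are integer call counts as in the example above, which may not hold for every provider.

```python
def summarize_usage(response: dict) -> dict:
    """Extract token counts and per-tool call counts from a response body.

    Assumption: tool_usage maps tool names to integer call counts.
    """
    usage = response.get("usage", {})
    # tool_usage is absent when no tools were invoked.
    tool_usage = usage.get("tool_usage") or {}
    return {
        "prompt_tokens": usage.get("prompt_tokens", 0),
        "completion_tokens": usage.get("completion_tokens", 0),
        "cost_usd": usage.get("cost_usd"),
        "tool_calls": dict(tool_usage),
        "total_tool_calls": sum(tool_usage.values()),
    }
```

Since the tool charges are already included in `cost_usd`, the counts are for auditing only and should not be added to the reported cost again.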


Machine-readable schema: GET https://api.empiriolabs.ai/v1/models/qwen3-6-max-preview.