Kimi K2.6 | EmpirioLabs AI Docs

Moonshot AI · Text Generation

POST /v1/chat/completions

Kimi K2.6 is a multimodal reasoning model from Moonshot AI with a 256K-token context window and strong coding ability. It accepts text, image, and video inputs and produces text output.

At a glance

Field	Value
Model id	`kimi-k2-6`
Model release date	2026-04-20
Input modalities	Text, Image, Video
Output modalities	Text
Context window	256K
Weight precision	-
Max output tokens	16,000
Region	China
Features	reasoning, function_calling, cache, multimodal
Native inference	No
New	Yes
Structured output	JSON Schema
Supported endpoints	`POST /v1/chat/completions`, `POST /v1/responses`, `POST /v1/messages`, `POST /v1beta/models/kimi-k2-6:generateContent`

Pricing

Charge	Spec	Rate
Input	per 1M prompt tokens	$0.8939 (was $0.95)
Output	per 1M generated tokens	$3.7131 (was $4.00)
Implicit cache read	per 1M cached input tokens	$0.1788
Web search	per request when enabled	$0.013
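
The rates above can be combined into a per-request estimate. The following is a minimal sketch, not an official billing formula: `estimate_cost` and its argument names are hypothetical, and it assumes that cached tokens are a subset of the prompt tokens and are billed at the cache-read rate instead of the input rate.

```python
from decimal import Decimal

# Rates copied from the pricing table above (USD).
INPUT_PER_M = Decimal("0.8939")
OUTPUT_PER_M = Decimal("3.7131")
CACHE_READ_PER_M = Decimal("0.1788")
WEB_SEARCH_PER_REQUEST = Decimal("0.013")
MILLION = Decimal(1_000_000)


def estimate_cost(prompt_tokens, output_tokens, cached_tokens=0, web_search=False):
    """Estimate the USD cost of one request.

    Assumption: cached_tokens is a subset of prompt_tokens and is billed
    at the cache-read rate instead of the input rate.
    """
    if not 0 <= cached_tokens <= prompt_tokens:
        raise ValueError("cached_tokens must be between 0 and prompt_tokens")
    uncached_tokens = prompt_tokens - cached_tokens
    cost = (
        uncached_tokens * INPUT_PER_M
        + cached_tokens * CACHE_READ_PER_M
        + output_tokens * OUTPUT_PER_M
    ) / MILLION
    if web_search:
        # Flat surcharge per request when web search is enabled.
        cost += WEB_SEARCH_PER_REQUEST
    return cost
```

For example, `estimate_cost(10_000, 500, cached_tokens=2_000)` prices 8,000 uncached prompt tokens, 2,000 cached tokens, and 500 output tokens. `Decimal` is used so that the four-decimal rates are not distorted by binary floating point.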

Example request

$ curl https://api.empiriolabs.ai/v1/chat/completions \
>   -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
>   -H 'Content-Type: application/json' \
>   -d '{"model": "kimi-k2-6", "messages": [{"role":"user","content":"Hello"}]}'
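
The same request can be issued from Python with only the standard library. This is a sketch under stated assumptions: the helper names `build_chat_request` and `send_chat_request` are hypothetical, and because the response schema is not described on this page, the body is returned as decoded JSON without further interpretation.

```python
import json
import urllib.request

API_URL = "https://api.empiriolabs.ai/v1/chat/completions"


def build_chat_request(prompt, api_key):
    """Build (but do not send) the POST request from the curl example."""
    body = {
        "model": "kimi-k2-6",
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_chat_request(request, timeout=60):
    """Send the request and decode the body, assuming it is JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A call would look like `send_chat_request(build_chat_request("Hello", os.environ["EMPIRIOLABS_API_KEY"]))`, with `os` imported by the caller. Building the request separately from sending it makes the headers and body easy to inspect before any network traffic occurs.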

Parameters

Parameter	Type	Required	Default	Description
`temperature`	number	no	`0.7`	Sampling temperature. 0 = deterministic, 2 = maximum randomness. · Range: 0 – 2
`top_p`	number	no	`0.9`	Nucleus sampling probability mass. Lower values make outputs more focused. · Range: 0 – 1
`max_tokens`	number	no	`4096`	Maximum output tokens. · Range: 1 – 16000
`stop`	string	no	-	Up to 4 strings; generation stops as soon as the model produces any of them.
`enable_thinking`	boolean	no	`true`	Enable reasoning before answering.
`thinking_budget`	number	no	`32768`	Maximum tokens reserved for reasoning when thinking is enabled. · Range: 1 – 81920
`reasoning_effort`	enum	no	`"medium"`	Reasoning effort level. `none` disables thinking; `low`, `medium`, `high`, and `max` set bounded thinking budgets sized to the selected model. It is sent as an OpenAI-style `reasoning_effort` field and translated into `enable_thinking` and `thinking_budget` for the model service. · Allowed: `none`, `low`, `medium`, `high`, `max`
`response_format`	enum	no	-	Constrain the output to JSON. Use JSON mode for any valid JSON object, or JSON schema to force output that matches a schema you provide.
`web_search_linkup`	boolean	no	`false`	Optional web search powered by Linkup. When enabled, recent web sources are retrieved using your latest user message as the query and passed to the model as additional context. Each invocation adds $0.013 on top of the model’s normal token cost.
`disable_formatting`	boolean	no	`false`	When enabled, the gateway does not append the “Sources” footer to assistant responses that used Linkup web search. Useful when the output is piped to another system that expects undecorated text.
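
The ranges and allowed values in the table can be checked on the client before a request is sent. The sketch below is illustrative only: `build_payload` is a hypothetical helper, it validates just the numeric ranges and the `reasoning_effort` values listed above, and it assumes that parameters are passed as top-level fields of the request body alongside `model` and `messages`.

```python
ALLOWED_EFFORT = {"none", "low", "medium", "high", "max"}

# (minimum, maximum) taken from the Range notes in the parameters table.
RANGES = {
    "temperature": (0, 2),
    "top_p": (0, 1),
    "max_tokens": (1, 16000),
    "thinking_budget": (1, 81920),
}


def build_payload(messages, **params):
    """Return a request body, raising ValueError for out-of-range values."""
    for name, value in params.items():
        if name in RANGES:
            low, high = RANGES[name]
            if not low <= value <= high:
                raise ValueError(f"{name} must be between {low} and {high}")
    effort = params.get("reasoning_effort")
    if effort is not None and effort not in ALLOWED_EFFORT:
        raise ValueError(f"reasoning_effort must be one of {sorted(ALLOWED_EFFORT)}")
    # Assumption: parameters are sent as top-level fields of the JSON body.
    return {"model": "kimi-k2-6", "messages": messages, **params}
```

For instance, `build_payload([{"role": "user", "content": "Hello"}], temperature=0.2, reasoning_effort="low", web_search_linkup=True)` returns a body with those three fields set, while `temperature=3` raises `ValueError` locally instead of producing a failed request.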

Machine-readable schema: GET https://api.empiriolabs.ai/v1/models/kimi-k2-6.
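
A minimal sketch of retrieving that schema from Python follows. The helper names are hypothetical, and two things are assumed rather than documented here: that the response body is JSON, and that the endpoint accepts the same Bearer token as the chat endpoint when one is supplied.

```python
import json
import urllib.request

SCHEMA_URL = "https://api.empiriolabs.ai/v1/models/kimi-k2-6"


def build_schema_request(api_key=None):
    """Build (but do not send) the GET request for the model schema."""
    headers = {"Accept": "application/json"}
    if api_key is not None:
        # Assumption: the same Bearer token as the chat endpoint is accepted.
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(SCHEMA_URL, headers=headers, method="GET")


def fetch_schema(api_key=None, timeout=30):
    """Send the request and decode the body, assuming it is JSON."""
    request = build_schema_request(api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```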