Seed 2.0 Mini

ByteDance · Text Generation
POST /v1/chat/completions

Latency-focused multimodal model with 256K context, four reasoning effort modes, and image/video understanding for high-concurrency use.

At a glance

| Field | Value |
| --- | --- |
| Model id | seed-2-0-mini |
| Input modalities | Text, Image, Video, Document |
| Output modalities | Text |
| Context window | 256K |
| Weight precision | - |
| Max output tokens | 128,000 |
| Region | Malaysia |
| Features | vision, reasoning |
| Native inference | No |
| New | No |
| Supported endpoints | POST /v1/chat/completions, POST /v1/responses, POST /v1/messages |

Pricing

| Charge | Spec | Rate |
| --- | --- | --- |
| Input | per 1M prompt tokens | <=128K: $0.12; 128K-256K: $0.24 |
| Output | per 1M generated tokens | <=128K: $0.50; 128K-256K: $1.00 |
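
The tiered rates above can be turned into a rough per-request estimate. The sketch below is illustrative only: it assumes the tier is selected by the prompt token count and then applies to both input and output tokens, and it assumes "128K" means 128,000 tokens, which this page does not specify. The name `estimate_cost_usd` is hypothetical and not part of the API.

```python
# Rates in USD per 1M tokens, copied from the pricing table.
BASE_RATES = {"input": 0.12, "output": 0.50}
LONG_RATES = {"input": 0.24, "output": 1.00}

# Assumption: "128K" means 128,000 tokens (the page does not say whether
# the boundary is 128,000 or 131,072).
TIER_BOUNDARY = 128_000


def estimate_cost_usd(prompt_tokens: int, completion_tokens: int,
                      tier_boundary: int = TIER_BOUNDARY) -> float:
    """Estimate the token cost of one request (tool charges not included)."""
    if prompt_tokens < 0 or completion_tokens < 0:
        raise ValueError("token counts must be non-negative")
    # Assumption: the tier is chosen by prompt size and applies to both rates.
    rates = BASE_RATES if prompt_tokens <= tier_boundary else LONG_RATES
    return (prompt_tokens * rates["input"]
            + completion_tokens * rates["output"]) / 1_000_000


if __name__ == "__main__":
    print(f"short prompt: ${estimate_cost_usd(100_000, 10_000):.4f}")
    print(f"long prompt:  ${estimate_cost_usd(200_000, 10_000):.4f}")
```

Passing a different `tier_boundary` lets you adjust the estimate if the provider counts 128K differently. The `cost_usd` value returned in the response remains the authoritative figure.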

Example request

curl https://api.empiriolabs.ai/v1/chat/completions \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{"model": "seed-2-0-mini", "messages": [{"role":"user","content":"Hello"}]}'
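
For callers working in Python rather than curl, the same request can be sketched with the standard library alone. This is not an official client: the helper names `build_chat_request` and `send_chat_request` are made up for illustration, and the code assumes the response body is JSON, which this page does not show in full.

```python
import json
import urllib.request

API_URL = "https://api.empiriolabs.ai/v1/chat/completions"


def build_chat_request(prompt: str, api_key: str,
                       model: str = "seed-2-0-mini") -> urllib.request.Request:
    """Build (but do not send) the POST request from the curl example."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_chat_request(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and decode the body (assumed to be JSON)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call such as `send_chat_request(build_chat_request("Hello", os.environ["EMPIRIOLABS_API_KEY"]))` would perform the request, given `import os`, network access, and a valid key. Keeping construction and sending separate makes the payload easy to inspect without touching the network.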

Parameters

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| temperature | number | no | 0.7 | Sampling temperature. Range: 0 – 2 |
| top_p | number | no | 1.0 | Nucleus sampling. Range: 0 – 1 |
| max_tokens | number | no | 4096 | Max output tokens. Range: 1 – 65536 |
| frequency_penalty | number | no | 0 | Penalty for repeated tokens. >0 reduces repetition, <0 encourages it. Range: -2 – 2 |
| presence_penalty | number | no | 0 | Penalty for new vs. seen tokens. >0 encourages new topics, <0 encourages staying on topic. Range: -2 – 2 |
| stop | string | no | - | Comma-separated stop sequences |
| enable_thinking | boolean | no | true | Enable deep thinking / reasoning mode. |
| reasoning_effort | enum | no | "medium" | Reasoning effort tier. Use enable_thinking=false to disable reasoning entirely. Allowed: low, medium, high |
| enable_web_search | boolean | no | false | Enable BytePlus Ark MCP web search. |
| enable_caching | boolean | no | false | Cache the prompt prefix for ~10 min so follow-up requests reuse it and pay fewer input tokens for the cached portion. |
| image_detail | enum | no | "high" | Image visual quality tier for vision input. Allowed: low, high, xhigh |
| video_fps | number | no | - | Frames per second extracted from video input. Range: 0.2 – 5 |
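
Since several parameters carry documented ranges or allowed values, a client may want to check them before sending a request. The sketch below encodes only what the table states. The helper `validate_params` is hypothetical, and whether the server rejects, clamps, or ignores out-of-range values is not stated on this page.

```python
# Inclusive numeric ranges and enum values copied from the parameter table.
RANGES = {
    "temperature": (0, 2),
    "top_p": (0, 1),
    "max_tokens": (1, 65536),
    "frequency_penalty": (-2, 2),
    "presence_penalty": (-2, 2),
    "video_fps": (0.2, 5),
}
ALLOWED = {
    "reasoning_effort": {"low", "medium", "high"},
    "image_detail": {"low", "high", "xhigh"},
}
BOOLEANS = {"enable_thinking", "enable_web_search", "enable_caching"}


def validate_params(params: dict) -> list[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    for name, value in params.items():
        if name in RANGES:
            low, high = RANGES[name]
            # bool is a subclass of int in Python, so exclude it explicitly.
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                problems.append(f"{name}: expected a number")
            elif not low <= value <= high:
                problems.append(f"{name}: {value} is outside {low} to {high}")
        elif name in ALLOWED:
            if not isinstance(value, str) or value not in ALLOWED[name]:
                choices = ", ".join(sorted(ALLOWED[name]))
                problems.append(f"{name}: {value!r} is not one of {choices}")
        elif name in BOOLEANS:
            if not isinstance(value, bool):
                problems.append(f"{name}: expected true or false")
        elif name == "stop":
            if not isinstance(value, str):
                problems.append("stop: expected a comma-separated string")
        else:
            problems.append(f"{name}: not listed in the parameter table")
    return problems
```

Returning a list instead of raising on the first error lets a caller report every problem at once, for example `validate_params({"max_tokens": 70000, "image_detail": "ultra"})` yields two entries.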

Notes

Input and output rates double once the input exceeds 128K tokens (see the Pricing table). Temperature and top_p are fixed server-side at temperature=1 and top_p=0.95, whatever value the client sends.

Per-tool billing (usage.tool_usage)

When this model invokes built-in tools (web search, code interpreter, etc.) inside a single request, the response carries a normalized usage.tool_usage map alongside the token counts. The example below shows the shape — exact field names, units, and which tools appear can vary slightly per provider:

"usage": {
  "prompt_tokens": 123,
  "completion_tokens": 456,
  "cost_usd": 0.0042,
  "tool_usage": {"web_search": 3, "code_interpreter": 1}
}

The tool counts are already factored into cost_usd — they are surfaced for transparency so you can audit per-tool billing. The field is omitted when no tools were invoked.
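
When auditing usage client-side, the main thing to handle is the optional `tool_usage` field. The sketch below assumes the `usage` object sits at the top level of the decoded response and that tool values are plain counts, as in the example above. Field names may differ per provider, and `summarize_usage` is a hypothetical helper.

```python
def summarize_usage(response: dict) -> dict:
    """Extract token counts, cost, and per-tool counts from a decoded response."""
    usage = response.get("usage") or {}
    # tool_usage is omitted when no tools were invoked, so default to empty.
    tool_usage = usage.get("tool_usage") or {}
    return {
        "prompt_tokens": usage.get("prompt_tokens", 0),
        "completion_tokens": usage.get("completion_tokens", 0),
        "cost_usd": usage.get("cost_usd"),
        "tool_calls": dict(tool_usage),
        # Assumption: each value is an invocation count, as in the example.
        "total_tool_calls": sum(tool_usage.values()),
    }


if __name__ == "__main__":
    example = {
        "usage": {
            "prompt_tokens": 123,
            "completion_tokens": 456,
            "cost_usd": 0.0042,
            "tool_usage": {"web_search": 3, "code_interpreter": 1},
        }
    }
    print(summarize_usage(example))
```

Because `cost_usd` already includes tool charges, the summary reports tool counts for auditing only and does not add anything to the cost.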


Machine-readable schema: GET https://api.empiriolabs.ai/v1/models/seed-2-0-mini.