Fugu Ultra

Sakana AI · Text Generation
POST /v1/chat/completionsMulti-agent conductor that orchestrates frontier expert models for hard reasoning, coding, and research, with 1M context, image input, and web search.
At a glance
Pricing
Example request
Parameters
Notes
Fugu Ultra is a multi-agent conductor: each request coordinates a pool of expert models and composes their work into a single answer.
Latency and streaming
- Responses can take from a few seconds to a few minutes on complex prompts.
- The full answer is returned all at once when the model finishes, not token by token. Streaming is accepted, but it delivers the complete response at the end rather than streaming tokens as they generate.
- Leave generous max_tokens headroom, since very small limits can truncate or empty the answer.
Capabilities
- Text and image input, with a 1M token context.
- Always-on reasoning. high is the default; xhigh and max are the same maximum effort.
- Function calling, JSON mode, and built-in web search that cites its sources when available (no separate fee).
Billing
- Billed on full token usage, including the orchestration tokens the model uses internally, so even short prompts carry some cost.
- Context-tiered: requests above 272K total input tokens use the higher rate shown.
Machine-readable schema: GET https://api.empiriolabs.ai/v1/models/fugu-ultra.
