> For clean Markdown of any page, append .md to the page URL.
> For a complete documentation index, see https://docs.empiriolabs.ai/llms.txt.
> For AI client integration (Claude Code, Cursor, etc.), connect to the MCP server at https://docs.empiriolabs.ai/_mcp/server.

# AI Agent API Reference Context

# EmpirioLabs AI — API Reference bundle

Full OpenAPI 3.1 spec plus the API Reference overview page.
Auto-generated on every docs build.

***

## API Reference Overview

EmpirioLabs speaks **OpenAI- and Anthropic-compatible** request shapes. Drop in any SDK, point it at `https://api.empiriolabs.ai`, and authenticate with your EmpirioLabs API key. Every endpoint below works against any OpenAI or Anthropic client unchanged.

## Authentication

Every request requires a bearer token. Either header is accepted on every endpoint:

```
Authorization: Bearer $EMPIRIOLABS_API_KEY
x-api-key: $EMPIRIOLABS_API_KEY
```

```python title="Python (OpenAI SDK)"
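import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.empiriolabs.ai",
    # Read the key from the environment; a literal "$EMPIRIOLABS_API_KEY"
    # string is not expanded by Python.
    api_key=os.environ["EMPIRIOLABS_API_KEY"],
)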

resp = client.chat.completions.create(
    model="deepseek-v4-pro",
    messages=[{"role": "user", "content": "Hello!"}],
)
```

```typescript title="TypeScript (OpenAI SDK)"
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.empiriolabs.ai",
  apiKey: process.env.EMPIRIOLABS_API_KEY,
});

const resp = await client.chat.completions.create({
  model: "deepseek-v4-pro",
  messages: [{ role: "user", content: "Hello!" }],
});
```

```typescript title="TypeScript (Anthropic SDK)"
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({
  baseURL: "https://api.empiriolabs.ai",
  apiKey: process.env.EMPIRIOLABS_API_KEY,
});

const msg = await client.messages.create({
  model: "deepseek-v4-pro",
  max_tokens: 1024,
  messages: [{ role: "user", content: "Hello!" }],
});
```

```bash title="cURL"
curl https://api.empiriolabs.ai/v1/chat/completions \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-v4-pro",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```

## Endpoint surface

* **Chat completions.** OpenAI-compatible chat. Streaming, tool calling, vision, audio input, JSON mode, structured output, reasoning controls.
* **Legacy completions.** OpenAI-compatible prompt completions for models that advertise `POST /v1/completions`.
* **Anthropic Messages.** Drop-in for Anthropic SDK clients. tool\_use / tool\_result blocks round-trip cleanly.
* **Image generation.** Generate, edit, inpaint, and create image variations. Hosted CDN URLs, signed for 7 days.
* **Video generation.** Async video generation. Returns a job\_id; poll the jobs endpoint for the URL.
* **Audio generation.** TTS plus real-time streaming TTS (Inworld), music / podcast / SFX generation, voice clone management.
* **Agents.** Long-running tool-using agent tasks. Start, poll, stream messages, stop early.
* **Transcription.** Whisper / Deepgram / Parakeet. Multipart upload or file\_url.
* **Search and research.** Exa, Tavily, Linkup, Perplexity Search. Domain filters, date ranges, geo bias.
* **3D Generation.** Async image-to-3D asset generation. Returns a job\_id; poll for the signed GLB URL.
* **Detection.** `POST /v1/detect`: GPTZero AI-detection, bibliography scan, source analysis.
* **Embeddings.** OpenAI-compatible embeddings. Multilingual text + multimodal embedders.
* **Reranks.** Semantic document reranking. Sort retrieval candidates by relevance for RAG and search refinement.
* **File URLs.** Pass any public URL on input fields. No upload, no re-sign; generated outputs are valid for 7 days.
* **Async jobs.** Poll the status / result of any async generation. State retained 1 hour after completion.
* **Models.** Live catalog with pricing, parameter schema, capability flags, regions.
* **Errors.** OpenAI- and Anthropic-compatible error envelopes.

## Chat completions

`POST /v1/chat/completions`

Pass any chat-capable model from the catalog as `model`. Streaming uses Server-Sent Events with `data: ...` lines and a final `data: [DONE]`.

```bash title="Streaming completion"
curl https://api.empiriolabs.ai/v1/chat/completions \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-v4-pro",
    "messages": [{"role": "user", "content": "Summarize CRDT consistency in 3 bullets."}],
    "stream": true,
    "temperature": 0.7
  }'
```
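A minimal sketch of consuming that stream on the client side. It assumes each `data:` line carries an OpenAI-style chunk with `choices[0].delta.content`; that chunk shape is inferred from the OpenAI compatibility claim, not stated on this page. `iter_stream_text` is a hypothetical helper name, and it accepts any iterable of decoded lines so it can sit on top of whichever HTTP client you use.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text deltas from SSE lines until the `data: [DONE]` sentinel."""
    for raw in lines:
        line = raw.strip()
        # Skip blank separator lines and any non-data SSE fields.
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        # Assumed OpenAI-style chunk; `delta.content` may be absent.
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text
```

Joining the yielded pieces reconstructs the full assistant message.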

```bash title="Tool calling"
curl https://api.empiriolabs.ai/v1/chat/completions \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistral-small-4",
    "messages": [{"role": "user", "content": "What is the weather in Paris?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_weather",
        "parameters": {
          "type": "object",
          "properties": {"city": {"type": "string"}},
          "required": ["city"]
        }
      }
    }],
    "tool_choice": "auto"
  }'
```

```bash title="Vision input"
curl https://api.empiriolabs.ai/v1/chat/completions \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-5-omni-plus",
    "messages": [{
      "role": "user",
      "content": [
        {"type": "text", "text": "What is in this image?"},
        {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}}
      ]
    }]
  }'
```

```bash title="JSON mode"
curl https://api.empiriolabs.ai/v1/chat/completions \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-v4-pro",
    "messages": [{"role": "user", "content": "Extract person from: John Doe, 32, NYC"}],
    "response_format": {"type": "json_object"}
  }'
```

Every model's accepted parameters live on its docs page (e.g. `temperature`, `top_p`, `enable_thinking`, `reasoning_effort`, `web_search_tier`). Browse them under [Providers and Models](/providers).

## Model parameters across endpoints

Model-specific parameters advertised on the model page and in `GET /v1/models/{id}` can be sent to `/v1/chat/completions`, `/v1/responses`, and `/v1/messages` when that model supports the endpoint. The gateway adapts request shapes so the same controls reach the underlying model.

For thinking-capable models, `enable_thinking` and `thinking_budget` are accepted on all three text endpoints. On `/v1/messages`, you can also use Anthropic-style thinking:

```json
{
  "thinking": {
    "type": "enabled",
    "budget_tokens": 1024
  }
}
```

That maps to the same `enable_thinking=true` and `thinking_budget=1024` controls used by Chat Completions and Responses.
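The equivalence can be illustrated with a small client-side sketch. The gateway performs this mapping itself; `thinking_to_flat` is a hypothetical helper that only shows which fields correspond, and treating any `type` other than `"enabled"` as disabled is an assumption.

```python
def thinking_to_flat(body: dict) -> dict:
    """Return a copy of `body` with an Anthropic-style `thinking` block
    rewritten as `enable_thinking` / `thinking_budget`."""
    out = {k: v for k, v in body.items() if k != "thinking"}
    thinking = body.get("thinking")
    if isinstance(thinking, dict):
        # Assumption: only type == "enabled" turns thinking on.
        out["enable_thinking"] = thinking.get("type") == "enabled"
        if "budget_tokens" in thinking:
            out["thinking_budget"] = thinking["budget_tokens"]
    return out
```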

## Legacy completions

`POST /v1/completions`

Use this endpoint for OpenAI-compatible clients that still send a raw `prompt` instead of chat `messages`. Only models that list `POST /v1/completions` in `supported_endpoints` accept this shape.

Streaming uses Server-Sent Events and includes usage when the model service reports it.

```bash
curl https://api.empiriolabs.ai/v1/completions \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-5-9b",
    "prompt": "Write one concise launch sentence.",
    "max_tokens": 64,
    "stream": true
  }'
```

## Anthropic Messages

`POST /v1/messages`

Drop-in for any Anthropic SDK client — the same models accessible on `/v1/chat/completions` and `/v1/responses` are reachable here under the Anthropic Messages shape.

```bash
curl https://api.empiriolabs.ai/v1/messages \
  -H "x-api-key: $EMPIRIOLABS_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-v4-pro",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Hi!"}]
  }'
```

`tool_use` and `tool_result` blocks round-trip cleanly. Mixed text-plus-tool\_use content arrays are preserved.

## Image generation

`POST /v1/images/generations`

```bash
curl https://api.empiriolabs.ai/v1/images/generations \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "wan-2-7-image",
    "prompt": "A glass cathedral at sunset, dramatic lighting",
    "aspect_ratio": "16:9",
    "resolution": "4K",
    "thinking_mode": true,
    "num_images": 4
  }'
```

Image-edit flows accept `image: ["https://..."]` with up to the model's documented limit (3 for `qwen-image-2-0`, 9 for `wan-2-7-image`, 14 for `seedream-5-0-lite`). Image-set modes generate cohesive series — see each model's page for the toggle.

Returned URLs live on `https://media.empiriolabs.ai` and expire after **7 days**. Save anything you want to keep before the URL expires.

`POST /v1/images/analysis` runs vision-only analysis (no generation) on one or more input images. Use for layout extraction, object detection, OCR, and similar inspection tasks where the model returns text or JSON describing the image rather than a new picture.

## Video generation

`POST /v1/videos/generations`

Always async — the endpoint returns a `job_id` and a polling URL.

```bash title="Submit a job"
curl https://api.empiriolabs.ai/v1/videos/generations \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "seedance-2-0-pro",
    "prompt": "A cinematic dolly shot of a city street at dusk in the rain",
    "resolution": "1080p",
    "aspect_ratio": "16:9",
    "duration": 8,
    "generate_audio": true
  }'
```

```json title="Initial response"
{
  "job_id": "job_01HV3K...",
  "status": "processing",
  "poll_url": "/v1/jobs/job_01HV3K..."
}
```

```bash title="Poll for result"
curl https://api.empiriolabs.ai/v1/jobs/job_01HV3K... \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY"
```

```json title="Successful result"
{
  "job_id": "job_01HV3K...",
  "status": "completed",
  "progress": 1.0,
  "result": {
    "data": [{ "url": "https://media.empiriolabs.ai/worker-outputs/..." }]
  }
}
```
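The submit-then-poll flow above can be wrapped in a small loop. This is a sketch under stated assumptions: `fetch_job` is a hypothetical callable that performs the `GET` on `poll_url` and returns the parsed JSON, and the 5-second interval and 15-minute ceiling are illustrative defaults, not documented values. The status names follow the job state shape (`processing`, `completed`, `failed`).

```python
import time
from typing import Callable


def wait_for_job(fetch_job: Callable[[], dict], interval_s: float = 5.0,
                 timeout_s: float = 900.0, sleep=time.sleep) -> dict:
    """Call `fetch_job` until the job leaves `processing`, then return its result."""
    deadline = time.monotonic() + timeout_s
    while True:
        job = fetch_job()
        status = job.get("status")
        if status == "completed":
            return job["result"]
        if status == "failed":
            raise RuntimeError(f"job {job.get('job_id')} failed: {job.get('error')}")
        if time.monotonic() >= deadline:
            raise TimeoutError("job did not finish before the timeout")
        sleep(interval_s)
```

Passing the fetcher as an argument keeps the loop independent of any particular HTTP library and makes it easy to exercise with canned responses.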

## Audio generation

`POST /v1/audio/speech` is synchronous and returns a hosted URL by default; pass `response_format: "b64_json"` for inline audio bytes.

`POST /v1/audio/speech:stream` provides real-time TTS. It returns Server-Sent Events as the model synthesizes, with sub-130ms time-to-first-byte on Inworld TTS Mini and sub-250ms on Max. Use it for voice agents and interactive playback. It is currently supported on Inworld TTS Mini / Max; other TTS models use the synchronous endpoint.

`POST /v1/audio/generations` handles music, podcast, and sound-effect generation. It covers Stable Audio, GLM TTS, MOSS, and SoulX Podcast, where the prompt-to-audio shape differs from TTS.

`GET /v1/voices` lists and manages voices, including custom voice clones for Inworld TTS. Use the returned `voice_id` on either speech endpoint.

```bash title="Single voice (Gemini TTS)"
curl https://api.empiriolabs.ai/v1/audio/speech \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-2-5-flash-tts",
    "input": "Hello from EmpirioLabs.",
    "voice": "Charon",
    "output_format": "WAV"
  }'
```

```bash title="Streaming TTS (Inworld)"
curl -N https://api.empiriolabs.ai/v1/audio/speech:stream \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "inworld-tts-mini",
    "input": "Streaming audio arrives in real time.",
    "voice": "Ashley",
    "output_format": "mp3"
  }'
```

```bash title="Multi-speaker podcast"
curl https://api.empiriolabs.ai/v1/audio/speech \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "soulx-podcast",
    "input": "[S1] Welcome to the show. [S2] Glad to be here. [S1] Lets dive in.",
    "voice_s1": "arthur",
    "voice_s2": "lj",
    "output_format": "mp3"
  }'
```

```bash title="Music generation"
curl https://api.empiriolabs.ai/v1/audio/generations \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "stable-audio-2-5",
    "prompt": "Lo-fi hip hop, mellow piano, gentle vinyl crackle, 90 BPM",
    "duration": 60,
    "steps": 8,
    "cfg_scale": 1
  }'
```

## Transcription

`POST /v1/audio/transcriptions`

Accepts either a multipart `file` upload or a JSON payload with `file_url`.

```bash title="Multipart upload"
curl https://api.empiriolabs.ai/v1/audio/transcriptions \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -F "model=openai-whisper-1" \
  -F "file=@meeting.mp3" \
  -F "response_format=verbose_json" \
  -F "timestamp_granularities=word,segment"
```

```bash title="From URL"
curl https://api.empiriolabs.ai/v1/audio/transcriptions \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepgram-nova-3",
    "file_url": "https://example.com/recording.wav",
    "diarize": true,
    "smart_format": true
  }'
```

Long files (over 5 minutes) auto-route to the async job system — the response includes a `job_id` instead of inline text. Poll the jobs endpoint to retrieve the final transcript.

## Search and research

`POST /v1/search` is the unified search surface for retrieval-style models. The parameters each model accepts are listed on its model page (e.g. `exa-search` exposes 28 params including `category`, `livecrawl`, `subpages`, `summary_query`, `code_tokens`).

`POST /v1/research` serves deep research / multi-step retrieval models (Exa Research, Perplexity Deep Research, Linkup Deep Search). It generates a structured research report with cited sources.

`POST /v1/answer` serves direct question-answering models (Exa Answer). It returns a concise answer plus citations without the full report shape.

```bash title="Tavily search with crawl"
curl https://api.empiriolabs.ai/v1/search \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tavily-search",
    "query": "latest CRDT research papers 2026",
    "search_depth": "advanced",
    "include_answer": "advanced",
    "max_results": 10,
    "topic": "general"
  }'
```

```bash title="Perplexity Sonar with geo + date filter"
curl https://api.empiriolabs.ai/v1/chat/completions \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "perplexity-sonar-pro",
    "messages": [{"role": "user", "content": "Recent FDA approvals"}],
    "search_context_size": "high",
    "search_recency_filter": "week",
    "country": "US",
    "search_mode": "default"
  }'
```

```bash title="Exa neural search with subpages"
curl https://api.empiriolabs.ai/v1/search \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "exa-search",
    "query": "alternatives to vector databases",
    "search_type": "neural",
    "category": "research paper",
    "subpages": 3,
    "summary": true
  }'
```

## Agents

Long-running, tool-using agent tasks (currently routed to Manus). Submit once, then poll for status and step-by-step messages, or stop early.

`POST /v1/agents/run` does double duty:

* With no `task_id` it starts a fresh task. The response carries the new `task_id`.
* With `task_id` it sends a follow-up message to an existing task. The agent picks it up on its next reasoning step.

`GET /v1/agents/{task_id}` retrieves the task's current status and final result.

`GET /v1/agents/{task_id}/messages` lists every step the agent has emitted so far. Useful for rendering a live reasoning trace alongside the final answer.

`POST /v1/agents/{task_id}/stop` stops a running task. Billing settles for the work the agent already completed.

```bash title="Start a task"
curl https://api.empiriolabs.ai/v1/agents/run \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "manus",
    "input": "Summarize the top 5 humanoid robotics startups by funding raised in 2025-2026"
  }'
```

```bash title="Poll the task"
curl https://api.empiriolabs.ai/v1/agents/task_01HV3K... \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY"
```

```bash title="Stream task messages"
curl https://api.empiriolabs.ai/v1/agents/task_01HV3K.../messages \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY"
```
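The two modes of `POST /v1/agents/run` differ only in whether `task_id` is present. A minimal sketch of building either body follows; `build_agent_run` is a hypothetical helper, and reusing the same `model` and `input` fields for follow-up messages is an assumption based on the start-a-task example above.

```python
from typing import Optional


def build_agent_run(input_text: str, task_id: Optional[str] = None,
                    model: str = "manus") -> dict:
    """Build a request body for POST /v1/agents/run."""
    body = {"model": model, "input": input_text}
    # Omitting task_id starts a fresh task; including it sends a follow-up.
    if task_id is not None:
        body["task_id"] = task_id
    return body
```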

## 3D Generation

`POST /v1/3d/generations`

Image-to-3D generation is async. The endpoint returns a `job_id` and a polling URL; poll the jobs endpoint to retrieve the final signed GLB URL.

```bash title="Submit a 3D job"
curl https://api.empiriolabs.ai/v1/3d/generations \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "trellis-2-4b",
    "image_url": "https://example.com/product-photo.png",
    "resolution": "1024",
    "texture_size": "2048",
    "decimation_target": 500000,
    "seed": 42
  }'
```

```json title="Initial response"
{
  "job_id": "job_01HV3K...",
  "status": "processing",
  "poll_url": "/v1/jobs/job_01HV3K..."
}
```

```json title="Successful result"
{
  "job_id": "job_01HV3K...",
  "status": "completed",
  "progress": 1.0,
  "result": {
    "data": [{
      "url": "https://media.empiriolabs.ai/worker-outputs/...",
      "content_type": "model/gltf-binary"
    }]
  }
}
```

`trellis-2-4b` exposes the full image, resolution, sampler, texture, and mesh export parameter surface on its model page.

## Detection

`POST /v1/detect`

Specialized text-classification endpoint. Currently powers GPTZero (AI-detection, bibliography scan, source analysis). Each model's `scan_type` enum picks the upstream path; see the per-model docs for the full parameter surface.

```bash title="GPTZero AI-detection"
curl https://api.empiriolabs.ai/v1/detect \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gptzero",
    "input": "The quick brown fox jumps over the lazy dog.",
    "scan_type": "ai_detection",
    "multilingual": false
  }'
```

```bash title="GPTZero bibliography scan"
curl https://api.empiriolabs.ai/v1/detect \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gptzero",
    "scan_type": "bibliography",
    "input": "[1] Smith, J. (2024). Title. Journal, 12(3), 45-67."
  }'
```

GPTZero is also reachable via `/v1/chat/completions` and `/v1/responses`. Pass the text in the message body and the gateway adapts the call. The detection summary comes back as the assistant message; pass `disable_formatting: true` to receive the raw upstream JSON instead.

## Embeddings

`POST /v1/embeddings`

OpenAI-compatible embeddings. Multilingual text and multimodal (text + image + video) embedders are available.

```bash title="Text embedding"
curl https://api.empiriolabs.ai/v1/embeddings \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-embedding-v4",
    "input": ["Sentence one.", "Sentence two."],
    "encoding_format": "float"
  }'
```

```bash title="Multimodal embedding (text + image)"
curl https://api.empiriolabs.ai/v1/embeddings \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tongyi-embedding-vision-plus",
    "input": [
      {"type": "text", "text": "a glass cathedral at sunset"},
      {"type": "image", "url": "https://example.com/photo.jpg"}
    ]
  }'
```
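Embeddings are usually compared with cosine similarity. The sketch below assumes the response follows the OpenAI embeddings shape (`data[i].embedding` as a list of floats), which this page implies but does not show; `cosine_similarity` and `pairwise_first` are hypothetical helper names.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def pairwise_first(response: dict) -> float:
    """Similarity between the first two inputs of an embeddings response."""
    # Assumed OpenAI-style shape: {"data": [{"embedding": [...]}, ...]}
    vectors = [item["embedding"] for item in response["data"]]
    return cosine_similarity(vectors[0], vectors[1])
```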

## Reranks

`POST /v1/reranks`

Sort candidate `documents` by semantic relevance to a `query`. Returns each document's original index plus a 0-1 relevance score (higher = more relevant). Use this to tighten the output of a vector store / BM25 / hybrid retriever before passing the top hits to a language model — the standard last step in a RAG pipeline.

```bash title="Rerank a candidate list"
curl https://api.empiriolabs.ai/v1/reranks \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-rerank",
    "query": "What is a rerank model?",
    "documents": [
      "Rerank models sort candidate documents by relevance.",
      "Quantum computing is a cutting-edge field of computer science.",
      "Pre-trained language models advanced rerank models."
    ],
    "top_n": 2,
    "return_documents": true
  }'
```

```json title="Response shape"
{
  "output": {
    "results": [
      {"index": 0, "relevance_score": 0.93, "document": {"text": "Rerank models sort candidate documents by relevance."}},
      {"index": 2, "relevance_score": 0.34, "document": {"text": "Pre-trained language models advanced rerank models."}}
    ]
  },
  "usage": {"total_tokens": 79, "cost_usd": 0.0000079},
  "request_id": "85ba5752-..."
}
```

The optional `instruct` parameter swaps between Q\&A retrieval (default) and pure semantic-similarity sorting — see the [qwen3-rerank model page](/models/qwen3-rerank) for the full parameter table.
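Because each result carries the document's original index, the ranked list can be joined back to your own candidate array even when `return_documents` is off. A minimal sketch, using the response shape shown above; `attach_documents` is a hypothetical helper name.

```python
def attach_documents(response: dict, documents: list) -> list:
    """Pair each reranked result with its source document, best score first."""
    results = response["output"]["results"]
    ranked = [(documents[r["index"]], r["relevance_score"]) for r in results]
    # Sort defensively in case results are not already ordered by score.
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)
```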

## Usage object

Every endpoint that bills usage returns a `usage` field on the response (and on the terminal streaming chunk). Base shape:

* `cost_usd` — exact amount your account was billed for the request. Authoritative.
* `prompt_tokens` / `completion_tokens` / `total_tokens` — for chat-style models.
* Cache fields (`cache_read_input_tokens`, `cache_creation_input_tokens`) — when prompt caching applies.

Models with tiered, per-call, or variant-priced upstreams stamp extra fields on `usage` so you can see which rate was applied:

* **Tier / variant pricing.** Workers stamp a tier discriminator on `usage` when the same dimension has more than one rate. The primary field is `pricing_tier_label` (human-readable, e.g. `"Medium context"` / `"Pro"` / `"2K"`). Older workers may stamp the raw dimension directly instead (`resolution`, `quality`, `mode`, `rate_tier`). The dashboard renders the badge from whichever is present.
* **Per-call pricing.** Workers that bill per tool invocation (search, fetch, code execution, etc.) stamp counts under `tool_calls_details.<tool>.invocation` or `tool_usage.<tool>`. The dashboard expands these into a per-tool breakdown automatically.
* **Per-dimension pricing.** Workers that bill multiple dimensions in one request (e.g. citation tokens + reasoning tokens + search queries on deep-research models) stamp each dimension as its own field (`citation_tokens`, `reasoning_tokens`, `num_search_queries`, etc.).

The same fields drive the tier badge and per-tool breakdown on the dashboard usage logs, and they are also returned by the [`GET /v1/account/usage`](/api-reference/api-reference/account/get-account-usage) history endpoint under each event's `metadata.worker_usage` (plus a structured `tool_breakdown` array for per-call models). So whether you read live response usage, account-usage history, or your dashboard, the tier and billing breakdown match exactly.
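A client that wants to show the applied tier can mirror the fallback described above: prefer `pricing_tier_label`, then fall back to a raw dimension field. This is a sketch; `tier_label` is a hypothetical helper, and the order in which the legacy fields are checked is an assumption.

```python
LEGACY_TIER_FIELDS = ("resolution", "quality", "mode", "rate_tier")


def tier_label(usage: dict):
    """Return the tier discriminator stamped on `usage`, or None if absent."""
    label = usage.get("pricing_tier_label")
    if label is not None:
        return str(label)
    # Older workers may stamp the raw dimension instead (order is assumed).
    for field in LEGACY_TIER_FIELDS:
        if usage.get(field) is not None:
            return str(usage[field])
    return None
```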

## File URLs

EmpirioLabs does not host user uploads. Pass any public URL directly on the input field of the model endpoint:

| Endpoint family                     | Input field                                                  | Accepts                                    |
| ----------------------------------- | ------------------------------------------------------------ | ------------------------------------------ |
| Chat completions (vision)           | `content[].image_url.url`                                    | Any public image URL or `data:` URI        |
| Chat completions (audio)            | `content[].audio_url.url`                                    | Any public audio URL                       |
| Image generation (edit / variation) | `image: ["https://..."]`                                     | Up to N URLs (model-specific limit)        |
| Video generation (i2v / r2v / edit) | `image: "https://..."` / `video: "https://..."`              | Public URLs                                |
| Audio TTS / music                   | n/a (text-only input)                                        | —                                          |
| Audio transcription                 | `file_url: "https://..."` **or** multipart `file=@local.mp3` | Public URL or direct upload of short clips |
| Search                              | n/a (query text)                                             | —                                          |
| Embeddings (multimodal)             | `input[].url` (image/video item)                             | Public URLs                                |
| Reranks                             | n/a (text query + text documents)                            | —                                          |

For audio transcription specifically, the multipart-direct upload on `/v1/audio/transcriptions` is the supported path for private clips that aren't on a URL — those bytes flow straight to the speech-to-text worker without persistent storage.

Generated output URLs are signed and expire **7 days** after creation. There is no re-sign endpoint. Save anything you need to keep — both the URL and the binary — within that window.

## Async jobs

`GET /v1/jobs/<job-id>` — poll the status / final result of any async generation or transcription job.

Job state is retained for **1 hour** after completion.

```json title="Job state shape"
{
  "job_id": "job_01HV...",
  "status": "processing | completed | failed",
  "progress": 0.42,
  "created_at": "2026-05-02T17:11:32Z",
  "completed_at": null,
  "result": null,
  "error": null
}
```

When `status` is `completed`, the `result` field carries the full response in the same shape the synchronous endpoint would have returned.

Inbound HTTP timeout is **15 minutes**. Synchronous chat completions running close to that limit should set `stream=true` so partial output flows back and the connection stays warm.

## Models

`GET /v1/models` — list every available model.

`GET /v1/models/<model-id>` — full schema for one model, including its parameter table.

`GET /v1/models?format=openrouter` returns the OpenRouter model-listing shape for models marked ready for partner ingestion. See [OpenRouter Model Listing](/openrouter-model-listing) for the exact response fields.

Each model returns:

| Field                 | Description                                                                            |
| --------------------- | -------------------------------------------------------------------------------------- |
| `id`                  | Canonical slug (e.g. `wan-2-7-image`)                                                  |
| `description`         | Short marketing description                                                            |
| `category`            | text / image / video / audio / transcription / research / tools / embedding / reranker |
| `input_modalities`    | What the model accepts                                                                 |
| `output_modalities`   | What the model emits                                                                   |
| `context_window`      | Tokens (chat) or null (media)                                                          |
| `region`              | Server region                                                                          |
| `logo`                | CDN URL to the model logo                                                              |
| `pricing_rows`        | Per-token, per-image, per-second, or per-call rates                                    |
| `supported_endpoints` | Which `/v1/...` endpoints accept this model                                            |
| `parameters`          | **Full schema** — name, type, default, min/max, allowed values, descriptions           |
| `features`            | Tags like streaming, vision, tool\_calling, voice\_cloning                             |

```bash
curl https://api.empiriolabs.ai/v1/models | jq '.data[0]'
curl https://api.empiriolabs.ai/v1/models/wan-2-7-image | jq '.parameters'
```
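The `supported_endpoints` field makes it possible to pick models for a given endpoint programmatically. A minimal sketch, assuming the list response wraps models in a `data` array (as the `jq '.data[0]'` call suggests) and that entries look like `POST /v1/completions`; `models_supporting` is a hypothetical helper name.

```python
def models_supporting(catalog: dict, endpoint: str) -> list:
    """Return the ids of catalog models that list `endpoint` as supported."""
    return [
        model["id"]
        for model in catalog.get("data", [])
        if endpoint in model.get("supported_endpoints", [])
    ]
```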

## disable\_formatting flag

Many chat, search, research, and rerank endpoints accept a `disable_formatting=true` flag. When set on a supporting model, the worker skips EmpirioLabs server-side formatting (citation rewriting, References block, thinking-block Markdown, etc.) and returns the upstream payload shape verbatim.

Coverage is advertised per-model. Check `supports_passthrough` in `GET /v1/models/{id}` to confirm a specific model honors the flag. Models that advertise `supports_passthrough: true` also accept the aliases `raw=true`, `passthrough=true`, and `raw_response=true`. Models without that field accept only the canonical `disable_formatting=true` form (or do not honor passthrough at all). The model card lists which aliases each model accepts.

Image, video, audio-generation, transcription, and embedding endpoints do not accept this flag, since there is no formatting layer to disable on those endpoints.
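One way to apply these rules on the client is to fall back to the canonical flag whenever the model record does not advertise `supports_passthrough`. The sketch below is illustrative only; `with_passthrough` is a hypothetical helper, and `model_info` stands for the JSON returned by `GET /v1/models/{id}`.

```python
PASSTHROUGH_ALIASES = ("raw", "passthrough", "raw_response")


def with_passthrough(payload: dict, model_info: dict,
                     flag: str = "disable_formatting") -> dict:
    """Return a copy of `payload` with a passthrough flag the model accepts."""
    if flag != "disable_formatting" and flag not in PASSTHROUGH_ALIASES:
        raise ValueError(f"unknown passthrough flag: {flag}")
    # Aliases are documented only for models advertising supports_passthrough.
    if flag in PASSTHROUGH_ALIASES and not model_info.get("supports_passthrough"):
        flag = "disable_formatting"
    return {**payload, flag: True}
```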

## Generated media retention

Generated images, videos, and audio are returned as signed URLs that are valid for **7 days**. After that, the URL stops working and the asset is gone — there is no re-sign endpoint. Save anything you want to keep before the 7-day window expires.
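To schedule downloads before the window closes, a client can compute the deadline from a creation timestamp. This is a sketch: `media_expiry` is a hypothetical helper, and using an ISO 8601 timestamp such as a job's `completed_at` as the URL's creation time is an assumption.

```python
from datetime import datetime, timedelta, timezone

MEDIA_TTL = timedelta(days=7)  # retention window stated above


def media_expiry(created_at: str) -> datetime:
    """Return the moment a signed URL created at `created_at` stops working."""
    # Accept a trailing "Z" by rewriting it as an explicit UTC offset.
    created = datetime.fromisoformat(created_at.replace("Z", "+00:00"))
    return created + MEDIA_TTL
```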

## Errors

OpenAI envelope on chat / responses / images / videos / audio / search / embeddings / reranks:

```json
{
  "error": {
    "message": "Aspect ratio is not supported by this model. Allowed: 16:9, 9:16, 1:1, ...",
    "type": "invalid_request_error",
    "code": "invalid_parameter",
    "param": "aspect_ratio"
  }
}
```

Anthropic envelope on `/v1/messages`:

```json
{ "type": "error", "error": { "type": "invalid_request_error", "message": "..." } }
```
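Clients that call both endpoint families can normalize the two envelopes into one structure. A minimal sketch based on the shapes shown above; `parse_error` is a hypothetical helper name, and distinguishing the envelopes by the top-level `"type": "error"` key is an assumption drawn from these two examples.

```python
def parse_error(body: dict) -> dict:
    """Flatten an OpenAI- or Anthropic-style error body into one dict."""
    err = body.get("error") or {}
    return {
        # The Anthropic envelope carries a top-level "type": "error".
        "style": "anthropic" if body.get("type") == "error" else "openai",
        "type": err.get("type"),
        "message": err.get("message"),
        "code": err.get("code"),    # OpenAI envelope only
        "param": err.get("param"),  # OpenAI envelope only
    }
```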

## Headers reference

| Header                        | Required                    | Purpose                                            |
| ----------------------------- | --------------------------- | -------------------------------------------------- |
| `Authorization` / `x-api-key` | yes                         | Bearer token authentication                        |
| `Content-Type`                | yes on POST                 | `application/json` or `multipart/form-data`        |
| `Accept`                      | no                          | `text/event-stream` to force SSE on chat endpoints |
| `anthropic-version`           | when calling `/v1/messages` | Anthropic API version (e.g. `2023-06-01`)          |

Browse the per-model parameter schemas under [Providers and Models](/providers). When you click into a specific model, every parameter the model accepts — type, default, range, allowed values, conditional flags — is documented in a table generated from the live database.

***

## openapi.yaml

```yaml

openapi: 3.1.0
info:
  title: EmpirioLabs AI REST API
  version: 1.0.0
  description: |
    OpenAI- and Anthropic-compatible REST API exposing 89 models from
    29 providers behind a single gateway. Drop in any SDK that targets
    OpenAI or Anthropic, point its `base_url` at
    `https://api.empiriolabs.ai`, and authenticate with your EmpirioLabs
    API key. No SDK changes required.

    The model catalog and per-model parameter schema are the single
    source of truth; fetch them any time at `GET /v1/models` or
    `GET /v1/models/<model-id>`.
  contact:
    name: EmpirioLabs Support
    url: https://empiriolabs.ai
    email: support@empiriolabs.ai
servers:
  - url: https://api.empiriolabs.ai
    description: Production
tags:
  - name: Models
    description: Live catalog with pricing, parameter schema, and capability flags.
  - name: Chat Completions
    description: OpenAI-compatible chat completions. Streaming, tool calling, vision, audio input, JSON mode, structured output.
  - name: Completions
    description: OpenAI-compatible prompt completions for models that advertise this endpoint.
  - name: Responses
    description: OpenAI Responses-compatible endpoint for stateful conversations.
  - name: Messages
    description: Anthropic Messages-compatible endpoint. Drop-in for Anthropic SDK clients.
  - name: Embeddings
    description: OpenAI-compatible embeddings. Batch up to 2048 inputs per call.
  - name: Reranks
    description: Semantic document reranking. Sorts candidate documents by relevance to a query — used to refine retrieval results in RAG and search pipelines.
  - name: Images
    description: Image generation, image edit, inpainting, outpainting, image variations.
  - name: Videos
    description: Async video generation. Returns a job_id; poll the jobs endpoint.
  - name: Generation Templates
    description: Image and video generation templates backed by the live catalog.
  - name: 3D Generation
    x-displayName: 3D Generation
    description: Async image-to-3D asset generation. Returns a job_id; poll the jobs endpoint for GLB output.
  - name: Audio
    description: Text-to-speech, music generation, podcast TTS with voice cloning.
  - name: Transcription
    description: Whisper / Deepgram / Parakeet. Multipart upload or file_url.
  - name: Search
    description: Exa, Tavily, Linkup, Perplexity Search. Domain filters, geo bias, date ranges.
  - name: Detection
    description: Specialized text-classification endpoints (e.g. GPTZero AI-detection, bibliography scan, source analysis).
  - name: Files
    description: Multipart upload of audio / image / video for use as input to other endpoints.
  - name: Account
    description: Account balance, usage history, request costs, and token counts.
  - name: Playground
    description: Read-only access to saved Playground conversations.
  - name: Jobs
    description: Poll, list, or cancel async generations. State retained 1 hour after completion.
  - name: GPU Cloud
    description: Deploy and manage GPU Cloud instances, then connect through the EmpirioLabs API.
  - name: Hosted Agents
    description: Deploy and manage hosted OpenClaw and Hermes Agent instances.
security:
  - bearerAuth: []
paths:
  # ── Models ────────────────────────────────────────────────────────────
  /v1/models:
    get:
      tags: [Models]
      summary: List models
      operationId: listModels
      security: []
      description: |
        Returns the full model catalog. Each model includes pricing,
        context window, capability flags, and the full parameter
        schema. This is the canonical source of truth: the docs site,
        playground, and SDK type hints all read from here. This endpoint
        is public and returns only tenant-agnostic model metadata.
      responses:
        '200':
          description: The complete model catalog.
          content:
            application/json:
              schema:
                type: object
                properties:
                  object:
                    type: string
                    example: list
                  data:
                    type: array
                    items:
                      $ref: '#/components/schemas/Model'
              examples:
                default:
                  value:
                    object: list
                    data:
                      - id: deepseek-v4-pro
                        object: model
                        owned_by: deepseek
                        description: 1M-context flagship reasoning model. Strong math, coding, multi-step planning.
                        context_window: 1000000
                        type: text
                        input_modalities: [text]
                        output_modalities: [text]
                        logo: https://media.empiriolabs.ai/model-logos/deepseek.png
                        is_new: true
                      - id: wan2-7-image
                        object: model
                        owned_by: alibaba
                        description: 4K text-to-image with Image Set mode (up to 12 cohesive images), Thinking Mode, and 9-image edit input.
                        context_window: null
                        type: image
                        input_modalities: [text, image]
                        output_modalities: [image]
                        logo: https://media.empiriolabs.ai/model-logos/wan.png
                        is_new: true
  /v1/models/{model_id}:
    get:
      tags: [Models]
      summary: Retrieve a model
      operationId: retrieveModel
      security: []
      description: |
        Returns the full per-model schema: pricing rows, parameter
        table (name, type, default, min/max, allowed values, descriptions,
        conditional flags), capability flags, and modalities. Same
        data the docs site and the playground render from. This endpoint
        is public and returns only tenant-agnostic model metadata.
      parameters:
        - in: path
          name: model_id
          required: true
          schema:
            type: string
            example: wan2-7-image
          description: Canonical model slug.
      responses:
        '200':
          description: Full model schema with parameters.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ModelDetail'
              examples:
                default:
                  value:
                    id: wan2-7-image
                    object: model
                    owned_by: alibaba
                    description: 4K text-to-image with Image Set mode, Thinking Mode, 9-image edit input.
                    type: image
                    logo: https://media.empiriolabs.ai/model-logos/wan.png
                    pricing_rows:
                      - charge: Wan 2.7 Image
                        spec: per image
                        rate: $0.030
                      - charge: Wan 2.7 Image Pro
                        spec: per image
                        rate: $0.075
                    parameters:
                      - name: prompt
                        type: string
                        required: true
                        description: Text prompt. For Image Set mode, describe each image in sequence.
                      - name: model
                        type: enum
                        required: false
                        default: wan2.7-image-pro
                        allowed_values: [wan2.7-image-pro, wan2.7-image]
                        description: Pro adds 4K + Thinking Mode + higher quality.
                      - name: aspect_ratio
                        type: enum
                        required: false
                        allowed_values: ["16:9", "1:1", "9:16", "3:2", "2:3", "4:3", "3:4", "5:4", "4:5"]
                      - name: resolution
                        type: enum
                        required: false
                        default: 2K
                        allowed_values: [1K, 2K, 4K]
                      - name: enable_sequential
                        type: boolean
                        required: false
                        default: false
                        description: Image Set mode — generate up to 12 cohesive related images.
        '404':
          $ref: '#/components/responses/NotFound'

  # ── Chat ─────────────────────────────────────────────────────────────
  /v1/chat/completions:
    post:
      tags: [Chat Completions]
      summary: Create a chat completion
      operationId: createChatCompletion
      description: |
        OpenAI-compatible chat completions. Pass any chat-capable model
        from the catalog as `model`. Supports streaming (SSE), tool
        calling, vision input, audio input, JSON mode, structured
        output, and provider-specific reasoning controls.

        Streaming uses Server-Sent Events with `data: ...` lines and a
        final `data: [DONE]`.
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/ChatRequest'
            examples:
              streaming:
                summary: Streaming completion
                value:
                  model: deepseek-v4-pro
                  messages:
                    - role: user
                      content: Summarize CRDT consistency in 3 bullets.
                  stream: true
                  temperature: 0.7
              tool_calling:
                summary: Tool / function calling
                value:
                  model: mistral-small-4
                  messages:
                    - role: user
                      content: What is the weather in Paris?
                  tools:
                    - type: function
                      function:
                        name: get_weather
                        parameters:
                          type: object
                          properties:
                            city:
                              type: string
                          required: [city]
                  tool_choice: auto
              vision:
                summary: Vision input
                value:
                  model: qwen3-5-omni-plus
                  messages:
                    - role: user
                      content:
                        - type: text
                          text: What is in this image?
                        - type: image_url
                          image_url:
                            url: https://example.com/photo.jpg
              json_mode:
                summary: JSON mode
                value:
                  model: deepseek-v4-pro
                  messages:
                    - role: user
                      content: 'Extract person from: John Doe, 32, NYC'
                  response_format:
                    type: json_object
              perplexity_with_geo:
                summary: Perplexity with geo + date filter
                value:
                  model: perplexity-sonar-pro
                  messages:
                    - role: user
                      content: Recent FDA approvals
                  search_context_size: high
                  search_recency_filter: week
                  country: US
      responses:
        '200':
          description: Completion (or SSE stream when `stream=true`).
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ChatResponse'
            text/event-stream:
              schema:
                type: string
                example: |
                  data: {"id":"chatcmpl-...","choices":[{"delta":{"content":"Hello"}}]}

                  data: [DONE]
        '400':
          $ref: '#/components/responses/BadRequest'
        '401':
          $ref: '#/components/responses/Unauthorized'
        '402':
          $ref: '#/components/responses/InsufficientCredits'
        '429':
          $ref: '#/components/responses/RateLimited'

  # ── Completions ──────────────────────────────────────────────────────
  /v1/completions:
    post:
      tags: [Completions]
      summary: Create a completion
      operationId: createCompletion
      description: |
        OpenAI-compatible prompt completions for clients that send a raw
        `prompt` instead of chat `messages`. Only models that advertise
        `POST /v1/completions` in `supported_endpoints` accept this shape.
        Streaming uses Server-Sent Events and includes usage when the model
        service reports it.
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/CompletionRequest'
            examples:
              streaming:
                summary: Streaming prompt completion
                value:
                  model: qwen3-5-9b
                  prompt: Write one concise launch sentence.
                  max_tokens: 64
                  stream: true
      responses:
        '200':
          description: Completion object or SSE stream when `stream=true`.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/CompletionResponse'
            text/event-stream:
              schema:
                type: string
                example: |
                  data: {"id":"cmpl-...","choices":[{"text":"Hello"}]}

                  data: [DONE]
        '400':
          $ref: '#/components/responses/BadRequest'
        '401':
          $ref: '#/components/responses/Unauthorized'
        '402':
          $ref: '#/components/responses/InsufficientCredits'
        '429':
          $ref: '#/components/responses/RateLimited'

  # ── Responses ────────────────────────────────────────────────────────
  /v1/responses:
    post:
      tags: [Responses]
      summary: Create a response
      operationId: createResponse
      description: OpenAI Responses-compatible endpoint for stateful conversations.
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/ResponseRequest'
      responses:
        '200':
          description: Response object.
          content:
            application/json:
              schema:
                type: object

  # ── Messages ─────────────────────────────────────────────────────────
  /v1/messages:
    post:
      tags: [Messages]
      summary: Create a message (Anthropic format)
      operationId: createMessage
      description: |
        Anthropic Messages-compatible endpoint. Drop-in for any
        Anthropic SDK client. The gateway translates Anthropic ↔ OpenAI
        internally so the same model works under either shape.
        `tool_use` / `tool_result` blocks round-trip cleanly.
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/MessageRequest'
            examples:
              basic:
                value:
                  model: deepseek-v4-pro
                  max_tokens: 1024
                  messages:
                    - role: user
                      content: Hi!
      responses:
        '200':
          description: Message response.
          content:
            application/json:
              schema:
                type: object

  # ── Embeddings ───────────────────────────────────────────────────────
  /v1/embeddings:
    post:
      tags: [Embeddings]
      summary: Create embeddings
      operationId: createEmbeddings
      description: OpenAI-compatible embeddings. Batch up to 2048 inputs per call.
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              required: [model, input]
              properties:
                model:
                  type: string
                  example: text-embedding-v4
                input:
                  oneOf:
                    - type: string
                    - type: array
                      items:
                        type: string
                  example: ["Sentence one.", "Sentence two."]
                encoding_format:
                  type: string
                  enum: [float, base64]
                  default: float
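            # Illustrative request example: a sketch added for clarity, not
            # an authoritative payload. Values reuse the per-field examples
            # from the schema above; `encoding_format` is optional and is
            # shown here at its documented default.
            examples:
              batch:
                summary: Batch of two inputs
                value:
                  model: text-embedding-v4
                  input:
                    - Sentence one.
                    - Sentence two.
                  encoding_format: float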
      responses:
        '200':
          description: Embedding vectors.
          content:
            application/json:
              schema:
                type: object
                properties:
                  object:
                    type: string
                    example: list
                  data:
                    type: array
                    items:
                      type: object
                      properties:
                        object:
                          type: string
                          example: embedding
                        index:
                          type: integer
                        embedding:
                          type: array
                          items:
                            type: number
                  model:
                    type: string
                  usage:
                    $ref: '#/components/schemas/Usage'

  # ── Reranks ──────────────────────────────────────────────────────────
  /v1/reranks:
    post:
      tags: [Reranks]
      summary: Rerank documents
      operationId: createReranks
      description: |
        Sort candidate `documents` by semantic relevance to a `query`. Returns each
        document's original index plus a 0-1 relevance score (higher = more
        relevant). Scores are relative to a single request and are not
        comparable across requests.

        Built for RAG and search refinement: feed in the top-N hits from your
        vector store / BM25 / hybrid retriever, and the model returns a tighter,
        better-ordered shortlist.
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              required: [model, query, documents]
              properties:
                model:
                  type: string
                  example: qwen3-rerank
                query:
                  type: string
                  description: Query text. Max 4,000 tokens.
                  example: What is a rerank model?
                documents:
                  type: array
                  description: Candidate documents to sort. Max 500 items, each up to 4,000 tokens.
                  items:
                    type: string
                  example:
                    - "Rerank models sort candidate documents by relevance."
                    - "Quantum computing is a cutting-edge field of computer science."
                    - "Pre-trained language models advanced rerank models."
                top_n:
                  type: integer
                  description: Number of top-ranked documents to return. Defaults to all.
                  minimum: 1
                  maximum: 500
                  example: 2
                instruct:
                  type: string
                  description: |
                    Custom English instruction guiding the sort. Defaults to a Q&A
                    retrieval policy. Pass `"Retrieve semantically similar text."` for
                    similarity sorting.
                  example: Given a web search query, retrieve relevant passages that answer the query.
                return_documents:
                  type: boolean
                  description: When true, return the original document text alongside each result.
                  default: false
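            # Illustrative request example: a sketch added for clarity, not
            # an authoritative payload. Values reuse the per-field examples
            # from the schema above. `top_n: 2` keeps the two best of the
            # three candidates, and `return_documents: true` asks for each
            # document's text alongside its score.
            examples:
              top_two:
                summary: Keep the two most relevant documents
                value:
                  model: qwen3-rerank
                  query: What is a rerank model?
                  documents:
                    - Rerank models sort candidate documents by relevance.
                    - Quantum computing is a cutting-edge field of computer science.
                    - Pre-trained language models advanced rerank models.
                  top_n: 2
                  return_documents: true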
      responses:
        '200':
          description: Sorted relevance scores.
          content:
            application/json:
              schema:
                type: object
                properties:
                  output:
                    type: object
                    properties:
                      results:
                        type: array
                        items:
                          type: object
                          properties:
                            index:
                              type: integer
                              description: Original index of the document in the input array.
                            relevance_score:
                              type: number
                              format: double
                              description: 0-1 score; higher = more relevant. Not comparable across requests.
                            document:
                              type: object
                              description: Returned only when `return_documents` is true.
                              properties:
                                text:
                                  type: string
                  usage:
                    $ref: '#/components/schemas/Usage'
                  request_id:
                    type: string

  # ── Images ───────────────────────────────────────────────────────────
  /v1/images/generations:
    post:
      tags: [Images]
      summary: Generate an image
      operationId: createImage
      description: |
        Generate, edit, inpaint, or produce variations of images.
        Outputs are hosted CDN URLs (signed for 7 days) by default, or
        inline `b64_json` when `response_format` requests it.

        By default the request is asynchronous: the response is a
        `job_id` plus polling URL, and you poll `GET /v1/jobs/<job-id>`
        for the finished images. This is the right mode for long or
        batched generations.

        Pass `sync: true` to hold the request open instead and receive
        the finished OpenAI-compatible image response (`created` +
        `data[]`) directly, with no polling. This is what OpenAI SDKs
        and OpenAI-compatible tools expect from `images.generate`. If a
        synchronous generation runs past the wait window (about four
        minutes), the response falls back to the async `job_id`
        envelope and the job keeps running.

        Image-edit flows accept `image: ["https://..."]` with up to the
        model's documented limit (3 for `qwen-image-2-0`, 9 for
        `wan2-7-image`, 14 for `seedream-5-0-lite`). Image-set modes
        generate cohesive image series; see each model's page for the
        toggle.

        Pass `template: "<slug>"` to apply a catalog-backed image
        template. Browse image templates with `GET /v1/images/templates`
        or all generation templates with `GET /v1/templates`.
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              properties:
                model:
                  type: string
                  example: wan2-7-image
                prompt:
                  type: string
                  example: A glass cathedral at sunset, dramatic lighting
                template:
                  type: string
                  description: |
                    Optional image template slug. When provided, the
                    canonical prompt blends with your `prompt`, the
                    recommended model is chosen if you don't pass one,
                    and default params merge for any key you didn't set.
                    See `GET /v1/images/templates`.
                  example: studio-product-shot
                aspect_ratio:
                  type: string
                  example: "16:9"
                resolution:
                  type: string
                  example: 4K
                num_images:
                  type: integer
                  minimum: 1
                  maximum: 4
                  default: 1
                image:
                  type: array
                  items:
                    type: string
                    format: uri
                  description: Reference image URLs for image-edit mode.
                response_format:
                  type: string
                  enum: [url, b64_json]
                  default: url
                sync:
                  type: boolean
                  default: false
                  description: |
                    Hold the request and return the finished image
                    response directly instead of the async `job_id`
                    envelope. Alias: `wait`. Falls back to the async
                    envelope if the generation outlasts the wait window.
            examples:
              text_to_image:
                summary: Text-to-image
                value:
                  model: wan2-7-image
                  prompt: A glass cathedral at sunset, dramatic lighting
                  aspect_ratio: "16:9"
                  resolution: 4K
                  thinking_mode: true
                  num_images: 4
              image_edit:
                summary: Image edit (input images)
                value:
                  model: qwen-image-2-0
                  prompt: Add a soft pink sunset to the sky
                  image:
                    - https://example.com/source.jpg
                  num_images: 2
              image_template:
                summary: Image template
                value:
                  template: background-swap
                  prompt: Place this product on a brushed steel studio plinth
                  image:
                    - https://example.com/product.jpg
              image_set:
                summary: Image set (cohesive series)
                value:
                  model: wan2-7-image
                  prompt: 'First image: morning light. Second image: midday. Third: dusk.'
                  enable_sequential: true
                  num_images_set: 3
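              # Illustrative sketch of the synchronous mode described above:
              # `sync: true` holds the request open and returns the finished
              # image response instead of a `job_id` envelope. The model and
              # prompt are reused from the text-to-image example; this entry
              # is an added illustration, not an authoritative payload.
              sync_request:
                summary: Synchronous generation (no polling)
                value:
                  model: wan2-7-image
                  prompt: A glass cathedral at sunset, dramatic lighting
                  sync: true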
      responses:
        '200':
          description: |
            Async (default): a `job_id` envelope to poll. With
            `sync: true`: the finished image response.
          content:
            application/json:
              schema:
                oneOf:
                  - $ref: '#/components/schemas/JobAccepted'
                  - $ref: '#/components/schemas/ImageResponse'

  # ── Videos ───────────────────────────────────────────────────────────
  /v1/videos/generations:
    post:
      tags: [Videos]
      summary: Generate a video (async)
      operationId: createVideo
      description: |
        Always async. Returns a `job_id` and polling URL immediately;
        poll `GET /v1/jobs/<job-id>` for the final video URL.

        Video jobs typically complete in 1–15 minutes; 4K-tier video
        can take 30+ minutes.

        ### Templates and Extend

        Two optional fields can accompany the standard video generation payload:

        - `template: "<slug>"` applies a pre-curated creative effect
          (canonical prompt + recommended model + default params).
          Browse video templates with `GET /v1/videos/templates`
          or all generation templates with `GET /v1/templates`.
        - `extend_from: { job_id | video_url }` continues a prior
          video. Works on every supported video model; EmpirioLabs
          handles the model-specific wiring for you.
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              properties:
                model:
                  type: string
                  example: seedance-2-0-pro
                prompt:
                  type: string
                template:
                  type: string
                  description: |
                    Optional template slug. When provided, the
                    canonical prompt blends with your `prompt`, the
                    recommended model is chosen if you don't pass
                    one, and default params merge for any key you
                    didn't supply. The request fails validation with
                    a 400 if the chosen model is not in
                    `template.supported_models`.
                    See `GET /v1/videos/templates`.
                  example: baseball-stadium
                extend_from:
                  type: object
                  description: |
                    Continue a prior video. Pass exactly one of
                    `job_id` (any prior async job from this account)
                    or `video_url` (an http(s) URL to a video).
                    Works with every supported video model;
                    EmpirioLabs handles the model-specific wiring
                    for you.
                  properties:
                    job_id:
                      type: string
                      example: job_01HV3KABCDE
                    video_url:
                      type: string
                      format: uri
                resolution:
                  type: string
                  example: 1080p
                aspect_ratio:
                  type: string
                  example: "16:9"
                duration:
                  type: integer
                  example: 8
                generate_audio:
                  type: boolean
                  default: true
                image:
                  type: string
                  format: uri
                  description: Reference image URL for image-to-video modes.
                image_url:
                  type: string
                  format: uri
                  description: Alias for `image`. Accepted on any image-to-video model.
            examples:
              t2v:
                summary: Text-to-video
                value:
                  model: seedance-2-0-pro
                  prompt: A cinematic dolly shot of a city street at dusk in the rain
                  resolution: 1080p
                  aspect_ratio: "16:9"
                  duration: 8
                  generate_audio: true
              i2v:
                summary: Image-to-video
                value:
                  model: pixverse-v5-6
                  mode: i2v
                  prompt: Slowly zoom out and pan right
                  image: https://example.com/source.jpg
                  resolution: 720p
                  duration: 5
                  audio: true
              template:
                summary: Creative-effect template (Stadium Cam)
                value:
                  template: baseball-stadium
                  image_url: https://example.com/me.jpg
              template_with_model:
                summary: Template with explicit model override
                value:
                  template: action-hero
                  model: seedance-2-0-pro
                  image_url: https://example.com/subject.jpg
              extend:
                summary: Continue a prior video (works on any model)
                value:
                  extend_from:
                    job_id: job_01HV3KABCDE
              extend_compose:
                summary: Extend + apply a new template
                value:
                  template: cinematic-walk
                  extend_from:
                    job_id: job_01HV3KABCDE
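              # Illustrative sketch of the other `extend_from` form: the
              # schema accepts exactly one of `job_id` or `video_url`, and the
              # examples above only show `job_id`. The URL is a placeholder,
              # and this entry is an added illustration, not an authoritative
              # payload.
              extend_from_url:
                summary: Continue a video from a URL
                value:
                  extend_from:
                    video_url: https://example.com/prior-clip.mp4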
      responses:
        '202':
          description: Job created. Poll the jobs endpoint for the result.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/JobAccepted'
              examples:
                default:
                  value:
                    job_id: job_01HV3KABCDE
                    status: processing
                    poll_url: /v1/jobs/job_01HV3KABCDE
                    created_at: '2026-05-02T17:11:32Z'

  /v1/templates:
    get:
      tags: ["Generation Templates"]
      summary: List generation templates
      operationId: listGenerationTemplates
      description: |
        Returns the catalog of image and video generation templates.
        Each template carries a canonical prompt, recommended model,
        supported model list, default parameters, required input shape,
        modality, and optional preview assets.

        New templates land in production within a minute of being added
        to the catalog.
      parameters:
        - in: query
          name: category
          schema:
            type: string
          description: Filter to a single category.
        - in: query
          name: modality
          schema:
            type: string
            enum: [video, image]
          description: Filter by template modality.
        - in: query
          name: type
          schema:
            type: string
            enum: [video, image]
          description: Alias for `modality`.
        - in: query
          name: model
          schema:
            type: string
          description: Only return templates whose `supported_models` list contains this model slug.
        - in: query
          name: featured
          schema:
            type: boolean
          description: Set to true to filter to featured templates only.
      responses:
        '200':
          description: List of templates.
          content:
            application/json:
              schema:
                type: object
                properties:
                  object:
                    type: string
                    example: list
                  template_count:
                    type: integer
                  data:
                    type: array
                    items:
                      $ref: '#/components/schemas/GenerationTemplate'

  /v1/templates/{slug}:
    get:
      tags: ["Generation Templates"]
      summary: Get a generation template by slug
      operationId: getGenerationTemplate
      parameters:
        - in: path
          name: slug
          required: true
          schema:
            type: string
          example: studio-product-shot
      responses:
        '200':
          description: Template detail.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/GenerationTemplate'
        '404':
          description: Template not found.

  /v1/images/templates:
    get:
      tags: [Images]
      summary: List image templates
      operationId: listImageTemplates
      description: |
        Returns active templates for `/v1/images/generations`.
        The response shape matches `GET /v1/templates`, filtered to
        `modality: "image"` by default.
      parameters:
        - in: query
          name: category
          schema:
            type: string
          description: Filter to a single category.
        - in: query
          name: model
          schema:
            type: string
          description: Only return templates whose `supported_models` list contains this model slug.
        - in: query
          name: featured
          schema:
            type: boolean
          description: Set to true to filter to featured templates only.
      responses:
        '200':
          description: List of image templates.
          content:
            application/json:
              schema:
                type: object
                properties:
                  object:
                    type: string
                    example: list
                  template_count:
                    type: integer
                  data:
                    type: array
                    items:
                      $ref: '#/components/schemas/GenerationTemplate'

  /v1/images/templates/{slug}:
    get:
      tags: [Images]
      summary: Get an image template by slug
      operationId: getImageTemplate
      parameters:
        - in: path
          name: slug
          required: true
          schema:
            type: string
          example: background-swap
      responses:
        '200':
          description: Template detail.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/GenerationTemplate'
        '404':
          description: Template not found.

  /v1/videos/templates:
    get:
      tags: [Videos]
      summary: List creative-effect templates
      operationId: listVideoTemplates
      description: |
        Returns the catalog of video generation templates ("creative
        effects"). Each template carries a canonical motion/style
        prompt, a recommended model, the list of supported models,
        default parameters (aspect ratio, duration, etc.), and the
        required input shape (most templates need a reference image
        of your subject).

        New templates land in production within a minute of being
        added to the catalog.
      parameters:
        - in: query
          name: category
          schema:
            type: string
            enum: [viral, cinematic, motion, transform, social, extend]
          description: Filter to a single category.
        - in: query
          name: modality
          schema:
            type: string
            enum: [video, image]
          description: Filter by template modality. Defaults to `video` on this endpoint.
        - in: query
          name: model
          schema:
            type: string
          description: Only return templates whose `supported_models` list contains this model slug.
        - in: query
          name: featured
          schema:
            type: boolean
          description: Set to true to filter to featured templates only.
      responses:
        '200':
          description: List of templates.
          content:
            application/json:
              schema:
                type: object
                properties:
                  object:
                    type: string
                    example: list
                  template_count:
                    type: integer
                  data:
                    type: array
                    items:
                      $ref: '#/components/schemas/GenerationTemplate'

  /v1/videos/templates/{slug}:
    get:
      tags: [Videos]
      summary: Get a single creative-effect template by slug
      operationId: getVideoTemplate
      parameters:
        - in: path
          name: slug
          required: true
          schema:
            type: string
          example: baseball-stadium
      responses:
        '200':
          description: Template detail.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/GenerationTemplate'
        '404':
          description: Template not found.

  # ── 3D assets ────────────────────────────────────────────────────────
  /v1/3d/generations:
    post:
      tags: [3D Generation]
      summary: Generate a 3D asset (async)
      operationId: create3DGeneration
      description: |
        Image-to-3D generation. Returns a `job_id` and polling URL
        immediately; poll `GET /v1/jobs/{job_id}` for the final signed
        GLB asset URL.

        `trellis-2-4b` accepts an input image plus resolution, seed,
        sampler, texture, and mesh export controls. See the model page
        for the full parameter surface.
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              required: [model]
              properties:
                model:
                  type: string
                  example: trellis-2-4b
                image:
                  anyOf:
                    - type: string
                      format: uri
                    - type: string
                      description: Base64 data URI or inline image bytes.
                  description: Reference image for image-to-3D generation.
                image_url:
                  type: string
                  format: uri
                  description: Alias for `image` when passing a public URL.
                resolution:
                  type: string
                  enum: ["512", "1024", "1536"]
                  default: "1024"
                seed:
                  type: integer
                  minimum: 0
                  maximum: 2147483647
                  default: 42
                texture_size:
                  type: string
                  enum: ["1024", "2048", "4096"]
                  default: "2048"
                decimation_target:
                  type: integer
                  minimum: 10000
                  maximum: 1000000
                  default: 500000
                remesh:
                  type: boolean
                  default: true
                max_num_tokens:
                  type: integer
                  minimum: 1024
                  maximum: 131072
                  default: 49152
                response_format:
                  type: string
                  enum: [url, b64_json]
                  default: url
            examples:
              image_to_3d:
                summary: Image-to-3D GLB
                value:
                  model: trellis-2-4b
                  image_url: https://example.com/product-photo.png
                  resolution: "1024"
                  texture_size: "2048"
                  decimation_target: 500000
                  seed: 42
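              # Illustrative variant, not from the upstream spec: passes the
              # reference image inline through `image` and requests a lighter
              # mesh. Every value is an assumed choice within the ranges the
              # schema above declares; the data URI payload is a placeholder.
              image_inline:
                summary: Image-to-3D from an inline data URI (illustrative)
                value:
                  model: trellis-2-4b
                  image: "data:image/png;base64,<BASE64_PNG_BYTES>"
                  resolution: "512"
                  texture_size: "1024"
                  decimation_target: 100000
                  remesh: true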
      responses:
        '202':
          description: Job created. Poll the jobs endpoint for the result.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/JobAccepted'
              examples:
                default:
                  value:
                    job_id: job_01HV3KABCDE
                    status: processing
                    poll_url: /v1/jobs/job_01HV3KABCDE
                    created_at: '2026-05-07T16:20:00Z'

  # ── Audio (TTS / music) ──────────────────────────────────────────────
  /v1/audio/speech:
    post:
      tags: [Audio]
      summary: Generate audio (TTS, music, podcast)
      operationId: createSpeech
      description: |
        Text-to-speech, music generation, and multi-speaker podcast
        TTS share this endpoint. Returns a hosted URL by default; pass
        `response_format: "b64_json"` for inline audio bytes.
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              required: [model]
              properties:
                model:
                  type: string
                  example: gemini-2-5-flash-tts
                input:
                  type: string
                  description: Script / lyrics. Use [S1] / [S2] tags for multi-speaker models.
                prompt:
                  type: string
                  description: Music generation models use `prompt` instead of `input`.
                voice:
                  type: string
                  example: Charon
                output_format:
                  type: string
                  example: WAV
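                duration:
                  type: integer
                  description: Music generation only. Output length in seconds.
                response_format:
                  type: string
                  example: b64_json
                  description: Set to `b64_json` to receive inline audio bytes instead of the default hosted URL.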
            examples:
              tts_single:
                summary: Single-voice TTS (Gemini)
                value:
                  model: gemini-2-5-flash-tts
                  input: Hello from EmpirioLabs.
                  voice: Charon
                  output_format: WAV
              podcast:
                summary: Multi-speaker podcast (SoulX)
                value:
                  model: soulx-podcast
                  input: "[S1] Welcome to the show. [S2] Glad to be here. [S1] Let's dive in."
                  voice_s1: arthur
                  voice_s2: lj
                  output_format: mp3
              music:
                summary: Music generation
                value:
                  model: stable-audio-2-5
                  prompt: Lo-fi hip hop, mellow piano, gentle vinyl crackle, 90 BPM
                  duration: 60
                  steps: 8
                  cfg_scale: 1
      responses:
        '200':
          description: Audio response (URL by default, or inline bytes).
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/AudioResponse'

  /v1/audio/speech:stream:
    post:
      tags: [Audio]
      summary: Stream audio (real-time TTS)
      operationId: createSpeechStream
      description: |
        Streaming variant of `/v1/audio/speech`. Returns Server-Sent
        Events as the model synthesizes — sub-130ms time-to-first-byte
        on Inworld TTS Mini, sub-250ms on Max. Use this for voice
        agents and other interactive playback paths.

        Event types:
          * `audio.chunk`: base64-encoded audio bytes (`audio_b64`) with
            `seq`, `format`, `sample_rate`, and `elapsed_ms` fields.
            Concatenate client-side for progressive playback.
          * `audio.timestamps`: emitted when `timestamp_type` is
            `WORD` or `CHARACTER`, carrying per-word or per-character
            timing info.
          * `audio.done`: final event with the assembled hosted URL,
            total duration, and usage payload for billing.
          * `audio.error`: emitted only on upstream errors; followed
            by `[DONE]`.
          * `[DONE]`: stream terminator. This is the OpenAI-style
            sentinel rather than part of the SSE standard itself.

        Currently supported on Inworld TTS Mini / Max. Other TTS models
        use the synchronous `/v1/audio/speech` endpoint. Cancelled
        streams are billed for the characters synthesized up to the
        cancellation point.
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              required: [model, input]
              properties:
                model:
                  type: string
                  example: tts-1-5-mini
                input:
                  type: string
                  description: Text to synthesize. Max 3000 characters per request.
                voice:
                  type: string
                  example: Sarah
                voice_id:
                  type: string
                  description: Free-form voice ID. Use it to address any voice in the full catalog (271+ voices) returned by `GET /v1/voices`.
                language:
                  type: string
                  example: en-US
                output_format:
                  type: string
                  enum: [WAV, MP3, OGG_OPUS, FLAC, PCM]
                  default: WAV
                sample_rate:
                  type: string
                  enum: ["8000", "16000", "22050", "24000", "32000", "44100", "48000"]
                  default: "24000"
                speed:
                  type: number
                  default: 1.0
                temperature:
                  type: number
                  default: 1.0
                timestamp_type:
                  type: string
                  enum: [NONE, WORD, CHARACTER]
                  default: NONE
            examples:
              streaming_tts:
                summary: Streaming TTS for a voice agent
                value:
                  model: tts-1-5-mini
                  input: Hello — this audio streams as it's synthesized.
                  voice: Sarah
                  output_format: PCM
                  sample_rate: "24000"
                  timestamp_type: WORD
      responses:
        '200':
          description: Server-Sent Events stream (text/event-stream).
          content:
            text/event-stream:
              schema:
                type: string
                example: |
                  data: {"type":"audio.chunk","seq":0,"audio_b64":"...","format":"pcm","sample_rate":24000,"elapsed_ms":118}

                  data: {"type":"audio.chunk","seq":1,"audio_b64":"...","format":"pcm","sample_rate":24000,"elapsed_ms":206}

                  data: {"type":"audio.timestamps","seq":2,"timestamp_info":{"words":[{"text":"Hello","start_ms":120,"end_ms":480}]}}

                  data: {"type":"audio.done","url":"https://media.empiriolabs.ai/...","duration_seconds":3.2,"usage":{"processed_characters":47,"cost_usd":0.0006}}

                  data: [DONE]

  /v1/voices:
    get:
      tags: [Audio]
      summary: List available TTS voices
      operationId: listVoices
      description: |
        Returns the catalog of voices supported by streaming-capable
        TTS workers (currently Inworld TTS Mini / Max — 271+ named
        voices across 15 languages, regional accents, gendered
        variants, and character voices). Use `voice_id` from the
        response to address any voice on `/v1/audio/speech` or
        `/v1/audio/speech:stream`.
      parameters:
        - in: query
          name: model
          required: false
          schema:
            type: string
          description: Filter to voices supported by a specific TTS model.
      responses:
        '200':
          description: Voice catalog.
          content:
            application/json:
              schema:
                type: object
                properties:
                  object:
                    type: string
                    example: list
                  data:
                    type: array
                    items:
                      type: object
                      properties:
                        voice_id:
                          type: string
                        name:
                          type: string
                        languages:
                          type: array
                          items: { type: string }
                        gender:
                          type: string
                        preview_url:
                          type: string

  /v1/audio/generations:
    post:
      tags: [Audio]
      summary: Generate music / sound effects
      operationId: createAudioGeneration
      description: |
        Music / podcast / sound-effect generation. Distinct from
        `/v1/audio/speech` (TTS) — this endpoint covers Stable Audio,
        GLM TTS, MOSS, and SoulX Podcast where the prompt-to-audio
        path is generative rather than spoken-word.

        Long-running generations (> 60s) return `202 Accepted` with a
        `job_id` and `poll_url` — poll `/v1/jobs/{job_id}` until
        complete.
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              required: [model]
              properties:
                model:
                  type: string
                  example: stable-audio-2-5
                prompt:
                  type: string
                  description: Natural-language description of the audio to generate.
                input:
                  type: string
                  description: Script / lyrics for podcast / multi-speaker models.
                duration:
                  type: number
                  description: Output length in seconds.
                steps:
                  type: integer
                  description: Number of denoising steps. Higher = better quality but slower.
                cfg_scale:
                  type: number
                  description: Prompt adherence. Higher = closer to prompt but less creative.
                seed:
                  type: integer
                  description: Random seed for reproducibility.
                output_format:
                  type: string
                  enum: [mp3, wav, flac, ogg]
            examples:
              music:
                summary: Music generation (Stable Audio)
                value:
                  model: stable-audio-2-5
                  prompt: Lo-fi hip hop, mellow piano, gentle vinyl crackle, 90 BPM
                  duration: 60
                  steps: 8
                  cfg_scale: 1
              podcast:
                summary: Multi-speaker podcast (SoulX)
                value:
                  model: soulx-podcast
                  input: '[S1] Welcome to the show. [S2] Glad to be here.'
                  voice_s1: arthur
                  voice_s2: lj
                  output_format: mp3
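              # Illustrative sketch, not from the upstream spec: a short
              # sound-effect request that uses only parameters declared in
              # the schema above. The prompt, duration, and seed are assumed
              # values, and sound-effect support on this model is an assumption.
              sound_effect:
                summary: Short sound effect (illustrative)
                value:
                  model: stable-audio-2-5
                  prompt: Heavy wooden door creaking open in a stone hallway
                  duration: 5
                  seed: 7
                  output_format: wav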
      responses:
        '200':
          description: Audio response (URL by default, or inline bytes).
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/AudioResponse'
        '202':
          description: Accepted — long-running job. Poll `/v1/jobs/{job_id}` until complete.
          content:
            application/json:
              schema:
                type: object
                properties:
                  job_id: { type: string }
                  status: { type: string, example: processing }
                  poll_url: { type: string, example: /v1/jobs/job_abc123 }

  # ── Transcription ────────────────────────────────────────────────────
  /v1/audio/transcriptions:
    post:
      tags: [Transcription]
      summary: Transcribe an audio file
      operationId: createTranscription
      description: |
        Transcribes audio with Whisper, Deepgram, or Parakeet models.
        Accepts either a multipart `file` upload or a JSON body with
        `file_url`. Long files (over 5 minutes) are routed automatically
        to the async job system, and the response then carries a
        `job_id` instead of inline text.
      requestBody:
        required: true
        content:
          multipart/form-data:
            schema:
              type: object
              required: [model, file]
              properties:
                model:
                  type: string
                  example: openai-whisper-1
                file:
                  type: string
                  format: binary
                response_format:
                  type: string
                  enum: [json, text, srt, verbose_json, vtt]
                  default: json
                timestamp_granularities:
                  type: string
                  example: word,segment
                language:
                  type: string
                  example: en
                translate:
                  type: boolean
                  default: false
          application/json:
            schema:
              type: object
              required: [model, file_url]
              properties:
                model:
                  type: string
                  example: deepgram-nova-3
                file_url:
                  type: string
                  format: uri
                  example: https://example.com/recording.wav
                diarize:
                  type: boolean
                  default: false
                smart_format:
                  type: boolean
                  default: true
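            # Illustrative example assembled from the example values in the
            # schema above; `diarize: true` is an assumed choice that shows
            # how the optional flags sit next to `file_url`.
            examples:
              url_transcription:
                summary: Transcribe a hosted file by URL (illustrative)
                value:
                  model: deepgram-nova-3
                  file_url: https://example.com/recording.wav
                  diarize: true
                  smart_format: true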
      responses:
        '200':
          description: Transcript (or job_id for long files).
          content:
            application/json:
              schema:
                oneOf:
                  - type: object
                    properties:
                      text:
                        type: string
                      language:
                        type: string
                      duration:
                        type: number
                      segments:
                        type: array
                        items:
                          type: object
                  - $ref: '#/components/schemas/JobAccepted'

  /v1/search:
    post:
      tags: [Search]
      summary: Search and research
      operationId: createSearch
      description: |
        Unified surface for retrieval-style models: Exa, Tavily,
        Linkup, Perplexity Search. Each model has a different parameter
        surface — see the per-model docs for the full schema (e.g.
        `exa-search` exposes 28 params including `category`, `livecrawl`,
        `subpages`, `summary_query`, `code_tokens`).
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              required: [model, query]
              properties:
                model:
                  type: string
                  example: tavily-search
                query:
                  type: string
            examples:
              tavily_advanced:
                summary: Tavily advanced search
                value:
                  model: tavily-search
                  query: latest CRDT research papers 2026
                  search_depth: advanced
                  include_answer: advanced
                  max_results: 10
                  topic: general
              exa_neural:
                summary: Exa neural search with subpages
                value:
                  model: exa-search
                  query: alternatives to vector databases
                  search_type: neural
                  category: research paper
                  subpages: 3
                  summary: true
              linkup_with_domains:
                summary: Linkup with domain filter
                value:
                  model: linkup-deep-search
                  query: LLM benchmarks
                  output_type: sourcedAnswer
                  domain_filter_mode: Include
                  include_domains: arxiv.org, openreview.net
                  enable_inline_citations: true
      responses:
        '200':
          description: Sources / answer / structured data depending on model.
          content:
            application/json:
              schema:
                type: object

  /v1/research:
    post:
      tags: [Search]
      summary: Long-running research with citations
      operationId: createResearch
      description: |
        Deep research models: Tavily Research, Exa Research, Perplexity
        Deep Research, Perplexity Advanced Deep Research. The model
        plans search queries, fetches pages, and synthesizes a
        multi-section report with inline citations. Typical runtime is
        30 seconds to 15 minutes, depending on `reasoning_effort`.

        Returns `202 Accepted` with a `job_id` and `poll_url` — poll
        `/v1/jobs/{job_id}` until complete. For streaming progress
        (`*Thinking...*` blockquote of search steps + sources + answer),
        call `POST /v1/chat/completions` with `stream: true` on the same
        model instead — the gateway routes both surfaces to the same
        worker.
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              required: [model]
              properties:
                model:
                  type: string
                  example: perplexity-advanced-deep-research
                query:
                  type: string
                  description: The research question.
                messages:
                  type: array
                  description: Alternative to `query` — pass a chat-style messages array for multi-turn research.
                  items:
                    type: object
                reasoning_effort:
                  type: string
                  enum: [low, medium, high]
                  default: high
                max_output_tokens:
                  type: integer
                  default: 10000
                search_domain_filter:
                  type: string
                  description: Comma-separated domains. Prefix with '-' to exclude. Max 20.
                search_recency_filter:
                  type: string
                  enum: [hour, day, week, month, year]
                disable_formatting:
                  type: boolean
                  default: false
                  description: Skip EmpirioLabs Markdown formatting and return the raw upstream payload.
            examples:
              perplexity_adv:
                summary: Perplexity Advanced Deep Research
                value:
                  model: perplexity-advanced-deep-research
                  query: State of humanoid-robot startup funding in 2025
                  reasoning_effort: high
                  max_output_tokens: 14000
              tavily:
                summary: Tavily Research
                value:
                  model: tavily-research
                  query: latest US economic indicators and Fed sentiment
                  citation_format: footnote
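              # Illustrative sketch, not from the upstream spec: the schema
              # only declares `messages` items as `type: object`, so the
              # role/content shape below is assumed to follow the
              # chat-completions convention. Message text is placeholder.
              multi_turn:
                summary: Multi-turn research via `messages` (illustrative)
                value:
                  model: perplexity-advanced-deep-research
                  messages:
                    - role: user
                      content: State of humanoid-robot startup funding in 2025
                    - role: assistant
                      content: (previous report text)
                    - role: user
                      content: Narrow this to companies that raised in the last quarter.
                  reasoning_effort: medium
                  search_recency_filter: year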
      responses:
        '200':
          description: Research report (when run synchronously and short enough).
          content:
            application/json:
              schema:
                type: object
        '202':
          description: Accepted — long-running job. Poll `/v1/jobs/{job_id}` until complete.
          content:
            application/json:
              schema:
                type: object
                properties:
                  job_id: { type: string }
                  status: { type: string, example: processing }
                  poll_url: { type: string, example: /v1/jobs/job_abc123 }

  /v1/answer:
    post:
      tags: [Search]
      summary: One-shot grounded answer
      operationId: createAnswer
      description: |
        Single-call grounded answer endpoint. Currently powers Exa Answer
        and similar models that return a synthesized answer with inline
        citations in one round-trip, with no separate fetch / crawl
        steps. Responses are not streamed unless `stream: true` is set.
        For multi-step research use `/v1/research`; for raw retrieval
        use `/v1/search`.
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              required: [model, query]
              properties:
                model:
                  type: string
                  example: exa-answer
                query:
                  type: string
                stream:
                  type: boolean
                  default: false
                  description: Stream the answer text as it's generated (SSE).
            examples:
              exa_answer:
                summary: Exa Answer
                value:
                  model: exa-answer
                  query: When was the last total solar eclipse visible from the US mainland?
      responses:
        '200':
          description: Synthesized answer with citations.
          content:
            application/json:
              schema:
                type: object
                properties:
                  answer: { type: string }
                  citations:
                    type: array
                    items:
                      type: object
                      properties:
                        title: { type: string }
                        url: { type: string }
                        snippet: { type: string }
                  usage:
                    $ref: '#/components/schemas/Usage'

  # ── Agents (autonomous task runners) ─────────────────────────────────
  /v1/agents/run:
    post:
      tags: [Agents]
      summary: Start an agent task or send a follow-up message
      operationId: runAgent
      description: |
        Submits work to an autonomous, long-running agent (file browsing,
        research, code execution, document drafting). Currently powers
        Manus.

        One endpoint covers both creating a new task and continuing an
        existing one:

          - Omit `task_id` to start a fresh task. The response carries
            the new `task_id` you can poll later.
          - Include `task_id` to send a follow-up message to a task
            that is already running. The agent picks up the new input
            on its next reasoning step.

        Lifecycle for a typical interactive flow:
          1. `POST /v1/agents/run` with `input` → `{ task_id }`.
          2. Poll `GET /v1/agents/{task_id}` until `status` reaches a
             terminal state (`completed`, `failed`, `stopped`), or
             read the Server-Sent Events stream from the initial call
             (`stream` defaults to `true` in the request schema).
          3. To send a follow-up message: `POST /v1/agents/run` again
             with the same `task_id` plus the new `input`.
          4. To inspect intermediate reasoning: `GET /v1/agents/{task_id}/messages`.
          5. To cancel early: `POST /v1/agents/{task_id}/stop`.

        Billing accumulates as the agent runs. Partial work is still
        charged on stop or failure.
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              required: [model, input]
              properties:
                model:
                  type: string
                  example: manus
                input:
                  type: string
                  description: Instruction for the agent. For a new task this is the initial prompt; for a follow-up (when `task_id` is set) this is the new message added mid-run.
                task_id:
                  type: string
                  description: Optional. When set, the call sends `input` as a follow-up message to this existing task instead of creating a new one.
                stream:
                  type: boolean
                  default: true
                  description: Stream Server-Sent Events as the agent works. When false, the call blocks until the task reaches a terminal state or `max_wait_seconds` elapses.
                agent_profile:
                  type: string
                  default: manus-1.6
                  description: Upstream agent profile. Available profiles depend on the model (e.g. Manus exposes `manus-1.6`).
                poll_interval_seconds:
                  type: number
                  default: 3
                  minimum: 1
                  maximum: 30
                  description: How often the worker polls upstream while waiting on the task. Only relevant when `stream` is false.
                max_wait_seconds:
                  type: number
                  default: 900
                  minimum: 5
                  maximum: 1800
                  description: Maximum time the worker will block on a non-streaming call before returning the in-progress task status.
                share_visibility:
                  type: string
                  default: public
                  description: Manus share-link visibility (`public` or `private`).
                disable_formatting:
                  type: boolean
                  default: false
                  description: Skip the EmpirioLabs Markdown wrapper around the agent run. Aliases: `raw`, `passthrough`, `raw_response`.
            examples:
              start_task:
                summary: Start a new Manus task
                value:
                  model: manus
                  input: Research the top 3 humanoid-robot startups by 2025 funding and write a 500-word memo.
              follow_up:
                summary: Send a follow-up message to an existing task
                value:
                  model: manus
                  task_id: task_01HV3K...
                  input: Also include each startup's most recent valuation and lead investor.
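              # Illustrative sketch, not part of the generated examples: a blocking
              # (non-streaming) call. The input text is a placeholder, and the
              # interval and wait values are arbitrary picks within the documented
              # 1-30 s and 5-1800 s ranges.
              non_streaming:
                summary: Block until the task finishes or the wait limit elapses
                value:
                  model: manus
                  input: Summarize the latest quarterly report in five bullet points.
                  stream: false
                  poll_interval_seconds: 5
                  max_wait_seconds: 600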
      responses:
        '202':
          description: Task accepted. Poll `GET /v1/agents/{task_id}` for status.
          content:
            application/json:
              schema:
                type: object
                properties:
                  task_id: { type: string }
                  status: { type: string, example: queued }
                  poll_url: { type: string, example: /v1/agents/task_abc123 }

  /v1/agents/{task_id}:
    get:
      tags: [Agents]
      summary: Get agent task status and output
      operationId: getAgentTask
      parameters:
        - in: path
          name: task_id
          required: true
          schema: { type: string }
      responses:
        '200':
          description: Current task state, accumulated output, and tool calls so far.
          content:
            application/json:
              schema:
                type: object
                properties:
                  task_id: { type: string }
                  status:
                    type: string
                    enum: [queued, running, completed, failed, stopped]
                  output: { type: string }
                  artifacts:
                    type: array
                    items:
                      type: object
                      properties:
                        type: { type: string }
                        url: { type: string }
                  usage:
                    $ref: '#/components/schemas/Usage'
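              # Illustrative examples only, not generated from the spec source. Field
              # names follow the schema above; every value is a placeholder, the
              # artifact `type` value is an assumption, and `usage` is omitted
              # because its shape is defined in the Usage component.
              examples:
                running:
                  summary: Task still in progress (hypothetical values)
                  value:
                    task_id: task_abc123
                    status: running
                    output: ''
                    artifacts: []
                completed:
                  summary: Task finished (hypothetical values)
                  value:
                    task_id: task_abc123
                    status: completed
                    output: Placeholder memo text returned by the agent.
                    artifacts:
                      - type: file
                        url: https://example.com/placeholder-memo.pdf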

  /v1/agents/{task_id}/messages:
    get:
      tags: [Agents]
      summary: List agent task messages
      operationId: listAgentMessages
      description: |
        Returns every step-by-step message the agent has emitted while
        working on this task. Clients that render a live reasoning trace
        poll this endpoint; use `GET /v1/agents/{task_id}` for the final
        result.
      parameters:
        - in: path
          name: task_id
          required: true
          schema: { type: string }
      responses:
        '200':
          description: Messages collected so far.
          content:
            application/json:
              schema:
                type: object
                properties:
                  object: { type: string, example: agent.messages }
                  model: { type: string, example: manus }
                  task_id: { type: string }
                  result:
                    type: object
                    description: Upstream agent step list (raw + extracted).

  /v1/agents/{task_id}/stop:
    post:
      tags: [Agents]
      summary: Stop a running agent task
      operationId: stopAgentTask
      parameters:
        - in: path
          name: task_id
          required: true
          schema: { type: string }
      responses:
        '200':
          description: Task stopped. Final state and partial output included in the response.
          content:
            application/json:
              schema:
                type: object
                properties:
                  task_id: { type: string }
                  status: { type: string, example: stopped }
                  output: { type: string }
                  usage:
                    $ref: '#/components/schemas/Usage'

  /v1/images/analysis:
    post:
      tags: [Images]
      summary: Analyze / describe an image
      operationId: analyzeImage
      description: |
        Vision-only endpoint for image-understanding models (Janus Pro,
        Amazon Nova Canvas variation/describe modes). For chat-style
        vision (Qwen-VL, Mistral Vision, etc.) use
        `POST /v1/chat/completions` with an `image_url` content part
        instead.
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              required: [model, image_url, question]
              properties:
                model:
                  type: string
                  example: janus-pro-deepseek
                image_url:
                  type: string
                  description: URL or base64-encoded contents of the image to analyze.
                question:
                  type: string
                  description: What you want to know about the image.
                analysis_temperature:
                  type: number
                analysis_top_p:
                  type: number
                analysis_seed:
                  type: integer
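            # Illustrative request only: the model ID comes from the schema above;
            # the image URL and question are placeholder values.
            examples:
              describe_image:
                summary: Ask a question about an image (hypothetical values)
                value:
                  model: janus-pro-deepseek
                  image_url: https://example.com/placeholder.png
                  question: What objects are visible in this image?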
      responses:
        '200':
          description: Analysis result.
          content:
            application/json:
              schema:
                type: object
                properties:
                  answer: { type: string }
                  usage:
                    $ref: '#/components/schemas/Usage'

  # ── Detection (model-specific) ───────────────────────────────────────
  /v1/detect:
    post:
      tags: [Detection]
      summary: Text classification / detection
      operationId: createDetection
      description: |
        Specialized endpoint for classification-style models that don't
        fit chat / search / image. Currently powers GPTZero (AI text
        detection, bibliography scan, source analysis). Each model's
        `scan_type` enum controls the detection mode.
        See the per-model docs for the full parameter surface.

        Note: GPTZero is also reachable via `POST /v1/chat/completions`
        and `POST /v1/responses`: pass the text in the message body
        and the gateway adapts the call.
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              required: [model, input]
              properties:
                model:
                  type: string
                  example: gptzero
                input:
                  type: string
                  description: Text or document to analyze (max 50,000 chars per document, 50 docs / 15 MB total per request).
                scan_type:
                  type: string
                  enum: [ai_detection, bibliography, fact_check]
                  default: ai_detection
                multilingual:
                  type: boolean
                  default: false
                  description: Enable the multilingual model (adds French and Spanish support on top of English).
                version:
                  type: string
                  default: __latest__
            examples:
              gptzero_ai_detection:
                summary: GPTZero AI-detection
                value:
                  model: gptzero
                  input: "The quick brown fox jumps over the lazy dog."
                  scan_type: ai_detection
                  multilingual: false
              gptzero_bibliography:
                summary: GPTZero bibliography scan
                value:
                  model: gptzero
                  scan_type: bibliography
                  input: "[1] Smith, J. (2024). Title. Journal, 12(3), 45-67."
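              # Illustrative addition: the third `scan_type` value from the enum
              # above, with placeholder input text.
              gptzero_fact_check:
                summary: GPTZero fact check (hypothetical input)
                value:
                  model: gptzero
                  scan_type: fact_check
                  input: "The Eiffel Tower was completed in 1889."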
      responses:
        '200':
          description: Detection result with per-sentence breakdown and overall scores.
          content:
            application/json:
              schema:
                type: object

  # ── Files (no public surface) ────────────────────────────────────────
  # EmpirioLabs does not expose any public file endpoints:
  #   • /v1/files/upload was removed (cost/abuse vector).
  #   • There is no re-sign endpoint — generated output URLs expire 7 days
  #     after creation and cannot be extended on the user side.
  # Inputs: pass any public URL in the model endpoint's input field
  # (image, file_url, video). For private audio, upload the file directly
  # as multipart form data to /v1/audio/transcriptions.

  # ── Account ──────────────────────────────────────────────────────────
  /v1/account/usage:
    get:
      tags: [Account]
      summary: Retrieve account usage and balance
      operationId: retrieveAccountUsage
      description: |
        Returns the current credit balance, a per-product spend breakdown
        (model and API usage, GPU Cloud, and Hosted Agents), plus recent
        usage events for the account attached to the API key. Each event has a
        `source` (`api`, `playground`, `gpu_cloud`, or `hosted_agents`) plus
        usage counters, request costs, status, model, endpoint, latency, request
        ID, and generated media URLs. Token-billed rows report token counts.
        GPU Cloud runtime rows keep the stable `tokens` object with zero counts
        and put runtime details such as `seconds`, `price_hourly`, and
        `gpu_display` in `metadata`. For an event made by a Hosted Agent, the
        `metadata` object also identifies the agent (`agent_type`,
        `agent_name`, `agent_instance_id`). The `spend` totals honor the `from`
        and `to` window, or are all-time when neither is set.
      parameters:
        - in: query
          name: limit
          schema:
            type: integer
            minimum: 1
            maximum: 200
            default: 100
          description: Number of usage events to return.
        - in: query
          name: before
          schema:
            type: string
            format: date-time
          description: Return events created before this timestamp. To paginate, pass the previous response's `next_cursor` value.
        - in: query
          name: from
          schema:
            type: string
            format: date-time
          description: Include events created at or after this timestamp.
        - in: query
          name: to
          schema:
            type: string
            format: date-time
          description: Include events created at or before this timestamp.
        - in: query
          name: source
          schema:
            type: string
            enum: [api, playground, gpu_cloud, hosted_agents]
          description: >-
            Filter events by source: `api` (direct API requests), `playground`,
            `gpu_cloud` (GPU Cloud rentals), or `hosted_agents` (calls made by a
            Hosted Agent).
        - in: query
          name: model
          schema:
            type: string
          description: Filter to one model ID.
        - in: query
          name: status
          schema:
            type: string
            enum: [success, error]
          description: Filter by request status.
      responses:
        '200':
          description: Account balance, summary, and usage events.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/AccountUsageResponse'
              examples:
                default:
                  value:
                    object: account_usage
                    balance:
                      amount: 42.5
                      currency: USD
                      auto_topup_enabled: true
                      auto_topup_threshold: 10
                      auto_topup_amount: 50
                      updated_at: '2026-05-06T00:00:00Z'
                    summary:
                      requests: 3
                      total_cost: 0.249386
                      total_tokens: 5740
                      input_tokens: 4300
                      output_tokens: 1440
                      cache_read_tokens: 0
                      cache_write_tokens: 0
                      errors: 0
                      avg_latency_ms: 1840
                    spend:
                      currency: USD
                      total: 0.249386
                      model: 0.0124
                      gpu_cloud: 0.228086
                      hosted_agents: 0.0089
                      from: null
                      to: null
                    data:
                      - id: '19231'
                        object: usage_event
                        created_at: '2026-05-06T00:12:03Z'
                        source: playground
                        model: qwen3-max
                        endpoint: /playground/chat
                        status: success
                        status_code: 200
                        error: null
                        request_id: req_abc123
                        latency_ms: 1840
                        tokens:
                          input: 3100
                          output: 900
                          cache_read: 0
                          cache_write: 0
                          total: 4000
                        cost:
                          amount: 0.0124
                          currency: USD
                        output_urls: []
                        metadata:
                          source: playground
                      - id: '19232'
                        object: usage_event
                        created_at: '2026-05-06T00:11:40Z'
                        source: hosted_agents
                        model: qwen3-max
                        endpoint: /v1/chat/completions
                        status: success
                        status_code: 200
                        error: null
                        request_id: req_def456
                        latency_ms: 2210
                        tokens:
                          input: 1200
                          output: 540
                          cache_read: 0
                          cache_write: 0
                          total: 1740
                        cost:
                          amount: 0.0089
                          currency: USD
                        output_urls: []
                        metadata:
                          category: agents
                          agent_type: hermes
                          agent_name: My Assistant
                          agent_instance_id: 8bf95e48-0815-4eee-8dec-141956729c36
                      - id: '19233'
                        object: usage_event
                        created_at: '2026-05-06T00:10:12Z'
                        source: gpu_cloud
                        model: gpu-cloud:h100-nvl
                        endpoint: /v1/gpu/connect/gpu_inst_123
                        status: success
                        status_code: 200
                        error: null
                        request_id: null
                        latency_ms: null
                        tokens:
                          input: 0
                          output: 0
                          cache_read: 0
                          cache_write: 0
                          total: 0
                        cost:
                          amount: 0.228086
                          currency: USD
                        output_urls: []
                        metadata:
                          category: gpu_cloud
                          gpu_slug: h100-nvl
                          gpu_display: H100 NVL
                          num_gpus: 1
                          seconds: 147
                          price_hourly: 5.6
                    has_more: false
                    next_cursor: null
        '401':
          $ref: '#/components/responses/Unauthorized'

  # ── Playground ───────────────────────────────────────────────────────
  /v1/playground/conversations:
    get:
      tags: [Playground]
      summary: List saved Playground conversations
      operationId: listPlaygroundConversations
      description: |
        Lists saved Playground conversations for the account attached to
        the API key. This endpoint is read-only; save and delete chats in
        the dashboard Playground.
      parameters:
        - in: query
          name: limit
          schema:
            type: integer
            minimum: 1
            maximum: 100
            default: 50
        - in: query
          name: before
          schema:
            type: string
            format: date-time
          description: Return conversations updated before this timestamp.
        - in: query
          name: model
          schema:
            type: string
          description: Filter to saved chats for one model ID.
      responses:
        '200':
          description: Saved Playground conversations.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/PlaygroundConversationList'
        '401':
          $ref: '#/components/responses/Unauthorized'

  /v1/playground/conversations/{conversation_id}:
    get:
      tags: [Playground]
      summary: Retrieve a saved Playground conversation
      operationId: retrievePlaygroundConversation
      description: Returns one saved Playground conversation, including messages.
      parameters:
        - in: path
          name: conversation_id
          required: true
          schema:
            type: string
      responses:
        '200':
          description: Saved Playground conversation with messages.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/PlaygroundConversation'
        '401':
          $ref: '#/components/responses/Unauthorized'
        '404':
          $ref: '#/components/responses/NotFound'

  # ── Jobs ─────────────────────────────────────────────────────────────
  /v1/jobs:
    get:
      tags: [Jobs]
      summary: List recent jobs
      operationId: listJobs
      description: List up to 100 of your most recent async jobs from a rolling 1-hour window.
      parameters:
        - in: query
          name: status
          required: false
          schema:
            type: string
            enum: [processing, completed, failed]
        - in: query
          name: limit
          required: false
          schema:
            type: integer
            default: 25
            maximum: 100
      responses:
        '200':
          description: Job list.
          content:
            application/json:
              schema:
                type: object
                properties:
                  data:
                    type: array
                    items:
                      $ref: '#/components/schemas/Job'

  /v1/jobs/{job_id}:
    get:
      tags: [Jobs]
      summary: Retrieve a job
      operationId: retrieveJob
      description: Poll a job's status and final result.
      parameters:
        - in: path
          name: job_id
          required: true
          schema:
            type: string
            example: job_01HV3KABCDE
      responses:
        '200':
          description: Job state.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Job'
              examples:
                processing:
                  value:
                    job_id: job_01HV3KABCDE
                    status: processing
                    progress: 0.42
                    created_at: '2026-05-02T17:11:32Z'
                    completed_at: null
                    result: null
                    error: null
                completed:
                  value:
                    job_id: job_01HV3KABCDE
                    status: completed
                    progress: 1.0
                    created_at: '2026-05-02T17:11:32Z'
                    completed_at: '2026-05-02T17:18:04Z'
                    result:
                      data:
                        - url: https://media.empiriolabs.ai/worker-outputs/abc.../video.mp4
                    error: null
        '404':
          $ref: '#/components/responses/NotFound'

    delete:
      tags: [Jobs]
      summary: Cancel a job
      operationId: cancelJob
      description: |
        Best-effort cancel — the upstream worker may have already
        finished. If the job is already `completed` or `failed`, this
        is a no-op.
      parameters:
        - in: path
          name: job_id
          required: true
          schema:
            type: string
      responses:
        '200':
          description: Cancellation acknowledged.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Job'

  # ── GPU Cloud ─────────────────────────────────────────────────────────────
  /v1/gpu/catalog:
    get:
      tags: [GPU Cloud]
      summary: List GPU types
      operationId: listGpuTypes
      description: |
        Returns the GPU Cloud catalog with hourly pricing, VRAM, region,
        and the current available count. Prices and availability update as
        capacity changes.
      security: []
      responses:
        '200':
          description: List of GPU types.
          content:
            application/json:
              schema:
                type: object
                properties:
                  object: { type: string, example: list }
                  data:
                    type: array
                    items:
                      $ref: '#/components/schemas/GpuType'

  /v1/gpu/catalog/{slug}:
    get:
      tags: [GPU Cloud]
      summary: Get a GPU type
      operationId: getGpuType
      security: []
      parameters:
        - in: path
          name: slug
          required: true
          schema: { type: string }
          example: rtx-4090
      responses:
        '200':
          description: GPU type detail.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/GpuType'
        '404':
          description: GPU type not found.

  /v1/gpu/instances:
    get:
      tags: [GPU Cloud]
      summary: List your GPU instances
      operationId: listGpuInstances
      responses:
        '200':
          description: Your active GPU Cloud instances.
          content:
            application/json:
              schema:
                type: object
                properties:
                  instances:
                    type: array
                    items:
                      $ref: '#/components/schemas/GpuInstance'
    post:
      tags: [GPU Cloud]
      summary: Deploy a GPU
      operationId: deployGpuInstance
      description: |
        Start a GPU Cloud instance. Choose a curated model, paste any Hugging Face
        repo ID (served with vLLM, OpenAI-compatible at `/v1`), pick a
        template (JupyterLab, ComfyUI, Web Terminal, Ollama), or run a custom
        Docker image. Billing starts when the GPU reaches `running` and is
        metered by the second against your credit balance. Your account's
        current GPU limit is enforced at deploy and start time.
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/GpuDeployRequest'
      responses:
        '200':
          description: Instance accepted and provisioning.
          content:
            application/json:
              schema:
                type: object
                properties:
                  instance:
                    $ref: '#/components/schemas/GpuInstance'
        '402':
          description: Add credits before starting a GPU.
        '404':
          description: GPU type or instance route not found.
        '409':
          description: GPU capacity or account GPU limit is not currently available.
        '422':
          description: The selected GPU, disk size, or workload settings are not valid.

  /v1/gpu/instances/{instance_id}:
    get:
      tags: [GPU Cloud]
      summary: Get a GPU instance
      operationId: getGpuInstance
      parameters:
        - in: path
          name: instance_id
          required: true
          schema: { type: string }
      responses:
        '200':
          description: Instance detail.
          content:
            application/json:
              schema:
                type: object
                properties:
                  instance:
                    $ref: '#/components/schemas/GpuInstance'
        '404':
          description: Instance not found.
    delete:
      tags: [GPU Cloud]
      summary: Destroy a GPU instance
      operationId: destroyGpuInstance
      description: Permanently destroy the instance and stop billing. This cannot be undone.
      parameters:
        - in: path
          name: instance_id
          required: true
          schema: { type: string }
      responses:
        '200':
          description: Instance destroyed.

  /v1/gpu/instances/{instance_id}/{action}:
    post:
      tags: [GPU Cloud]
      summary: Stop, start, or refresh a GPU instance
      operationId: actionGpuInstance
      description: |
        `stop` releases the running allocation and pauses billing.
        `start` redeploys the saved instance spec with fresh runtime disk. `refresh` re-syncs live status.
      parameters:
        - in: path
          name: instance_id
          required: true
          schema: { type: string }
        - in: path
          name: action
          required: true
          schema:
            type: string
            enum: [stop, start, refresh]
      responses:
        '200':
          description: Action applied.
          content:
            application/json:
              schema:
                type: object
                properties:
                  instance:
                    $ref: '#/components/schemas/GpuInstance'

  /v1/gpu/connect/{instance_id}/{path}:
    get:
      tags: [GPU Cloud]
      summary: Connect to a running GPU
      operationId: connectGpuInstance
      description: |
        Call the service running on your GPU Cloud instance through an authenticated
        EmpirioLabs API path. For a model deploy this is the
        OpenAI-compatible base: send `POST /v1/gpu/connect/{instance_id}/v1/chat/completions`.
        For a JupyterLab, ComfyUI, Web Terminal, or Ollama template, call the
        service through the same connect path.
        Supports GET, POST, PUT, PATCH, DELETE, and streaming responses.
      parameters:
        - in: path
          name: instance_id
          required: true
          schema: { type: string }
        - in: path
          name: path
          required: true
          schema: { type: string }
          description: The path on your GPU service to reach, e.g. `v1/chat/completions`.
          example: v1/chat/completions
      responses:
        '200':
          description: Response from your GPU service.
        '502':
          description: The GPU instance is not reachable right now.

  # ── Hosted Agents ────────────────────────────────────────────────────
  /v1/hosted-agents/health:
    get:
      tags: [Hosted Agents]
      summary: Hosted agents health
      operationId: hostedAgentsHealth
      security: []
      responses:
        '200':
          description: Hosted agents health response.

  /v1/hosted-agents/config:
    get:
      tags: [Hosted Agents]
      summary: Get hosted-agent plans and runtime presets
      operationId: getHostedAgentsConfig
      responses:
        '200':
          description: Plans, runtime presets, account limits, balance, and Poe setup metadata.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/HostedAgentsConfig'

  /v1/hosted-agents/instances:
    get:
      tags: [Hosted Agents]
      summary: List hosted agents
      operationId: listHostedAgentInstances
      responses:
        '200':
          description: Your hosted agent instances.
          content:
            application/json:
              schema:
                type: object
                properties:
                  instances:
                    type: array
                    items:
                      $ref: '#/components/schemas/HostedAgentInstance'
    post:
      tags: [Hosted Agents]
      summary: Create a hosted agent
      operationId: createHostedAgentInstance
      description: |
        Create an isolated OpenClaw or Hermes Agent runtime. When the
        runtime is accepted, the selected monthly plan is charged against
        your account credit balance and your account's current
        hosted-agent limit is enforced.
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/HostedAgentCreateRequest'
      responses:
        '201':
          description: Hosted agent created.
          content:
            application/json:
              schema:
                type: object
                properties:
                  ok: { type: boolean }
                  instance:
                    $ref: '#/components/schemas/HostedAgentInstance'
        '402':
          description: Add credits before creating the hosted agent.
        '403':
          description: Hosted-agent account limit reached.

  /v1/hosted-agents/instances/{instance_id}:
    get:
      tags: [Hosted Agents]
      summary: Get a hosted agent
      operationId: getHostedAgentInstance
      parameters:
        - in: path
          name: instance_id
          required: true
          schema: { type: string }
      responses:
        '200':
          description: Hosted agent detail.
          content:
            application/json:
              schema:
                type: object
                properties:
                  instance:
                    $ref: '#/components/schemas/HostedAgentInstance'
        '404':
          description: Hosted agent not found.
    delete:
      tags: [Hosted Agents]
      summary: Destroy a hosted agent
      operationId: destroyHostedAgentInstance
      parameters:
        - in: path
          name: instance_id
          required: true
          schema: { type: string }
      responses:
        '200':
          description: Hosted agent destroyed.
    patch:
      tags: [Hosted Agents]
      summary: Rename a hosted agent
      operationId: renameHostedAgentInstance
      parameters:
        - in: path
          name: instance_id
          required: true
          schema: { type: string }
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              required: [name]
              properties:
                name: { type: string }
      responses:
        '200':
          description: Hosted agent renamed.

  /v1/hosted-agents/instances/{instance_id}/{action}:
    post:
      tags: [Hosted Agents]
      summary: Start, stop, restart, or refresh a hosted agent
      operationId: actionHostedAgentInstance
      parameters:
        - in: path
          name: instance_id
          required: true
          schema: { type: string }
        - in: path
          name: action
          required: true
          schema:
            type: string
            enum: [start, stop, restart, refresh]
      responses:
        '200':
          description: Action applied.
          content:
            application/json:
              schema:
                type: object
                properties:
                  instance:
                    $ref: '#/components/schemas/HostedAgentInstance'

  /v1/hosted-agents/instances/{instance_id}/message:
    post:
      tags: [Hosted Agents]
      summary: Send a message to a hosted agent
      operationId: messageHostedAgentInstance
      parameters:
        - in: path
          name: instance_id
          required: true
          schema: { type: string }
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              required: [messages]
              properties:
                messages:
                  type: array
                  items:
                    type: object
      responses:
        '200':
          description: Hosted agent response.
          content:
            application/json:
              schema:
                type: object
                properties:
                  request_id: { type: string }
                  status: { type: string }
                  content: { type: string }
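      # Illustrative request body (a sketch, not part of the generated spec).
      # The schema above only says each item in `messages` is an object; the
      # `role`/`content` shape below is an assumption borrowed from ChatRequest.
      #
      #   { "messages": [ { "role": "user", "content": "Hello!" } ] }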

  /v1/hosted-agents/instances/{instance_id}/integrations:
    get:
      tags: [Hosted Agents]
      summary: List hosted-agent integrations
      operationId: listHostedAgentIntegrations
      parameters:
        - in: path
          name: instance_id
          required: true
          schema: { type: string }
      responses:
        '200':
          description: Integration list.
    post:
      tags: [Hosted Agents]
      summary: Save a hosted-agent integration
      operationId: saveHostedAgentIntegration
      description: Saves a token-based integration. Telegram, Discord, Slack, and Matrix work on OpenClaw and Hermes agents. Mattermost and Microsoft Teams are OpenClaw-only. WhatsApp is paired from the dashboard with a QR code rather than through this endpoint.
      parameters:
        - in: path
          name: instance_id
          required: true
          schema: { type: string }
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              required: [kind]
              properties:
                kind:
                  type: string
                  enum: [poe_api_key, telegram, discord, slack, matrix, mattermost, msteams]
                secret:
                  type: string
                  description: Integration secret, API key, token, or webhook credential.
                config:
                  type: object
      responses:
        '201':
          description: Integration saved.
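      # Illustrative request body (a sketch, not part of the generated spec).
      # `<telegram-token>` is a placeholder, and `config` is omitted because its
      # fields are not defined in this schema.
      #
      #   { "kind": "telegram", "secret": "<telegram-token>" }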

  /v1/hosted-agents/instances/{instance_id}/integrations/{kind}:
    delete:
      tags: [Hosted Agents]
      summary: Disconnect a hosted-agent integration
      operationId: disconnectHostedAgentIntegration
      parameters:
        - in: path
          name: instance_id
          required: true
          schema: { type: string }
        - in: path
          name: kind
          required: true
          schema:
            type: string
            enum: [poe, telegram, discord, slack, matrix, mattermost, msteams, whatsapp]
      responses:
        '200':
          description: Integration disconnected.

  /v1/hosted-agents/instances/{instance_id}/integrations/{kind}/behavior:
    post:
      tags: [Hosted Agents]
      summary: Set messaging channel behavior
      operationId: setHostedAgentIntegrationBehavior
      description: Controls per-connector chat signals for a connected messaging channel. Typing indicators and minimal processing reactions are on by default; read receipts and live reply preview are off by default. Live reply preview is opt-in on supported channels and separate from model API streaming.
      parameters:
        - in: path
          name: instance_id
          required: true
          schema: { type: string }
        - in: path
          name: kind
          required: true
          schema:
            type: string
            enum: [telegram, discord, slack, matrix, mattermost, msteams, whatsapp]
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              properties:
                behavior:
                  $ref: '#/components/schemas/HostedAgentChannelBehavior'
      responses:
        '200':
          description: Channel behavior updated.
          content:
            application/json:
              schema:
                type: object
                properties:
                  ok: { type: boolean }
                  channel_behavior:
                    $ref: '#/components/schemas/HostedAgentChannelBehavior'
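      # Illustrative request body (a sketch, not part of the generated spec).
      # It keeps the documented defaults for typing, reactions, and read
      # receipts, sets `reaction_level` to `ack`, and opts in to live reply
      # preview. Field names come from HostedAgentChannelBehavior.
      #
      #   {
      #     "behavior": {
      #       "typing": true,
      #       "read_receipts": false,
      #       "reactions": true,
      #       "reaction_level": "ack",
      #       "streaming": true
      #     }
      #   }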

  /v1/hosted-agents/instances/{instance_id}/model:
    post:
      tags: [Hosted Agents]
      summary: Change a hosted agent's model or provider
      operationId: setHostedAgentModel
      description: Switches the agent's model in place (no redeploy). Use `provider` to point the agent at EmpirioLabs, Poe, or a custom OpenAI/Anthropic-compatible public HTTPS endpoint.
      parameters:
        - in: path
          name: instance_id
          required: true
          schema: { type: string }
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              required: [provider, model]
              properties:
                provider:
                  type: string
                  enum: [empirio, poe, custom]
                model: { type: string }
                custom_provider:
                  type: object
                  description: Required when provider is custom. The public HTTPS base URL, model, API key, and protocol of your endpoint.
      responses:
        '200':
          description: Model changed.
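      # Illustrative request body (a sketch, not part of the generated spec).
      # The model id is borrowed from the `default_model` example elsewhere in
      # this spec. When `provider` is `custom`, `custom_provider` must also be
      # sent; its field names are not listed in this schema, so none are shown.
      #
      #   { "provider": "empirio", "model": "qwen3-7-plus" }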

  /v1/hosted-agents/instances/{instance_id}/access:
    post:
      tags: [Hosted Agents]
      summary: Set who can message a hosted agent
      operationId: setHostedAgentAccess
      description: "`mode` is `open` (anyone) or `restricted` (only allowed messaging IDs). If `mode` is `restricted` and no IDs are listed, the first person to message the agent becomes its owner."
      parameters:
        - in: path
          name: instance_id
          required: true
          schema: { type: string }
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              required: [mode]
              properties:
                kind:
                  type: string
                  enum: [telegram, discord, slack, matrix, mattermost, msteams, whatsapp]
                  description: Optional. When sent in the JSON body, the connected channel kind whose access policy is updated.
                mode:
                  type: string
                  enum: [open, restricted]
                allow:
                  type: object
                  description: 'Per-channel allowlist of messaging IDs, e.g. { "telegram": ["123456"] }.'
      responses:
        '200':
          description: Access policy updated.
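      # Illustrative request body (a sketch, not part of the generated spec).
      # It restricts the agent to a single Telegram ID; `123456` is the
      # placeholder used in the `allow` description above.
      #
      #   { "mode": "restricted", "allow": { "telegram": ["123456"] } }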

  /v1/hosted-agents/instances/{instance_id}/skills:
    get:
      tags: [Hosted Agents]
      summary: List a hosted agent's skills
      operationId: listHostedAgentSkills
      parameters:
        - in: path
          name: instance_id
          required: true
          schema: { type: string }
      responses:
        '200':
          description: Installed skill refs and any custom skills.
          content:
            application/json:
              schema:
                type: object
                properties:
                  installed:
                    type: array
                    items: { type: string }
                  customSkills:
                    type: object
    post:
      tags: [Hosted Agents]
      summary: Add a skill to a hosted agent
      operationId: addHostedAgentSkill
      description: "Add an official registry skill by ref (OpenClaw: `clawhub:<slug>`, Hermes: `hermes:<id>`), a git repo (`ocgit:<repo>`), a public HTTPS SKILL.md URL (`url:<https-url>`), or paste your own with `custom`."
      parameters:
        - in: path
          name: instance_id
          required: true
          schema: { type: string }
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              properties:
                slug:
                  type: string
                  description: A skill ref (clawhub:, hermes:, ocgit:, or url:).
                custom:
                  type: object
                  description: 'A pasted skill: name plus SKILL.md body.'
      responses:
        '200':
          description: Skill added; returns the updated installed list.
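      # Illustrative request body (a sketch, not part of the generated spec).
      # `<slug>` is left as a placeholder for a real registry slug. The `custom`
      # variant is not shown because its field names are not defined here.
      #
      #   { "slug": "clawhub:<slug>" }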
    delete:
      tags: [Hosted Agents]
      summary: Remove a skill from a hosted agent
      operationId: removeHostedAgentSkill
      parameters:
        - in: path
          name: instance_id
          required: true
          schema: { type: string }
        - in: query
          name: slug
          required: true
          schema: { type: string }
          description: The skill ref to remove.
      responses:
        '200':
          description: Skill removed; returns the updated installed list.

  /v1/hosted-agents/instances/{instance_id}/mcp:
    get:
      tags: [Hosted Agents]
      summary: List a hosted agent's MCP connectors
      operationId: listHostedAgentMcp
      parameters:
        - in: path
          name: instance_id
          required: true
          schema: { type: string }
      responses:
        '200':
          description: Connected MCP servers (host-only; credentials are never returned).
    post:
      tags: [Hosted Agents]
      summary: Attach a remote MCP connector to a hosted agent
      operationId: addHostedAgentMcp
      parameters:
        - in: path
          name: instance_id
          required: true
          schema: { type: string }
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              required: [name, url]
              properties:
                name: { type: string }
                url:
                  type: string
                  description: A public HTTPS remote MCP server URL.
                transport:
                  type: string
                  enum: [streamable-http, sse]
                header_name: { type: string }
                header_value: { type: string }
                tool_include:
                  type: array
                  items: { type: string }
      responses:
        '200':
          description: Connector attached.
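      # Illustrative request body (a sketch, not part of the generated spec).
      # The connector name, URL, token, and tool name are hypothetical. Using
      # `header_name`/`header_value` to carry an auth header for the remote
      # server is an assumption about their purpose.
      #
      #   {
      #     "name": "docs-search",
      #     "url": "https://mcp.example.com/mcp",
      #     "transport": "streamable-http",
      #     "header_name": "Authorization",
      #     "header_value": "Bearer <token>",
      #     "tool_include": ["search"]
      #   }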
    delete:
      tags: [Hosted Agents]
      summary: Remove an MCP connector from a hosted agent
      operationId: removeHostedAgentMcp
      parameters:
        - in: path
          name: instance_id
          required: true
          schema: { type: string }
        - in: query
          name: name
          required: true
          schema: { type: string }
          description: The connector name to remove.
      responses:
        '200':
          description: Connector removed.

components:
  securitySchemes:
    bearerAuth:
      type: http
      scheme: bearer
      bearerFormat: API key
      description: |
        Pass your EmpirioLabs API key as a bearer token. The Anthropic-style
        `x-api-key` header is also accepted on every endpoint.

  schemas:
    HostedAgentsConfig:
      type: object
      properties:
        plans:
          type: array
          items:
            $ref: '#/components/schemas/HostedAgentPlan'
        runtimes:
          type: array
          items:
            $ref: '#/components/schemas/HostedAgentRuntime'
        limit:
          type: integer
          description: Effective hosted-agent limit for the account.
          example: 3
        deployed:
          type: integer
          description: Current deployed hosted-agent count.
          example: 1
        balance:
          type: number
          format: float
        agentSpend:
          type: number
          format: float
    HostedAgentPlan:
      type: object
      properties:
        slug: { type: string, example: hosted-agent-basic }
        display_name: { type: string, example: Hosted Agent Basic }
        description: { type: string }
        monthly_price_usd: { type: number, format: float, example: 15 }
        included_messages_monthly: { type: integer, example: 10000 }
        default_replica_min: { type: integer, example: 1 }
        default_replica_max: { type: integer, example: 3 }
    HostedAgentRuntime:
      type: object
      properties:
        slug:
          type: string
          enum: [openclaw, hermes]
        display_name: { type: string, example: OpenClaw }
        default_plan_slug: { type: string, example: hosted-agent-basic }
        default_model: { type: string, example: qwen3-7-plus }
        default_provider:
          type: string
          enum: [empirio, poe]
    HostedAgentCreateRequest:
      type: object
      required: [agent_type]
      properties:
        agent_type:
          type: string
          enum: [openclaw, hermes]
        name:
          type: string
          example: OpenClaw Production
        plan_slug:
          type: string
          example: hosted-agent-basic
        default_provider:
          type: string
          enum: [empirio, poe, custom]
          default: empirio
        default_model:
          type: string
          example: qwen3-7-plus
        config:
          type: object
          description: Non-secret agent configuration.
        import_config:
          description: Existing OpenClaw or Hermes Agent config to import.
          oneOf:
            - type: object
            - type: string
    HostedAgentChannelBehavior:
      type: object
      description: Per-channel chat signal settings. `streaming` controls live reply preview on supported messaging channels, not model API streaming.
      properties:
        reactions:
          type: boolean
          default: true
          description: Whether the agent sends lightweight processing reactions.
        reaction_level:
          type: string
          enum: ['off', ack, minimal, extensive]
          default: minimal
          description: How actively the agent uses platform reactions while working.
        typing:
          type: boolean
          default: true
          description: Whether supported platforms should show typing while the agent is working.
        read_receipts:
          type: boolean
          default: false
          description: Whether supported platforms should send read receipts. Telegram bots do not expose read receipts.
        streaming:
          type: boolean
          default: false
          description: Whether supported channels should show live reply previews, using platform-native message editing or draft previews where available.
    HostedAgentInstance:
      type: object
      properties:
        id: { type: string }
        agent_type:
          type: string
          enum: [openclaw, hermes]
        name: { type: string }
        plan_slug: { type: string }
        status:
          type: string
          enum: [draft, provisioning, running, stopping, stopped, error, destroyed]
        default_provider:
          type: string
          enum: [empirio, poe, custom]
        billing_status: { type: string, example: active }
        connect_path: { type: string, example: /agents/00000000-0000-0000-0000-000000000000 }
        api_base_path: { type: string, example: /agents/00000000-0000-0000-0000-000000000000/v1 }
        replica_min: { type: integer, example: 1 }
        replica_max: { type: integer, example: 3 }
        replica_current: { type: integer, example: 1 }
        next_billing_at: { type: string, format: date-time }
        billed_amount: { type: number, format: float }
        messages_used_current_period: { type: integer }
    GpuType:
      type: object
      properties:
        slug: { type: string, example: rtx-4090 }
        name: { type: string, example: RTX 4090 }
        vram_gb: { type: integer, example: 24 }
        price_hourly:
          type: number
          format: float
          description: Listed price per GPU per hour. Billed by the second.
          example: 0.65
        available:
          type: boolean
          description: Whether capacity is currently sourceable for this GPU.
        available_count: { type: integer, example: 21 }
        max_gpus:
          type: integer
          description: Maximum GPUs of this type in a single GPU Cloud instance.
          example: 8
        regions:
          type: array
          items: { type: string }
    GpuDeployRequest:
      type: object
      required: [gpu_slug]
      properties:
        gpu_slug:
          type: string
          description: The GPU type to deploy from the catalog.
          example: rtx-4090
        mode:
          type: string
          enum: [model, template, custom]
          description: How to provision the GPU.
          example: model
        hf_id:
          type: string
          description: A Hugging Face repo id to serve with vLLM (mode `model`). Set `HF_TOKEN` in `env` for gated repos.
          example: Qwen/Qwen2.5-7B-Instruct
        template_slug:
          type: string
          description: A curated model or template slug (mode `model` or `template`).
          example: qwen25-7b
        image:
          type: string
          description: A CUDA Docker image to run (mode `custom`).
          example: pytorch/pytorch:2.4.0-cuda12.1-cudnn9-runtime
        ports:
          type: array
          items: { type: integer }
          description: Ports the workload listens on (mode `custom`).
        env:
          type: object
          additionalProperties: { type: string }
          description: Environment variables for the workload.
        num_gpus:
          type: integer
          default: 1
          minimum: 1
          maximum: 64
          description: Number of GPUs. Your current account limit is enforced at deploy and start time.
        disk_gb:
          type: integer
          default: 150
          minimum: 100
          maximum: 300
          description: Requested runtime disk in GB (100-300).
        name:
          type: string
          description: Optional label for the GPU Cloud instance.
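    # Illustrative GpuDeployRequest body (a sketch, not part of the generated
    # spec). Every value is one of the examples or defaults listed above; it
    # serves a Hugging Face model on a single GPU.
    #
    #   {
    #     "gpu_slug": "rtx-4090",
    #     "mode": "model",
    #     "hf_id": "Qwen/Qwen2.5-7B-Instruct",
    #     "num_gpus": 1,
    #     "disk_gb": 150
    #   }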
    GpuInstance:
      type: object
      properties:
        id: { type: string }
        status:
          type: string
          enum: [provisioning, loading, running, stopping, stopped, error, destroyed]
          description: Current lifecycle status for the GPU Cloud instance.
        gpu_slug: { type: string, example: rtx-4090 }
        gpu_display: { type: string, example: RTX 4090 }
        num_gpus: { type: integer, example: 1 }
        image: { type: [string, 'null'] }
        label: { type: [string, 'null'] }
        disk_gb: { type: [integer, 'null'] }
        price_hourly:
          type: number
          format: float
          description: Listed hourly price for this instance.
          example: 0.65
        connect_path:
          type: [string, 'null']
          description: Default connect path for the workload.
          example: /v1
        billed_amount:
          type: number
          format: float
          description: Total spent on this GPU Cloud instance so far.
        created_at: { type: [string, 'null'], format: date-time }
        started_at: { type: [string, 'null'], format: date-time }
        error: { type: [string, 'null'] }
    Model:
      type: object
      properties:
        id:
          type: string
          example: deepseek-v4-pro
        object:
          type: string
          example: model
        owned_by:
          type: string
          example: deepseek
        description:
          type: string
        type:
          type: string
          enum: [text, multimodal, image, video, audio, transcription, search, 3d, tools]
        context_window:
          type: [integer, 'null']
        context_length:
          type: [integer, 'null']
          description: OpenRouter-convention alias of `context_window`. Same value, surfaced under the field name third-party tools look for.
        input_modalities:
          type: array
          items: { type: string }
        output_modalities:
          type: array
          items: { type: string }
        pricing:
          description: |
            OpenRouter-style pricing. Single object for flat-rate models, or
            a 2-item array for context-tiered chat models. All values are
            USD strings. Per-token fields (`prompt`, `completion`,
            `input_cache_read`) are per-token, not per-million; flat fields
            (`image`, `request`) are per-event.

            For models with three or more context-based tiers the array is
            a best-effort 2-tier summary (cheapest base + highest-context
            tier). See `pricing_rows` for the full tier-by-tier breakdown.
          oneOf:
            - $ref: '#/components/schemas/PricingTier'
            - type: array
              minItems: 2
              maxItems: 2
              items:
                $ref: '#/components/schemas/PricingTier'
        pricing_rows:
          type: array
          description: EmpirioLabs' display-friendly breakdown including tool fees, per-image/second/minute rates, and full tier ladders. Use this when you need the complete picture.
          items:
            type: object
            properties:
              label: { type: string }
              spec: { type: string }
              value: { type: string }
        logo:
          type: string
          format: uri
        is_new:
          type: boolean
        discount:
          type: [object, 'null']
          description: Display discount summary. When present, `label` uses the `Save up to X%` format and `percent` is the maximum advertised savings for the model.
          properties:
            label:
              type: string
              example: Save up to 20%
            percent:
              type: number
              example: 20

    PricingTier:
      type: object
      description: OpenRouter-style pricing tier. Per-token values are strings in USD.
      properties:
        prompt:
          type: string
          example: "0.0000005"
          description: Per-token input cost in USD.
        completion:
          type: string
          example: "0.000002"
          description: Per-token output cost in USD.
        image:
          type: string
          example: "0"
          description: Per-image cost (only on tier 0).
        request:
          type: string
          example: "0"
          description: Flat per-request cost (only on tier 0).
        input_cache_read:
          type: string
          example: "0"
          description: Per-token cost for cache-read input tokens.
        min_context:
          type: integer
          description: Present only on tier 1 of a tiered array. The input-token count at which this tier becomes active.
          example: 32000
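    # Worked conversion (illustrative, using the example values above).
    # Per-token prices are multiplied by 1,000,000 to get a per-million figure:
    #
    #   prompt     "0.0000005" x 1,000,000 = 0.50 USD per million input tokens
    #   completion "0.000002"  x 1,000,000 = 2.00 USD per million output tokens
    #
    # Because the values are strings, clients may prefer to parse them with a
    # decimal type rather than a binary float when exact amounts matter.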

    ModelDetail:
      allOf:
        - $ref: '#/components/schemas/Model'
        - type: object
          properties:
            pricing_rows:
              type: array
              items:
                type: object
                properties:
                  charge: { type: string }
                  spec: { type: string }
                  rate: { type: string }
            features:
              type: array
              items: { type: string }
            parameters:
              type: array
              items:
                type: object
                properties:
                  name: { type: string }
                  type:
                    type: string
                    enum: [string, number, integer, boolean, enum, array, object, file, image, video, audio]
                  required: { type: boolean }
                  description: { type: string }
                  default: {}
                  min: { type: number }
                  max: { type: number }
                  allowed_values:
                    type: array
                    items: {}

    ChatRequest:
      type: object
      required: [model, messages]
      properties:
        model:
          type: string
          example: deepseek-v4-pro
        messages:
          type: array
          items:
            type: object
            properties:
              role:
                type: string
                enum: [system, user, assistant, tool]
              content:
                oneOf:
                  - type: string
                  - type: array
                    items: { type: object }
        stream:
          type: boolean
          default: false
        temperature:
          type: number
          minimum: 0
          maximum: 2
        max_tokens:
          type: integer
        tools:
          type: array
          items: { type: object }
        tool_choice:
          oneOf:
            - type: string
              enum: [auto, none, required]
            - type: object
        response_format:
          type: object
        enable_thinking:
          type: boolean
          description: Enable model reasoning on models that advertise this parameter.
        thinking_budget:
          type: integer
          minimum: 1
          description: Maximum tokens reserved for model reasoning on models that advertise this parameter.
        disable_formatting:
          type: boolean
          default: false
          description: Return raw upstream response with no server-side formatting.
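    # Illustrative ChatRequest body (a sketch, not part of the generated spec).
    # It assumes the chosen model advertises `enable_thinking` and
    # `thinking_budget`; the budget of 2048 is an arbitrary placeholder.
    #
    #   {
    #     "model": "deepseek-v4-pro",
    #     "messages": [ { "role": "user", "content": "Hello!" } ],
    #     "enable_thinking": true,
    #     "thinking_budget": 2048
    #   }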

    ChatResponse:
      type: object
      properties:
        id: { type: string }
        object:
          type: string
          example: chat.completion
        created: { type: integer }
        model: { type: string }
        choices:
          type: array
          items:
            type: object
            properties:
              index: { type: integer }
              message:
                type: object
                properties:
                  role: { type: string }
                  content: { type: string }
                  tool_calls:
                    type: array
                    items: { type: object }
              finish_reason:
                type: string
                enum: [stop, length, tool_calls, content_filter]
        usage:
          $ref: '#/components/schemas/Usage'

    CompletionRequest:
      type: object
      required: [model, prompt]
      properties:
        model:
          type: string
          example: qwen3-5-9b
        prompt:
          oneOf:
            - type: string
            - type: array
              items:
                type: string
        stream:
          type: boolean
          default: false
        temperature:
          type: number
          minimum: 0
          maximum: 2
        max_tokens:
          type: integer
        top_p:
          type: number
          minimum: 0
          maximum: 1
        stop:
          oneOf:
            - type: string
            - type: array
              items:
                type: string
        logit_bias:
          type: object
          additionalProperties:
            type: number
          description: Bias token IDs by adding positive or negative values before sampling.

    CompletionResponse:
      type: object
      properties:
        id: { type: string }
        object:
          type: string
          example: text_completion
        created: { type: integer }
        model: { type: string }
        choices:
          type: array
          items:
            type: object
            properties:
              text: { type: string }
              index: { type: integer }
              finish_reason:
                type: string
                enum: [stop, length, content_filter]
        usage:
          $ref: '#/components/schemas/Usage'

    ResponseRequest:
      type: object
      required: [model, input]
      properties:
        model:
          type: string
        input:
          oneOf:
            - type: string
            - type: array
              items: { type: object }
        enable_thinking:
          type: boolean
          description: Enable model reasoning on models that advertise this parameter.
        thinking_budget:
          type: integer
          minimum: 1
          description: Maximum tokens reserved for model reasoning on models that advertise this parameter.

    MessageRequest:
      type: object
      required: [model, max_tokens, messages]
      properties:
        model:
          type: string
        max_tokens:
          type: integer
          example: 1024
        enable_thinking:
          type: boolean
          description: Enable model reasoning on models that advertise this parameter.
        thinking_budget:
          type: integer
          minimum: 1
          description: Maximum tokens reserved for model reasoning on models that advertise this parameter.
        thinking:
          type: object
          description: Anthropic-style thinking control. `budget_tokens` maps to `thinking_budget` for supported models.
          properties:
            type:
              type: string
              enum: [enabled, disabled]
            budget_tokens:
              type: integer
              minimum: 1
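        # Illustrative fragment (a sketch, not part of the generated spec).
        # Per the description above, on supported models the Anthropic-style
        #
        #   "thinking": { "type": "enabled", "budget_tokens": 2048 }
        #
        # is expected to map to `"thinking_budget": 2048`. The value 2048 is
        # an arbitrary placeholder.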
        system:
          oneOf:
            - type: string
            - type: array
              items: { type: object }
        messages:
          type: array
          items:
            type: object

    ImageResponse:
      type: object
      properties:
        created:
          type: integer
        data:
          type: array
          items:
            type: object
            properties:
              url:
                type: string
                format: uri
              b64_json:
                type: string
                description: Inline base64-encoded image bytes (only when response_format=b64_json).

    AudioResponse:
      type: object
      properties:
        data:
          type: array
          items:
            type: object
            properties:
              url:
                type: string
                format: uri
              duration:
                type: number
                description: Output duration in seconds.
              format:
                type: string
                example: mp3

    JobAccepted:
      type: object
      properties:
        job_id:
          type: string
          example: job_01HV3KABCDE
        status:
          type: string
          enum: [processing, completed, failed]
          example: processing
        poll_url:
          type: string
          example: /v1/jobs/job_01HV3KABCDE
        created_at:
          type: string
          format: date-time

    GenerationTemplate:
      type: object
      description: |
        A creative-effect template for `/v1/images/generations` or
        `/v1/videos/generations`. Apply by passing `template: "<slug>"`
        on the matching generation endpoint.
      properties:
        slug:
          type: string
          example: baseball-stadium
        display_name:
          type: string
          example: Stadium
        category:
          type: string
          enum: [viral, cinematic, motion, transform, social, extend, product, edit, portrait]
        description:
          type: string
        recommended_model:
          type: string
          example: kling-o3
        supported_models:
          type: array
          items:
            type: string
        default_params:
          type: object
          additionalProperties: true
          example:
            aspect_ratio: "16:9"
            duration: 5
        required_inputs:
          type: object
          properties:
            image:
              type: boolean
            video:
              type: boolean
            min_images:
              type: integer
            max_images:
              type: integer
            text_allowed:
              type: boolean
        metadata:
          type: object
          additionalProperties: true
          description: Template-level UI hints (cover positioning, suggested prompts, picker labels). Fields are customer-safe.
        cover_image_url:
          type: string
          nullable: true
        preview_video_url:
          type: string
          nullable: true
        modality:
          type: string
          enum: [video, image]
        is_featured:
          type: boolean
        display_order:
          type: integer
        updated_at:
          type: string
          format: date-time

    Job:
      type: object
      properties:
        job_id:
          type: string
        status:
          type: string
          enum: [processing, completed, failed]
        progress:
          type: number
          minimum: 0
          maximum: 1
        created_at:
          type: string
          format: date-time
        completed_at:
          type: string
          format: date-time
          nullable: true
        result:
          type: object
          nullable: true
        error:
          type: object
          nullable: true

    AccountUsageResponse:
      type: object
      properties:
        object:
          type: string
          example: account_usage
        balance:
          type: object
          properties:
            amount:
              type: number
              example: 42.5
            currency:
              type: string
              example: USD
            auto_topup_enabled:
              type: boolean
            auto_topup_threshold:
              type: number
              nullable: true
            auto_topup_amount:
              type: number
              nullable: true
            updated_at:
              type: string
              format: date-time
              nullable: true
        summary:
          type: object
          properties:
            requests:
              type: integer
            total_cost:
              type: number
            total_tokens:
              type: integer
            input_tokens:
              type: integer
            output_tokens:
              type: integer
            cache_read_tokens:
              type: integer
            cache_write_tokens:
              type: integer
            errors:
              type: integer
            avg_latency_ms:
              type: integer
              nullable: true
        spend:
          type: object
          description: |
            Account-level spend split by product over the requested window
            (`from` / `to`), or all-time when neither is set. This is distinct
            from `summary`, which only covers the returned page of usage events.
          properties:
            currency:
              type: string
              example: USD
            total:
              type: number
              description: Total spend across all products.
            model:
              type: number
              description: Model and API usage (chat, embeddings, media, search, and other endpoints).
            gpu_cloud:
              type: number
              description: GPU Cloud rental spend.
            hosted_agents:
              type: number
              description: Hosted Agents subscription spend.
            from:
              type: string
              format: date-time
              nullable: true
            to:
              type: string
              format: date-time
              nullable: true
        data:
          type: array
          items:
            $ref: '#/components/schemas/UsageEvent'
        has_more:
          type: boolean
        next_cursor:
          type: string
          format: date-time
          nullable: true

    UsageEvent:
      type: object
      properties:
        id:
          type: string
        object:
          type: string
          example: usage_event
        created_at:
          type: string
          format: date-time
        source:
          type: string
          enum: [api, playground, gpu_cloud, hosted_agents]
          description: >-
            Where the usage came from: `api` (direct API), `playground`,
            `gpu_cloud` (a GPU Cloud rental), or `hosted_agents` (a call made by
            one of your Hosted Agents).
        model:
          type: string
          nullable: true
        endpoint:
          type: string
          nullable: true
        status:
          type: string
          nullable: true
        status_code:
          type: integer
          nullable: true
        error:
          type: string
          nullable: true
        request_id:
          type: string
          nullable: true
        latency_ms:
          type: integer
          nullable: true
        tokens:
          type: object
          description: >-
            Token counters for token-billed usage. Non-token-billed events,
            such as GPU Cloud runtime rows, return zero counts and place runtime
            details in `metadata`.
          properties:
            input:
              type: integer
            output:
              type: integer
            cache_read:
              type: integer
            cache_write:
              type: integer
            total:
              type: integer
        cost:
          type: object
          properties:
            amount:
              type: number
            currency:
              type: string
              example: USD
        output_urls:
          type: array
          items:
            type: string
            format: uri
        metadata:
          type: object
          nullable: true
          description: >-
            Extra, non-sensitive context for the event. Internal audit fields
            (IP address, user agent) and internal cost details are never
            included. Common keys are documented below; additional keys may be
            present.
          additionalProperties: true
          properties:
            category:
              type: string
              enum: [gpu_cloud, agents, playground]
              description: >-
                Product category for the event, when not a plain API call. Drives
                the `source` field (`agents` maps to `hosted_agents`).
            agent_type:
              type: string
              enum: [openclaw, hermes]
              description: For a Hosted Agent call, the agent runtime.
            agent_name:
              type: string
              description: For a Hosted Agent call, the agent's display name.
            agent_instance_id:
              type: string
              description: For a Hosted Agent call, the agent instance id.
            gpu_slug:
              type: string
              description: For a GPU Cloud event, the GPU SKU identifier.
            gpu_display:
              type: string
              description: For a GPU Cloud event, the customer-facing GPU name.
            seconds:
              type: number
              description: For a GPU Cloud runtime event, the billed runtime in seconds.
            price_hourly:
              type: number
              description: For a GPU Cloud runtime event, the hourly price in USD.
            conversation_id:
              type: string
              description: For a Playground request, the saved conversation id, when applicable.

    PlaygroundConversationList:
      type: object
      properties:
        object:
          type: string
          example: list
        data:
          type: array
          items:
            $ref: '#/components/schemas/PlaygroundConversationSummary'
        has_more:
          type: boolean
        next_cursor:
          type: string
          format: date-time
          nullable: true

    PlaygroundConversationSummary:
      type: object
      properties:
        id:
          type: string
        object:
          type: string
          example: playground_conversation
        model:
          type: string
        title:
          type: string
        message_count:
          type: integer
        created_at:
          type: string
          format: date-time
        updated_at:
          type: string
          format: date-time

    PlaygroundConversation:
      allOf:
        - $ref: '#/components/schemas/PlaygroundConversationSummary'
        - type: object
          properties:
            messages:
              type: array
              items:
                type: object

    Usage:
      type: object
      properties:
        prompt_tokens:
          type: integer
        completion_tokens:
          type: integer
        total_tokens:
          type: integer

    Error:
      type: object
      properties:
        error:
          type: object
          properties:
            message:
              type: string
            type:
              type: string
              example: invalid_request_error
            code:
              type: string
            param:
              type: string

  responses:
    BadRequest:
      description: Invalid request. The error `message` and `param` fields identify the offending parameter.
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/Error'
          examples:
            invalid_aspect_ratio:
              value:
                error:
                  message: "Aspect ratio is not supported by this model. Allowed: 16:9, 9:16, 1:1, ..."
                  type: invalid_request_error
                  code: invalid_parameter
                  param: aspect_ratio

    Unauthorized:
      description: Missing or invalid API key.
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/Error'

    InsufficientCredits:
      description: Workspace is out of prepaid credits. Top up from the dashboard.
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/Error'

    RateLimited:
      description: Rate limit exceeded for this API key. Retry with backoff.
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/Error'

    NotFound:
      description: Resource not found (model id, job id, or file path).
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/Error'


```
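
The `Job` and `AccountUsageResponse` schemas above imply two client-side loops: polling a job until `status` leaves `processing`, and following `next_cursor` while `has_more` is true. The sketch below shows one way to write both in Python. It is illustrative, not an official client. `fetch_job`, `fetch_page`, `JobFailed`, and the default `interval` and `timeout` values are hypothetical names and choices made for this example. The schemas shown here do not say how a cursor is sent back to the server, so that detail is left to the caller-supplied `fetch_page`.

```python
import time
from typing import Any, Callable, Dict, Iterator, Optional


class JobFailed(Exception):
    """Hypothetical error type raised when Job.status is "failed"."""


def wait_for_job(
    fetch_job: Callable[[], Dict[str, Any]],
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[Dict[str, Any]]:
    """Poll until the job leaves "processing", then return Job.result.

    fetch_job is a caller-supplied function (an assumption of this sketch)
    that performs the GET on poll_url and returns the decoded Job object.
    """
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_job()
        status = job.get("status")
        if status == "completed":
            return job.get("result")  # nullable in the schema
        if status == "failed":
            raise JobFailed(job.get("error"))
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job.get('job_id')} is still {status!r}")
        sleep(interval)


def iter_usage_events(
    fetch_page: Callable[[Optional[str]], Dict[str, Any]],
) -> Iterator[Dict[str, Any]]:
    """Yield every UsageEvent, following next_cursor while has_more is true.

    fetch_page(cursor) is a caller-supplied function (an assumption of this
    sketch) that returns one decoded AccountUsageResponse. It receives None
    for the first page.
    """
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        yield from page.get("data", [])
        # Stop when the server reports no further pages or omits the cursor.
        if not page.get("has_more") or not page.get("next_cursor"):
            return
        cursor = page["next_cursor"]
```

In practice, `fetch_job` would wrap an authenticated GET on the `poll_url` returned in `JobAccepted`, using either header from the Authentication section. Injecting the fetch functions keeps the loops free of HTTP details and lets them be tested without a network.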