API Reference

Complete REST surface — chat, embeddings, reranks, images, video, 3D, audio, transcription, search, detection, jobs

EmpirioLabs speaks OpenAI- and Anthropic-compatible request shapes. Drop in any SDK, point it at https://api.empiriolabs.ai, and authenticate with your EmpirioLabs API key. Every endpoint below works against any OpenAI or Anthropic client unchanged.

Authentication

Every request requires a bearer token. Either header is accepted on every endpoint:

Authorization: Bearer $EMPIRIOLABS_API_KEY
x-api-key: $EMPIRIOLABS_API_KEY

import os

from openai import OpenAI

# Read the key from the environment instead of passing a literal string.
client = OpenAI(
    base_url="https://api.empiriolabs.ai",
    api_key=os.environ["EMPIRIOLABS_API_KEY"],
)

resp = client.chat.completions.create(
    model="deepseek-v4-pro",
    messages=[{"role": "user", "content": "Hello!"}],
)

Endpoint surface

Chat completions

OpenAI-compatible chat. Streaming, tool calling, vision, audio input, JSON mode, structured output, reasoning controls.

Legacy completions

OpenAI-compatible prompt completions for models that advertise POST /v1/completions.

Anthropic Messages

Drop-in for Anthropic SDK clients. tool_use / tool_result blocks round-trip cleanly.

Images

Generate, edit, inpaint, image variations. Hosted CDN URLs, 7-day signed.

Video

Async video generation. Returns a job_id; poll the jobs endpoint for the URL.

Audio (TTS, music, voices)

TTS plus real-time streaming TTS (Inworld), music / podcast / SFX generation, voice clone management.

Agents

Long-running tool-using agent tasks. Start, poll, stream messages, stop early.

Transcription

Whisper / Deepgram / Parakeet. Multipart upload or file_url.

Search and research

Exa, Tavily, Linkup, Perplexity Search. Domain filters, date ranges, geo bias.

3D generation

Async image-to-3D asset generation. Returns a job_id; poll for the signed GLB URL.

Detection

POST /v1/detect — GPTZero AI-detection, bibliography scan, source analysis.

Embeddings

OpenAI-compatible embeddings. Multilingual text + multimodal embedders.

Reranks

Semantic document reranking. Sort retrieval candidates by relevance for RAG and search refinement.

File URLs

Pass any public URL on input fields. No upload, no re-sign — generated outputs are valid for 7 days.

Jobs

Poll the status / result of any async generation. State retained 1 hour after completion.

Models

Live catalog with pricing, parameter schema, capability flags, regions.

Errors

OpenAI- and Anthropic-compatible error envelopes.

Chat completions

POST /v1/chat/completions

Pass any chat-capable model from the catalog as model. Streaming uses Server-Sent Events with data: ... lines and a final data: [DONE].

curl https://api.empiriolabs.ai/v1/chat/completions \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-v4-pro",
    "messages": [{"role": "user", "content": "Summarize CRDT consistency in 3 bullets."}],
    "stream": true,
    "temperature": 0.7
  }'
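
The SSE framing described above (data: ... lines followed by a final data: [DONE]) can be consumed without an SDK. The following is a minimal sketch, assuming each data: payload is an OpenAI-style chunk carrying choices[].delta.content; the helper names are illustrative and not part of the EmpirioLabs API.

```python
import json
from typing import Iterable, Iterator


def iter_sse_payloads(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the decoded JSON payload of each `data:` line until `[DONE]`."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return  # end-of-stream sentinel
        yield json.loads(payload)


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate delta.content across chunks (assumed OpenAI-style chunk shape)."""
    parts = []
    for chunk in iter_sse_payloads(lines):
        for choice in chunk.get("choices", []):
            content = (choice.get("delta") or {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts)


# Hand-written sample stream; a real one would come from the HTTP response body.
sample = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
    'data: {"choices": [{"delta": {"content": "ignored"}}]}',
]
print(collect_text(sample))
```

Anything after the [DONE] sentinel is ignored, so the reader can stop as soon as the sentinel arrives.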

Every model’s accepted parameters live on its docs page (e.g. temperature, top_p, enable_thinking, reasoning_effort, web_search_tier). Browse them under Providers and Models.

Model parameters across endpoints

Model-specific parameters advertised on the model page and in GET /v1/models/{id} can be sent to /v1/chat/completions, /v1/responses, and /v1/messages when that model supports the endpoint. The gateway adapts request shapes so the same controls reach the underlying model.

For thinking-capable models, enable_thinking and thinking_budget are accepted on all three text endpoints. On /v1/messages, you can also use Anthropic-style thinking:

{
  "thinking": {
    "type": "enabled",
    "budget_tokens": 1024
  }
}

That maps to the same enable_thinking=true and thinking_budget=1024 controls used by Chat Completions and Responses.
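
To make the mapping concrete, here is a small client-side sketch of the same translation. It only illustrates the equivalence stated above; the function name is hypothetical, and treating any type other than "enabled" as thinking-off is an assumption, not documented gateway behavior.

```python
def thinking_to_gateway_controls(thinking: dict) -> dict:
    """Translate an Anthropic-style `thinking` block into the flat controls.

    Assumption: a `type` other than "enabled" means thinking is off.
    """
    enabled = thinking.get("type") == "enabled"
    controls = {"enable_thinking": enabled}
    if enabled and "budget_tokens" in thinking:
        controls["thinking_budget"] = thinking["budget_tokens"]
    return controls


print(thinking_to_gateway_controls({"type": "enabled", "budget_tokens": 1024}))
```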

Legacy completions

POST /v1/completions

Use this endpoint for OpenAI-compatible clients that still send a raw prompt instead of chat messages. Only models that list POST /v1/completions in supported_endpoints accept this shape.

Streaming uses Server-Sent Events and includes usage when the model service reports it.

curl https://api.empiriolabs.ai/v1/completions \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-5-9b",
    "prompt": "Write one concise launch sentence.",
    "max_tokens": 64,
    "stream": true
  }'

Anthropic Messages

POST /v1/messages

Drop-in for any Anthropic SDK client — the same models accessible on /v1/chat/completions and /v1/responses are reachable here under the Anthropic Messages shape.

curl https://api.empiriolabs.ai/v1/messages \
  -H "x-api-key: $EMPIRIOLABS_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-v4-pro",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Hi!"}]
  }'

tool_use and tool_result blocks round-trip cleanly. Mixed text-plus-tool_use content arrays are preserved.

Image generation

POST /v1/images/generations

curl https://api.empiriolabs.ai/v1/images/generations \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "wan-2-7-image",
    "prompt": "A glass cathedral at sunset, dramatic lighting",
    "aspect_ratio": "16:9",
    "resolution": "4K",
    "thinking_mode": true,
    "num_images": 4
  }'

Image-edit flows accept image: ["https://..."] with up to the model’s documented limit (3 for qwen-image-2-0, 9 for wan-2-7-image, 14 for seedream-5-0-lite). Image-set modes generate cohesive series — see each model’s page for the toggle.

Returned URLs live on https://media.empiriolabs.ai and expire after 7 days. Save anything you want to keep before the URL expires.

POST /v1/images/analysis runs vision-only analysis (no generation) on one or more input images. Use for layout extraction, object detection, OCR, and similar inspection tasks where the model returns text or JSON describing the image rather than a new picture.

Video generation

POST /v1/videos/generations

Always async — the endpoint returns a job_id and a polling URL.

curl https://api.empiriolabs.ai/v1/videos/generations \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "seedance-2-0-pro",
    "prompt": "A cinematic dolly shot of a city street at dusk in the rain",
    "resolution": "1080p",
    "aspect_ratio": "16:9",
    "duration": 8,
    "generate_audio": true
  }'

Audio generation

POST /v1/audio/speech is synchronous and returns a hosted URL by default; pass response_format: "b64_json" for inline audio bytes.

POST /v1/audio/speech:stream is real-time TTS. It returns Server-Sent Events as the model synthesizes, with sub-130ms time-to-first-byte on Inworld TTS Mini and sub-250ms on Max. Use it for voice agents and interactive playback. It is currently supported only on Inworld TTS Mini / Max; other TTS models use the synchronous endpoint.

POST /v1/audio/generations handles music, podcast, and sound-effect generation. It covers Stable Audio, GLM TTS, MOSS, and SoulX Podcast, where the prompt-to-audio shape differs from TTS.

GET /v1/voices lists and manages voices, including custom voice clones for Inworld TTS. Use the returned voice_id on either speech endpoint.

curl https://api.empiriolabs.ai/v1/audio/speech \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-2-5-flash-tts",
    "input": "Hello from EmpirioLabs.",
    "voice": "Charon",
    "output_format": "WAV"
  }'

Transcription

POST /v1/audio/transcriptions

Accepts either a multipart file upload or a JSON payload with file_url.

curl https://api.empiriolabs.ai/v1/audio/transcriptions \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -F "model=openai-whisper-1" \
  -F "file=@meeting.mp3" \
  -F "response_format=verbose_json" \
  -F "timestamp_granularities=word,segment"

Long files (over 5 minutes) auto-route to the async job system — the response includes a job_id instead of inline text. Poll the jobs endpoint to retrieve the final transcript.

Search and research

POST /v1/search is the unified search surface for retrieval-style models. The exact parameters each model accepts live on that model’s page (e.g. exa-search exposes 28 params including category, livecrawl, subpages, summary_query, code_tokens).

POST /v1/research serves deep research / multi-step retrieval models (Exa Research, Perplexity Deep Research, Linkup Deep Search). It generates a structured research report with cited sources.

POST /v1/answer serves direct question-answering models (Exa Answer). It returns a concise answer plus citations without the full report shape.

curl https://api.empiriolabs.ai/v1/search \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tavily-search",
    "query": "latest CRDT research papers 2026",
    "search_depth": "advanced",
    "include_answer": "advanced",
    "max_results": 10,
    "topic": "general"
  }'

Agents

Long-running, tool-using agent tasks (currently routed to Manus). Submit once, then poll for status and step-by-step messages, or stop early.

POST /v1/agents/run does double duty:

  • With no task_id it starts a fresh task. The response carries the new task_id.
  • With task_id it sends a follow-up message to an existing task. The agent picks it up on its next reasoning step.

GET /v1/agents/{task_id} retrieves the task’s current status and final result.

GET /v1/agents/{task_id}/messages lists every step the agent has emitted so far. This is useful for rendering a live reasoning trace alongside the final answer.

POST /v1/agents/{task_id}/stop stops a running task. Billing settles for the work the agent already completed.
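
The two modes of POST /v1/agents/run differ only in whether task_id is present. Here is a minimal sketch of a request-body builder, assuming the follow-up message travels in the same input field as the initial prompt (the text above does not name the field, so treat that as an assumption); the function name is hypothetical.

```python
from typing import Optional


def build_agent_run_body(text: str, task_id: Optional[str] = None,
                         model: str = "manus") -> dict:
    """Build the JSON body for POST /v1/agents/run.

    Without task_id the body starts a fresh task; with task_id it is a
    follow-up to an existing one. Assumption: follow-ups reuse `input`.
    """
    body = {"model": model, "input": text}
    if task_id is not None:
        body["task_id"] = task_id
    return body


print(build_agent_run_body("Summarize the top 5 humanoid robotics startups"))
print(build_agent_run_body("Focus on Series B and later", task_id="task_123"))
```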

curl https://api.empiriolabs.ai/v1/agents/run \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "manus",
    "input": "Summarize the top 5 humanoid robotics startups by funding raised in 2025-2026"
  }'

3D Generation

POST /v1/3d/generations

Image-to-3D generation is async. The endpoint returns a job_id and a polling URL; poll the jobs endpoint to retrieve the final signed GLB URL.

curl https://api.empiriolabs.ai/v1/3d/generations \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "trellis-2-4b",
    "image_url": "https://example.com/product-photo.png",
    "resolution": "1024",
    "texture_size": "2048",
    "decimation_target": 500000,
    "seed": 42
  }'

trellis-2-4b exposes the full image, resolution, sampler, texture, and mesh export parameter surface on its model page.

Detection

POST /v1/detect

Specialized text-classification endpoint. Currently powers GPTZero (AI-detection, bibliography scan, source analysis). Each model’s scan_type enum picks the upstream path; see the per-model docs for the full parameter surface.

curl https://api.empiriolabs.ai/v1/detect \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gptzero",
    "input": "The quick brown fox jumps over the lazy dog.",
    "scan_type": "ai_detection",
    "multilingual": false
  }'

GPTZero is also reachable via /v1/chat/completions and /v1/responses — pass the text on the message body and the gateway adapts the call. The detection summary comes back as the assistant message; pass disable_formatting: true to receive the raw upstream JSON instead.

Embeddings

POST /v1/embeddings

OpenAI-compatible embeddings. Multilingual text and multimodal (text + image + video) embedders are available.

curl https://api.empiriolabs.ai/v1/embeddings \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-embedding-v4",
    "input": ["Sentence one.", "Sentence two."],
    "encoding_format": "float"
  }'
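
A common next step is comparing the returned vectors. The sketch below assumes the response follows the OpenAI embeddings shape (a data array whose items carry index and embedding); the sample response is hand-written with tiny vectors, not real model output.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hand-written stand-in for an /v1/embeddings response (assumed OpenAI shape).
response = {
    "data": [
        {"index": 1, "embedding": [0.0, 1.0]},
        {"index": 0, "embedding": [1.0, 0.0]},
    ]
}

# Restore input order before pairing vectors with the original sentences.
vectors = [item["embedding"] for item in sorted(response["data"], key=lambda d: d["index"])]
print(cosine_similarity(vectors[0], vectors[1]))
```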

Reranks

POST /v1/reranks

Sort candidate documents by semantic relevance to a query. Returns each document’s original index plus a 0-1 relevance score (higher = more relevant). Use this to tighten the output of a vector store / BM25 / hybrid retriever before passing the top hits to a language model — the standard last step in a RAG pipeline.

curl https://api.empiriolabs.ai/v1/reranks \
  -H "Authorization: Bearer $EMPIRIOLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-rerank",
    "query": "What is a rerank model?",
    "documents": [
      "Rerank models sort candidate documents by relevance.",
      "Quantum computing is a cutting-edge field of computer science.",
      "Pre-trained language models advanced rerank models."
    ],
    "top_n": 2,
    "return_documents": true
  }'

The optional instruct parameter swaps between Q&A retrieval (default) and pure semantic-similarity sorting — see the qwen3-rerank model page for the full parameter table.
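
Because the response identifies documents by their original index, the caller maps scores back onto the candidate list. This is a minimal sketch, assuming the results arrive as a list of objects with index and relevance_score keys; those field names are an assumption, so check the model page for the actual response schema. The scores below are made up for illustration.

```python
from typing import List, Tuple


def top_documents(documents: List[str], results: List[dict]) -> List[Tuple[str, float]]:
    """Pair each result with its source document, highest score first.

    Assumed result shape: {"index": int, "relevance_score": float}.
    """
    ranked = sorted(results, key=lambda r: r["relevance_score"], reverse=True)
    return [(documents[r["index"]], r["relevance_score"]) for r in ranked]


documents = [
    "Rerank models sort candidate documents by relevance.",
    "Quantum computing is a cutting-edge field of computer science.",
    "Pre-trained language models advanced rerank models.",
]
# Illustrative scores only, not real model output.
results = [
    {"index": 2, "relevance_score": 0.5},
    {"index": 0, "relevance_score": 0.75},
]
for text, score in top_documents(documents, results):
    print(score, text)
```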

Usage object

Every endpoint that bills usage returns a usage field on the response (and on the terminal streaming chunk). Base shape:

  • cost_usd — exact amount your account was billed for the request. Authoritative.
  • prompt_tokens / completion_tokens / total_tokens — for chat-style models.
  • Cache fields (cache_read_input_tokens, cache_creation_input_tokens) — when prompt caching applies.

Models with tiered, per-call, or variant-priced upstreams stamp extra fields on usage so you can see which rate was applied:

  • Tier / variant pricing. Workers stamp a tier discriminator on usage when the same dimension has more than one rate. The primary field is pricing_tier_label (human-readable, e.g. "Medium context" / "Pro" / "2K"). Older workers may stamp the raw dimension directly instead (resolution, quality, mode, rate_tier). The dashboard renders the badge from whichever is present.
  • Per-call pricing. Workers that bill per tool invocation (search, fetch, code execution, etc.) stamp counts under tool_calls_details.<tool>.invocation or tool_usage.<tool>. The dashboard expands these into a per-tool breakdown automatically.
  • Per-dimension pricing. Workers that bill multiple dimensions in one request (e.g. citation tokens + reasoning tokens + search queries on deep-research models) stamp each dimension as its own field (citation_tokens, reasoning_tokens, num_search_queries, etc.).

The same fields drive the tier badge and per-tool breakdown on the dashboard usage logs, and they are also returned by the GET /v1/account/usage history endpoint under each event’s metadata.worker_usage (plus a structured tool_breakdown array for per-call models). So whether you read live response usage, account-usage history, or your dashboard, the tier and billing breakdown match exactly.
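
The fallback from pricing_tier_label to the older raw dimension fields can be mirrored client-side. The sketch below uses only field names listed above; the order in which legacy fields are checked and the helper names are assumptions, and the sample usage objects are invented.

```python
from typing import Iterable, Optional

# Raw dimension fields that older workers may stamp instead of the label.
LEGACY_TIER_FIELDS = ("resolution", "quality", "mode", "rate_tier")


def tier_label(usage: dict) -> Optional[str]:
    """Return the tier discriminator, preferring pricing_tier_label."""
    label = usage.get("pricing_tier_label")
    if label:
        return str(label)
    for field in LEGACY_TIER_FIELDS:  # assumed precedence order
        if usage.get(field) is not None:
            return str(usage[field])
    return None


def total_cost_usd(usages: Iterable[dict]) -> float:
    """Sum the authoritative cost_usd field across responses."""
    return sum(u.get("cost_usd", 0.0) for u in usages)


# Invented usage objects for illustration.
usages = [
    {"cost_usd": 0.25, "pricing_tier_label": "Pro"},
    {"cost_usd": 0.5, "resolution": "2K"},
    {"cost_usd": 0.125, "prompt_tokens": 10, "completion_tokens": 5},
]
print([tier_label(u) for u in usages], total_cost_usd(usages))
```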

File URLs

EmpirioLabs does not host user uploads. Pass any public URL directly on the input field of the model endpoint:

| Endpoint family | Input field | Accepts |
| --- | --- | --- |
| Chat completions (vision) | content[].image_url.url | Any public image URL or data: URI |
| Chat completions (audio) | content[].audio_url.url | Any public audio URL |
| Image generation (edit / variation) | image: ["https://..."] | Up to N URLs (model-specific limit) |
| Video generation (i2v / r2v / edit) | image: "https://..." / video: "https://..." | Public URLs |
| Audio TTS / music | n/a (text-only input) | |
| Audio transcription | file_url: "https://..." or multipart file=@local.mp3 | Public URL or direct upload of short clips |
| Search | n/a (query text) | |
| Embeddings (multimodal) | input[].url (image/video item) | Public URLs |
| Reranks | n/a (text query + text documents) | |

For audio transcription specifically, the multipart-direct upload on /v1/audio/transcriptions is the supported path for private clips that aren’t on a URL — those bytes flow straight to the speech-to-text worker without persistent storage.

Generated output URLs are signed and expire 7 days after creation. There is no re-sign endpoint. Save anything you need to keep — both the URL and the binary — within that window.
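
Since there is no re-sign endpoint, it can help to track the deadline explicitly. Here is a small sketch that derives the expiry from a creation timestamp you recorded yourself; whether any response field reports the exact signing time is not stated above, so the input is an assumption.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=7)  # the 7-day window stated above


def parse_utc(timestamp: str) -> datetime:
    """Parse an ISO 8601 timestamp such as 2026-05-02T17:11:32Z."""
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


def expires_at(created_at: str) -> datetime:
    """Creation time plus the retention window."""
    return parse_utc(created_at) + RETENTION


def is_expired(created_at: str, now: datetime) -> bool:
    return now >= expires_at(created_at)


print(expires_at("2026-05-02T17:11:32Z").isoformat())
print(is_expired("2026-05-02T17:11:32Z", datetime(2026, 5, 8, tzinfo=timezone.utc)))
```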

Async jobs

GET /v1/jobs/<job-id> — poll the status / final result of any async generation or transcription job.

Job state is retained for 1 hour after completion.

Job state shape

{
  "job_id": "job_01HV...",
  "status": "processing | completed | failed",
  "progress": 0.42,
  "created_at": "2026-05-02T17:11:32Z",
  "completed_at": null,
  "result": null,
  "error": null
}

When status is completed, the result field carries the full response in the same shape the synchronous endpoint would have returned.
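
A polling loop over this job shape might look like the sketch below. The HTTP call is injected as fetch_job so the loop stays independent of any client library; the interval, timeout, and function names are illustrative choices, not documented recommendations.

```python
import time
from typing import Callable


def wait_for_job(fetch_job: Callable[[], dict], interval_s: float = 2.0,
                 timeout_s: float = 600.0,
                 sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll until the job completes and return its `result` field.

    fetch_job should perform GET /v1/jobs/<job-id> and return the parsed JSON.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        job = fetch_job()
        status = job.get("status")
        if status == "completed":
            return job["result"]
        if status == "failed":
            raise RuntimeError(f"job {job.get('job_id')} failed: {job.get('error')}")
        if time.monotonic() >= deadline:
            raise TimeoutError("job did not finish before the timeout")
        sleep(interval_s)


# Simulated job states standing in for real HTTP responses.
states = iter([
    {"job_id": "job_demo", "status": "processing", "progress": 0.42, "result": None},
    {"job_id": "job_demo", "status": "completed", "progress": 1.0,
     "result": {"url": "https://media.empiriolabs.ai/..."}},
])
print(wait_for_job(lambda: next(states), sleep=lambda _: None))
```

Keep in mind that job state is retained for only 1 hour after completion, so a poller should not wait much longer than that between checks.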

Inbound HTTP timeout is 15 minutes. Synchronous chat completions running close to that limit should set stream=true so partial output flows back and the connection stays warm.

Models

GET /v1/models — list every available model.

GET /v1/models/<model-id> — full schema for one model, including its parameter table.

GET /v1/models?format=openrouter returns the OpenRouter model-listing shape for models marked ready for partner ingestion. See OpenRouter Model Listing for the exact response fields.

Each model returns:

| Field | Description |
| --- | --- |
| id | Canonical slug (e.g. wan-2-7-image) |
| description | Short marketing description |
| category | text / image / video / audio / transcription / research / tools / embedding / reranker |
| input_modalities | What the model accepts |
| output_modalities | What the model emits |
| context_window | Tokens (chat) or null (media) |
| region | Server region |
| logo | CDN URL to the model logo |
| pricing_rows | Per-token, per-image, per-second, or per-call rates |
| supported_endpoints | Which /v1/... endpoints accept this model |
| parameters | Full schema: name, type, default, min/max, allowed values, descriptions |
| features | Tags like streaming, vision, tool_calling, voice_cloning |

curl https://api.empiriolabs.ai/v1/models | jq '.data[0]'
curl https://api.empiriolabs.ai/v1/models/wan-2-7-image | jq '.parameters'
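
The supported_endpoints field makes it possible to pick models for a given route programmatically. The sketch below assumes each entry is a string such as "POST /v1/completions" (the form used in the Legacy completions section) or a bare path; the catalog entries are invented stand-ins, not the live list.

```python
from typing import List


def models_for_endpoint(models: List[dict], path: str) -> List[str]:
    """Return ids of models whose supported_endpoints include `path`.

    Accepts entries written as "POST /v1/..." or as the bare path.
    """
    matches = []
    for model in models:
        for entry in model.get("supported_endpoints", []):
            # The last whitespace-separated token is the path in both forms.
            if entry.split()[-1:] == [path]:
                matches.append(model["id"])
                break
    return matches


# Invented catalog entries for illustration.
catalog = [
    {"id": "qwen3-5-9b", "supported_endpoints": ["POST /v1/chat/completions", "POST /v1/completions"]},
    {"id": "deepseek-v4-pro", "supported_endpoints": ["POST /v1/chat/completions", "POST /v1/messages"]},
    {"id": "wan-2-7-image", "supported_endpoints": ["POST /v1/images/generations"]},
]
print(models_for_endpoint(catalog, "/v1/completions"))
```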

disable_formatting flag

Many chat, search, research, and rerank endpoints accept a disable_formatting=true flag. When set on a supporting model, the worker skips EmpirioLabs server-side formatting (citation rewriting, References block, thinking-block Markdown, etc.) and returns the upstream payload shape verbatim.

Coverage is advertised per-model. Check supports_passthrough in GET /v1/models/{id} to confirm a specific model honors the flag. Models that advertise supports_passthrough: true also accept the aliases raw=true, passthrough=true, and raw_response=true. Models without that field accept only the canonical disable_formatting=true form (or do not honor passthrough at all). The model card lists which aliases each model accepts.

Image, video, audio-generation, transcription, and embedding endpoints do not accept this flag, since there is no formatting layer to disable on those endpoints.

Generated media retention

Generated images, videos, and audio are returned as signed URLs that are valid for 7 days. After that, the URL stops working and the asset is gone — there is no re-sign endpoint. Save anything you want to keep before the 7-day window expires.

Errors

OpenAI envelope on chat / responses / images / videos / audio / search / embeddings / reranks:

{
  "error": {
    "message": "Aspect ratio is not supported by this model. Allowed: 16:9, 9:16, 1:1, ...",
    "type": "invalid_request_error",
    "code": "invalid_parameter",
    "param": "aspect_ratio"
  }
}

Anthropic envelope on /v1/messages:

{ "type": "error", "error": { "type": "invalid_request_error", "message": "..." } }
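
Clients that call both endpoint families may want one error type. Here is a minimal sketch that flattens the two envelopes shown above into a common dict; the output keys and the envelope tag are this example's own convention, not part of the API.

```python
def normalize_error(body: dict) -> dict:
    """Flatten an OpenAI- or Anthropic-style error body into one shape.

    The Anthropic envelope is recognized by its top-level "type": "error".
    """
    err = body.get("error") or {}
    return {
        "envelope": "anthropic" if body.get("type") == "error" else "openai",
        "type": err.get("type"),
        "message": err.get("message"),
        "code": err.get("code"),    # absent in the Anthropic envelope shown above
        "param": err.get("param"),  # absent in the Anthropic envelope shown above
    }


openai_body = {"error": {"message": "bad ratio", "type": "invalid_request_error",
                         "code": "invalid_parameter", "param": "aspect_ratio"}}
anthropic_body = {"type": "error",
                  "error": {"type": "invalid_request_error", "message": "..."}}
print(normalize_error(openai_body))
print(normalize_error(anthropic_body))
```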

Headers reference

| Header | Required | Purpose |
| --- | --- | --- |
| Authorization / x-api-key | yes | Bearer token authentication |
| Content-Type | yes on POST | application/json or multipart/form-data |
| Accept | no | text/event-stream to force SSE on chat endpoints |
| anthropic-version | when calling /v1/messages | Anthropic API version (e.g. 2023-06-01) |

Browse the per-model parameter schemas under Providers and Models. When you click into a specific model, every parameter the model accepts — type, default, range, allowed values, conditional flags — is documented in a table generated from the live database.