Models & catalog

The catalog is the source of truth for which models BatchRouter can route to, what each one costs per provider, and which hosted tools it supports. Read it before you quote or submit a batch so you reference a model slug that actually exists and matches your operation, region, and privacy needs.

GET /v1/catalog/models is public and requires no authentication — call it from anywhere, including the browser, to discover available models.

The base call returns every model with its routing eligibility, context limits, and per-provider pricing.

curl https://api.batchrouter.com/v1/catalog/models

The response is { data: CatalogModel[], provider_count, pricing_updated_at }.
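As a minimal sketch of reading that envelope from Python with only the standard library: the helper names `fetch_catalog` and `summarize_catalog` are illustrative and not part of any BatchRouter SDK, and the code assumes the endpoint returns a JSON object with the three keys listed above.

```python
import json
from urllib.request import urlopen

CATALOG_URL = "https://api.batchrouter.com/v1/catalog/models"


def fetch_catalog(url: str = CATALOG_URL, timeout: float = 10.0) -> dict:
    """Fetch the public catalog. No auth header is sent because none is required."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def summarize_catalog(catalog: dict) -> dict:
    """Extract the envelope fields, relying only on the documented keys."""
    models = catalog.get("data", [])
    return {
        "slugs": [m["slug"] for m in models],
        "provider_count": catalog.get("provider_count"),
        "pricing_updated_at": catalog.get("pricing_updated_at"),
    }
```

Calling `summarize_catalog(fetch_catalog())` gives a quick view of which slugs exist and how fresh the pricing is before you build a quote.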

Narrow the list with up to three query parameters. They combine with AND.

| Parameter | Values | Filters to |
| --- | --- | --- |
| operation | responses, embeddings, vision | Models that support that operation type |
| provider | a provider slug (e.g. openai, anthropic) | Models offered by that provider |
| hosted_tool | web_search, python_execution, calculator, time, file_search, retrieval | Models whose offering advertises that first-class hosted tool |

For example, list responses-capable models that can run web search:

curl "https://api.batchrouter.com/v1/catalog/models?operation=responses&hosted_tool=web_search"
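The same filtered URL can be assembled programmatically. The sketch below is one possible approach; `catalog_url` is a hypothetical helper, and rejecting unknown parameter names is a client-side choice made here, not documented server behavior.

```python
from urllib.parse import urlencode

CATALOG_URL = "https://api.batchrouter.com/v1/catalog/models"
# The three query parameters documented for the catalog endpoint.
ALLOWED_FILTERS = ("operation", "provider", "hosted_tool")


def catalog_url(**filters: str) -> str:
    """Build a catalog URL from the documented filters; unknown names raise."""
    unknown = set(filters) - set(ALLOWED_FILTERS)
    if unknown:
        raise ValueError(f"unsupported filter(s): {sorted(unknown)}")
    # Keep a stable parameter order and skip filters that were left empty.
    params = [(name, filters[name]) for name in ALLOWED_FILTERS if filters.get(name)]
    return f"{CATALOG_URL}?{urlencode(params)}" if params else CATALOG_URL
```

For instance, `catalog_url(operation="responses", hosted_tool="web_search")` yields the URL used in the curl call above.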

Each CatalogModel describes the model once at the top level, then carries a provider_offerings array — one snapshot per provider that serves the model. The same model (for example, an open-weight model) can appear under several providers at different prices and in different regions; BatchRouter routes across all of them.

Top-level fields you’ll use most:

  • slug — the identifier you pass as model to quotes and batches.
  • display_name, provider — human label and the primary provider slug.
  • operations — the operation types this model supports (responses, embeddings, vision).
  • context_window, max_input_tokens, max_output_tokens — token limits.
  • input_modalities — accepted inputs (text, image, document, audio, video).
  • supports_file_uploads, accepted_file_types, max_file_size — whether you can reference files from POST /v1/files, the MIME types accepted, and the max upload size in bytes.
  • hosted_tools — the hosted tools available with this model (the same enum as the hosted_tool filter).
  • is_available — whether the model is currently routable.

Per-provider routing detail lives in each provider_offerings entry:

  • Pricing: price (per-provider, provider-declared rates). The top-level pricing_updated_at tells you how fresh the price data is.
  • Capacity: capacity_status (active, draining, paused), available_queue_items, available_queue_tokens, and a capacity_freshness object (fresh / stale_heartbeat / unknown). Lanes marked stale_heartbeat are not treated as live routable supply.
  • Regions: regions, the regions the offering runs in (used by region_unavailable rejection reasons).
  • Data retention & privacy: retention_days, supports_zdr (zero data retention), data_collection (allow / deny), training_use, privacy_tiers, and privacy_terms_url.
  • Hosted tools & files: hosted_tools, runtime_capabilities, accepted_file_types, supports_file_uploads.
  • Status: status (active, paused, deprecated, disabled) and metadata_verification_status (verified / unverified / undeclared).
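To make these fields concrete, here is a rough client-side pre-check over a model's offerings. It is a sketch under assumptions, not BatchRouter's routing logic: `eligible_offerings` is a hypothetical helper, and treating any capacity_status other than active as ineligible is a conservative choice made for the example.

```python
from typing import Optional


def eligible_offerings(model: dict, region: Optional[str] = None,
                       require_zdr: bool = False) -> list:
    """Return the offerings that pass simple status, region, and privacy checks."""
    kept = []
    for offering in model.get("provider_offerings", []):
        if offering.get("status") != "active":
            continue  # paused, deprecated, or disabled offering
        if offering.get("capacity_status") != "active":
            continue  # assumption: skip draining or paused capacity
        if region is not None and region not in offering.get("regions", []):
            continue  # offering does not run in the requested region
        if require_zdr and not offering.get("supports_zdr", False):
            continue  # zero data retention was requested but is not offered
        kept.append(offering)
    return kept
```

A check like this is only useful for narrowing candidates before a quote; the quote itself remains the authority on which lanes are accepted or rejected.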

The shape below is illustrative and abridged — real responses contain more fields and many models. Always read live data from GET /v1/catalog/models; prices and capacity change frequently.

{
  "data": [
    {
      "slug": "gpt-4o-mini",
      "display_name": "GPT-4o mini",
      "provider": "openai",
      "operations": ["responses", "vision"],
      "context_window": 128000,
      "max_output_tokens": 16384,
      "input_modalities": ["text", "image"],
      "supports_file_uploads": true,
      "accepted_file_types": ["image/*"],
      "hosted_tools": ["web_search"],
      "is_available": true,
      "provider_offerings": [
        {
          "provider": "openai",
          "provider_name": "OpenAI",
          "task_type": "responses",
          "context_window": 128000,
          "price": { "currency": "usd", "input_per_mtok": "0.15", "output_per_mtok": "0.60" },
          "capacity_status": "active",
          "regions": ["us", "eu"],
          "retention_days": 0,
          "supports_zdr": true,
          "hosted_tools": ["web_search"],
          "status": "active"
        }
      ]
    }
  ],
  "provider_count": 1,
  "pricing_updated_at": "2026-06-17T09:00:00Z"
}
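The price object in this sample lends itself to a rough cost estimate. The sketch below assumes that input_per_mtok and output_per_mtok are decimal strings giving the price per one million tokens, which the field names suggest but this page does not spell out; `estimate_cost` is a hypothetical helper, and a real quote remains the authoritative figure.

```python
from decimal import Decimal

# Assumption: "mtok" means one million tokens.
TOKENS_PER_MTOK = Decimal(1_000_000)


def estimate_cost(price: dict, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate workload cost in the offering's currency from per-mtok rates."""
    # Decimal avoids the rounding drift of binary floats on string prices.
    input_cost = Decimal(price["input_per_mtok"]) * input_tokens / TOKENS_PER_MTOK
    output_cost = Decimal(price["output_per_mtok"]) * output_tokens / TOKENS_PER_MTOK
    return input_cost + output_cost
```

With the sample rates, 2,000,000 input tokens cost 2 × 0.15 = 0.30 and 500,000 output tokens cost 0.5 × 0.60 = 0.30, so the estimate comes to 0.60 USD.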

You can choose a model in a quote (POST /v1/quotes/model) or batch (POST /v1/batches) in two ways:

  1. Pin one model. Set model to a single slug when you need a specific model. BatchRouter still routes across every provider that offers that slug and picks the cheapest eligible lane.

    { "model": "gpt-4o-mini" }
  2. Offer a fallback list. Set models to an array of candidate slugs and BatchRouter picks the cheapest one that satisfies your constraints (capacity, region, privacy tier, required tools). Use this when several models would do and price is what matters.

    { "models": ["gpt-4o-mini", "claude-haiku", "llama-3.1-8b"] }

The canonical item you’ll quote against:

{"customer_item_id":"item-1","operation":"responses","model":"gpt-4o-mini","input":{"messages":[{"role":"user","content":"Summarize: BatchRouter routes batch-AI workloads across providers."}]}}

If your workload needs a model that can call hosted tools, add required_tools to the quote or batch. BatchRouter only routes to lanes whose offering supports every requested tool; lanes that can’t are returned with a tool_support rejection reason in quote_lanes.

{
  "models": ["gpt-4o-mini", "claude-haiku"],
  "routing_mode": "cheapest",
  "required_tools": ["web_search", "python_execution"]
}

Valid tools are web_search, python_execution, calculator, time, file_search, and retrieval — the same set returned in each model’s hosted_tools. To pre-filter the catalog to tool-capable models before you quote, combine required_tools with the catalog’s hosted_tool query filter above.
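One way to do that pre-filtering on the client is sketched below. The helpers `tool_capable_slugs` and `quote_body` are hypothetical names, and the check uses each model's top-level hosted_tools only, so it is coarser than the per-lane evaluation BatchRouter performs when it quotes.

```python
# The tool names listed as valid on this page.
VALID_TOOLS = {
    "web_search", "python_execution", "calculator",
    "time", "file_search", "retrieval",
}


def tool_capable_slugs(models: list, required_tools: list) -> list:
    """Return slugs of models whose hosted_tools cover every required tool."""
    required = set(required_tools)
    unknown = required - VALID_TOOLS
    if unknown:
        raise ValueError(f"unknown tool(s): {sorted(unknown)}")
    return [m["slug"] for m in models if required <= set(m.get("hosted_tools", []))]


def quote_body(models: list, required_tools: list) -> dict:
    """Assemble a request body shaped like the required_tools example above."""
    return {
        "models": tool_capable_slugs(models, required_tools),
        "routing_mode": "cheapest",
        "required_tools": list(required_tools),
    }
```

Passing the catalog's data array as `models` keeps the candidate list limited to slugs that advertise every tool you need, which avoids quoting against models that would only come back with tool_support rejections.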