ARouter’s request and response schemas are very similar to the OpenAI Chat API, with a few small differences. ARouter normalizes the schema across models and providers so you only need to learn one.

OpenAPI Specification

The complete ARouter API is documented with the OpenAPI specification; ARouter currently publishes it as a YAML file. You can load it into tools like Swagger UI or Postman to explore the API or generate client libraries.

Base URL

https://api.arouter.ai

Authentication

All endpoints (except /healthz) require authentication via one of:
| Method | Header / Parameter | Used By |
| --- | --- | --- |
| Bearer Token | Authorization: Bearer <key> | OpenAI SDK, most clients |
| API Key Header | X-Api-Key: <key> | Anthropic SDK |
| Query Parameter | ?key=<key> | Gemini SDK |
See the Authentication Guide for details.
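The three methods carry the same key. A minimal sketch in TypeScript; the key value and the Gemini model name are placeholders, not real values:

```typescript
// Three equivalent ways to pass an ARouter API key.
// "lr_live_xxxx" and "gemini-pro" are placeholders.
const key = 'lr_live_xxxx';

// 1. Bearer token (OpenAI SDK, most clients)
const bearerHeaders = { Authorization: `Bearer ${key}` };

// 2. API key header (Anthropic SDK)
const apiKeyHeaders = { 'X-Api-Key': key };

// 3. Query parameter (Gemini SDK)
const url = new URL('https://api.arouter.ai/v1beta/models/gemini-pro:generateContent');
url.searchParams.set('key', key);
```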

Requests

Request Format

Here is the request schema as a TypeScript type. This will be the body of your POST request to the /v1/chat/completions endpoint. For a complete list of parameters, see the Parameters reference.
type Request = {
  // ARouter standardizes chat requests on "messages"
  messages?: Message[];

  // If "model" is unspecified, defaults to the tenant's configured default
  model?: string; // e.g. "openai/gpt-5.4" or "anthropic/claude-sonnet-4.6"

  // Force the model to produce specific output format.
  // See "Structured Outputs" guide for supported models.
  response_format?: ResponseFormat;

  stop?: string | string[];
  stream?: boolean; // Enable streaming

  // See Parameters reference
  max_tokens?: number; // Range: [1, context_length)
  temperature?: number; // Range: [0, 2]

  // Tool calling
  // Passed through to providers implementing OpenAI's interface.
  // For providers with custom interfaces, transformed accordingly.
  tools?: Tool[];
  tool_choice?: ToolChoice;
  parallel_tool_calls?: boolean; // Default: true

  // Advanced optional parameters
  seed?: number;
  top_p?: number; // Range: (0, 1]
  top_k?: number; // Range: [1, Infinity) — not available for OpenAI models
  frequency_penalty?: number; // Range: [-2, 2]
  presence_penalty?: number; // Range: [-2, 2]
  repetition_penalty?: number; // Range: (0, 2]
  logit_bias?: { [key: number]: number };
  top_logprobs?: number;
  min_p?: number; // Range: [0, 1]
  top_a?: number; // Range: [0, 1]

  // Reduce latency by providing a predicted output
  prediction?: { type: 'content'; content: string };

  // ARouter routing parameters
  models?: string[]; // Ordered candidate model list — see Model Routing guide
  route?: string;    // Routing mode used with models[]

  // A stable identifier for your end-users (for abuse detection)
  user?: string;
};
ARouter’s OpenAI-compatible chat endpoint standardizes on messages. If you want to send a provider’s native request body instead, use the native endpoint for that provider or the Provider Proxy.

Message Types

type TextContent = {
  type: 'text';
  text: string;
};

type ImageContentPart = {
  type: 'image_url';
  image_url: {
    url: string; // URL or base64 encoded image data
    detail?: string; // Optional, defaults to "auto"
  };
};

type ContentPart = TextContent | ImageContentPart;

type Message =
  | {
      role: 'user' | 'assistant' | 'system';
      // ContentParts are only for the "user" role:
      content: string | ContentPart[];
      // If "name" is included, it will be prepended like this
      // for non-OpenAI models: `{name}: {content}`
      name?: string;
    }
  | {
      role: 'tool';
      content: string;
      tool_call_id: string;
      name?: string;
    };
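As a sketch, a multimodal user message built from these types; the image URL is a placeholder:

```typescript
// A user message mixing text and image content parts.
// ContentPart arrays are only valid for the "user" role.
const message = {
  role: 'user',
  content: [
    { type: 'text', text: 'What is in this image?' },
    {
      type: 'image_url',
      image_url: { url: 'https://example.com/photo.png', detail: 'auto' },
    },
  ],
};
```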

Tool Types

type FunctionDescription = {
  description?: string;
  name: string;
  parameters: object; // JSON Schema object
};

type Tool = {
  type: 'function';
  function: FunctionDescription;
};

type ToolChoice =
  | 'none'
  | 'auto'
  | 'required'
  | {
      type: 'function';
      function: {
        name: string;
      };
    };
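A sketch of a tool definition with a forced tool_choice; the get_weather function and its schema are illustrative, not part of the API:

```typescript
// One function tool; "parameters" is a standard JSON Schema object.
const tools = [
  {
    type: 'function',
    function: {
      name: 'get_weather',
      description: 'Get the current weather for a city',
      parameters: {
        type: 'object',
        properties: { city: { type: 'string' } },
        required: ['city'],
      },
    },
  },
];

// Force the model to call get_weather instead of choosing freely.
const tool_choice = { type: 'function', function: { name: 'get_weather' } };
```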

Structured Outputs

The response_format parameter enforces structured JSON responses from the model. ARouter supports two modes:
  • { type: 'json_object' }: Basic JSON mode — the model returns valid JSON
  • { type: 'json_schema', json_schema: { ... } }: Strict schema mode — the model returns JSON matching your exact schema
type ResponseFormat =
  | { type: 'json_object' }
  | {
      type: 'json_schema';
      json_schema: {
        name: string;
        strict?: boolean;
        schema: object; // JSON Schema object
      };
    };
For detailed usage and examples, see Structured Outputs. When using JSON mode, you should still instruct the model to respond with JSON in your system or user message.
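A sketch of a strict-schema response_format; the weather schema is illustrative:

```typescript
// Strict schema mode: the model must return JSON matching this schema.
const response_format = {
  type: 'json_schema',
  json_schema: {
    name: 'weather',
    strict: true,
    schema: {
      type: 'object',
      properties: {
        city: { type: 'string' },
        temperature: { type: 'number' },
      },
      required: ['city', 'temperature'],
      additionalProperties: false,
    },
  },
};
```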

Optional Request Headers

ARouter supports these optional headers to identify your application:
  • HTTP-Referer: Your app’s URL, used for source tracking in the Dashboard
  • X-Title: Your app’s display name used in ARouter analytics
fetch('https://api.arouter.ai/v1/chat/completions', {
  method: 'POST',
  headers: {
    Authorization: 'Bearer lr_live_xxxx',
    'HTTP-Referer': 'https://myapp.com', // Optional
    'X-Title': 'My AI App',             // Optional
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'openai/gpt-5.4',
    messages: [{ role: 'user', content: 'Hello!' }],
  }),
});
See Request Attribution for details.

Model Routing

Specify the model using the provider/model format. ARouter parses the provider prefix and routes to the correct upstream. If the model parameter is omitted, the tenant’s configured default is used. See the Model Routing guide for the full routing logic, including ordered candidate model lists using models[] and route.
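As a sketch, a request body with an ordered candidate list; the "fallback" route value is an assumption here, so consult the Model Routing guide for the modes ARouter actually supports:

```typescript
// Ordered candidates: ARouter tries the first model and falls through
// per the routing mode. "fallback" is an illustrative mode name.
const body = {
  models: ['openai/gpt-5.4', 'anthropic/claude-sonnet-4.6'],
  route: 'fallback',
  messages: [{ role: 'user', content: 'Hello!' }],
};
```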

Streaming

Set stream: true to receive token-by-token responses via Server-Sent Events. SSE streams can include comment lines in addition to data: payloads, and clients should ignore those comments. See the Streaming guide for SSE format, usage chunks, cancellation, and error handling.
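A minimal sketch of consuming such a stream, run here against a hand-written sample rather than a live response:

```typescript
// Accumulate delta content from "data:" lines, ignore SSE comment lines
// (which start with ":"), and stop at the [DONE] sentinel.
const sample = [
  ': keep-alive comment, ignored by clients',
  'data: {"choices":[{"delta":{"content":"Hel"}}]}',
  'data: {"choices":[{"delta":{"content":"lo"}}]}',
  'data: [DONE]',
].join('\n\n');

let text = '';
for (const line of sample.split('\n')) {
  if (!line.startsWith('data: ')) continue; // skips comments and blank lines
  const payload = line.slice('data: '.length);
  if (payload === '[DONE]') break;
  const chunk = JSON.parse(payload);
  text += chunk.choices[0]?.delta?.content ?? '';
}
// text is now "Hello"
```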

Non-Standard Parameters

If the chosen model doesn't support a request parameter (such as logit_bias for non-OpenAI models, or top_k for OpenAI models), that parameter is silently dropped. All supported parameters are forwarded to the upstream model API.

Assistant Prefill

ARouter supports asking models to complete a partial response. Include a message with role: "assistant" at the end of your messages array:
fetch('https://api.arouter.ai/v1/chat/completions', {
  method: 'POST',
  headers: {
    Authorization: 'Bearer lr_live_xxxx',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'openai/gpt-5.4',
    messages: [
      { role: 'user', content: 'What is the meaning of life?' },
      { role: 'assistant', content: "I'm not sure, but my best guess is" },
    ],
  }),
});

Responses

Response Format

ARouter normalizes the schema across models and providers to comply with the OpenAI Chat API. choices is always an array. Each choice contains a delta property if streaming was requested, and a message property otherwise.
type Response = {
  id: string;
  choices: (NonStreamingChoice | StreamingChoice)[];
  created: number; // Unix timestamp
  model: string;   // The model that was actually used (e.g. "openai/gpt-5.4")
  object: 'chat.completion' | 'chat.completion.chunk';
  provider?: string; // The upstream provider that handled the request

  system_fingerprint?: string; // Only present if the provider supports it

  // Usage data is always returned for non-streaming.
  // When streaming, usage is returned exactly once in the final chunk,
  // before the [DONE] message, with an empty choices array.
  usage?: ResponseUsage;
};

Choice Types

type NonStreamingChoice = {
  finish_reason: string | null; // Normalized finish reason
  native_finish_reason: string | null; // Raw finish reason from the provider
  message: {
    content: string | null;
    role: string;
    tool_calls?: ToolCall[];
  };
  error?: ErrorResponse;
};

type StreamingChoice = {
  finish_reason: string | null;
  native_finish_reason: string | null;
  delta: {
    content: string | null;
    role?: string;
    tool_calls?: ToolCall[];
  };
  error?: ErrorResponse;
};

type ToolCall = {
  id: string;
  type: 'function';
  function: {
    name: string;
    arguments: string; // JSON string
  };
};
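Note that function.arguments is a JSON string, not an object. A sketch of unpacking it; the tool call below is made up:

```typescript
// A tool call as it appears in a response; arguments must be parsed.
const toolCall = {
  id: 'call_abc123',
  type: 'function',
  function: { name: 'get_weather', arguments: '{"city":"Paris"}' },
};

const args = JSON.parse(toolCall.function.arguments) as { city: string };
// args.city is "Paris"
```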

type ErrorResponse = {
  code: number;
  message: string;
  metadata?: Record<string, unknown>;
};

Usage

type ResponseUsage = {
  /** Tokens in the prompt, including images and tool definitions if any */
  prompt_tokens: number;
  /** Tokens generated in the completion */
  completion_tokens: number;
  /** Sum of prompt_tokens and completion_tokens */
  total_tokens: number;

  /** Breakdown of prompt tokens */
  prompt_tokens_details?: {
    cached_tokens: number;        // Tokens read from cache (cache hit)
    cache_write_tokens?: number;  // Tokens written to cache
    audio_tokens?: number;        // Tokens used for input audio
  };

  /** Breakdown of completion tokens */
  completion_tokens_details?: {
    reasoning_tokens?: number;              // Tokens generated for reasoning
    accepted_prediction_tokens?: number;    // Accepted predicted output tokens
    rejected_prediction_tokens?: number;    // Rejected predicted output tokens
    audio_tokens?: number;                  // Tokens generated for audio output
    image_tokens?: number;                  // Tokens generated for image output
  };
};
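As a sketch with made-up numbers, the fields relate like this:

```typescript
// total_tokens is prompt + completion; cached_tokens counts the part of
// the prompt served from cache. All values below are illustrative.
const usage = {
  prompt_tokens: 100,
  completion_tokens: 40,
  total_tokens: 140,
  prompt_tokens_details: { cached_tokens: 80 },
};

const uncachedPromptTokens =
  usage.prompt_tokens - (usage.prompt_tokens_details?.cached_tokens ?? 0);
// 100 - 80 = 20 prompt tokens were processed fresh
```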

Response Example

{
  "id": "chatcmpl-xxxxxxxxxxxxxx",
  "choices": [
    {
      "finish_reason": "stop",
      "native_finish_reason": "stop",
      "message": {
        "role": "assistant",
        "content": "Hello there!"
      }
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 4,
    "total_tokens": 14,
    "prompt_tokens_details": {
      "cached_tokens": 0
    },
    "completion_tokens_details": {
      "reasoning_tokens": 0
    }
  },
  "model": "openai/gpt-5.4",
  "provider": "openai",
  "object": "chat.completion",
  "created": 1748000000
}

Finish Reason

ARouter normalizes each model’s finish_reason to one of the following values:
| Value | Description |
| --- | --- |
| stop | Model completed naturally |
| length | max_tokens limit reached |
| tool_calls | Model wants to call a tool |
| content_filter | Content moderation triggered |
| error | An error occurred during generation |
The raw finish reason from the upstream provider is available via native_finish_reason.
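A sketch of branching on the normalized values; the handler strings are placeholders for your own logic:

```typescript
// Map each normalized finish_reason to a description of what to do next.
function describeFinish(reason: string | null): string {
  switch (reason) {
    case 'stop':
      return 'completed naturally';
    case 'length':
      return 'truncated: raise max_tokens or continue the conversation';
    case 'tool_calls':
      return 'execute the requested tool and send a role:"tool" message';
    case 'content_filter':
      return 'blocked by content moderation';
    case 'error':
      return 'generation failed; inspect choice.error';
    default:
      return 'unknown finish reason';
  }
}
```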

Endpoint Groups

OpenAI Compatible

/v1/chat/completions, /v1/embeddings, /v1/models
Use with any OpenAI-compatible SDK. Supports provider/model routing.

Anthropic Native

/v1/messages, /v1/messages/batches, /v1/messages/count_tokens
Drop-in compatible with the Anthropic SDK.

Gemini Native

/v1beta/models/{model}:generateContent
Drop-in compatible with the Google Gemini SDK.

Key Management

/api/v1/keys
Create, list, update, and delete API keys.

Billing

/api/v1/balance, /api/v1/transactions
Query account balance and transaction history.

Provider Proxy

/{provider}/{path}
Proxy requests directly to any supported provider.

Rate Limits

Rate limits are applied per API key. Default limits can be customized per key via the Dashboard or the management API.
| Header | Description |
| --- | --- |
| X-RateLimit-Limit | Maximum requests per window |
| X-RateLimit-Remaining | Requests remaining |
| X-RateLimit-Reset | Window reset time (Unix timestamp) |
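A sketch of reading these headers; the Headers object is constructed by hand to stand in for a real fetch() response:

```typescript
// Values are illustrative: a fully exhausted rate-limit window.
const headers = new Headers({
  'X-RateLimit-Limit': '60',
  'X-RateLimit-Remaining': '0',
  'X-RateLimit-Reset': '1748000060',
});

const remaining = Number(headers.get('X-RateLimit-Remaining'));
const resetMs = Number(headers.get('X-RateLimit-Reset')) * 1000;
// When the window is exhausted, wait until the reset timestamp.
const waitMs = remaining === 0 ? Math.max(0, resetMs - Date.now()) : 0;
```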

Error Responses

All errors follow a consistent JSON format:
{
  "error": {
    "message": "description of what went wrong",
    "type": "error_type"
  }
}
See the Error Handling guide for the full list of error codes and retry strategies.
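A sketch of handling the envelope; the rate_limit_error type string is an assumption here, so check the Error Handling guide for the real taxonomy:

```typescript
// Parse the error envelope and decide whether the request is retryable.
// "rate_limit_error" is an assumed type value, used for illustration.
const body = '{"error":{"message":"rate limit exceeded","type":"rate_limit_error"}}';
const parsed = JSON.parse(body) as { error: { message: string; type: string } };

const retryableTypes = new Set(['rate_limit_error']);
const shouldRetry = retryableTypes.has(parsed.error.type);
```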