API Reference - ARouter

ARouter's request and response schemas closely follow the OpenAI Chat API, with a few small differences. ARouter normalizes the schema across all models and providers, so you only need to learn one.

OpenAPI specification

The complete ARouter API is documented with an OpenAPI specification:

YAML: https://api.arouter.ai/api-reference/openapi.yaml

ARouter currently publishes the specification in YAML. You can use tools such as Swagger UI or Postman to explore the API or generate client libraries.

Base URL

https://api.arouter.ai

Authentication

All endpoints except /healthz require authentication using one of the following methods:

Method	Header / parameter	Used by
Bearer token	`Authorization: Bearer <key>`	OpenAI SDK, most clients
API key header	`X-Api-Key: <key>`	Anthropic SDK
Query parameter	`?key=<key>`	Gemini SDK

See the Authentication guide for details.
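The three methods in the table differ only in where the key is placed. The sketch below illustrates this with a hypothetical helper, `withAuth`, which is not part of any ARouter SDK; it assumes the key can be appended to the query string with standard URL encoding.

```typescript
// The three authentication styles from the table above.
type AuthMethod = 'bearer' | 'x-api-key' | 'query';

interface AuthedRequest {
  url: string;
  headers: Record<string, string>;
}

// Hypothetical helper: attaches the key to the URL or to the headers,
// depending on the chosen method.
function withAuth(url: string, key: string, method: AuthMethod): AuthedRequest {
  switch (method) {
    case 'bearer':
      return { url, headers: { Authorization: `Bearer ${key}` } };
    case 'x-api-key':
      return { url, headers: { 'X-Api-Key': key } };
    case 'query': {
      // Append to an existing query string if there is one.
      const separator = url.indexOf('?') === -1 ? '?' : '&';
      return { url: `${url}${separator}key=${encodeURIComponent(key)}`, headers: {} };
    }
  }
}
```

The returned `url` and `headers` can be passed to `fetch` or any other HTTP client.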

Requests

Request format

The request schema is shown below as a TypeScript type. It is the body of a POST request to the /v1/chat/completions endpoint. See the Parameters reference for the full parameter list.

type Request = {
  // ARouter standardizes chat requests on "messages"
  messages?: Message[];

  // If "model" is unspecified, defaults to the tenant's configured default
  model?: string; // e.g. "openai/gpt-5.4" or "anthropic/claude-sonnet-4.6"

  // Force the model to produce specific output format.
  // See "Structured Outputs" guide for supported models.
  response_format?: ResponseFormat;

  stop?: string | string[];
  stream?: boolean; // Enable streaming

  // See Parameters reference
  max_tokens?: number; // Range: [1, context_length)
  temperature?: number; // Range: [0, 2]

  // Tool calling
  // Passed through to providers implementing OpenAI's interface.
  // For providers with custom interfaces, transformed accordingly.
  tools?: Tool[];
  tool_choice?: ToolChoice;
  parallel_tool_calls?: boolean; // Default: true

  // Advanced optional parameters
  seed?: number;
  top_p?: number; // Range: (0, 1]
  top_k?: number; // Range: [1, Infinity) — not available for OpenAI models
  frequency_penalty?: number; // Range: [-2, 2]
  presence_penalty?: number; // Range: [-2, 2]
  repetition_penalty?: number; // Range: (0, 2]
  logit_bias?: { [key: number]: number };
  top_logprobs?: number;
  min_p?: number; // Range: [0, 1]
  top_a?: number; // Range: [0, 1]

  // Reduce latency by providing a predicted output
  prediction?: { type: 'content'; content: string };

  // ARouter routing parameters
  models?: string[]; // Ordered candidate model list — see Model Routing guide
  route?: string;    // Routing mode used with models[]

  // A stable identifier for your end-users (for abuse detection)
  user?: string;
};

ARouter's OpenAI-compatible chat endpoint standardizes on messages. To send a provider's native request body, use that provider's native endpoint or the provider proxy.
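The numeric ranges noted in the schema comments can be checked on the client before a request is sent. The following is a minimal sketch of such a check; `rangeErrors` and `RANGES` are hypothetical names, and the sketch covers only the sampling parameters whose bounds are fixed numbers in the schema (`max_tokens` and `top_k` are left out because their upper bounds are not fixed).

```typescript
// Subset of the request schema: fields with fixed documented ranges.
type SamplingParams = {
  temperature?: number;        // [0, 2]
  top_p?: number;              // (0, 1]
  frequency_penalty?: number;  // [-2, 2]
  presence_penalty?: number;   // [-2, 2]
  repetition_penalty?: number; // (0, 2]
  min_p?: number;              // [0, 1]
  top_a?: number;              // [0, 1]
};

// Each entry is [min, max, minIsExclusive].
const RANGES: Record<keyof SamplingParams, [number, number, boolean]> = {
  temperature: [0, 2, false],
  top_p: [0, 1, true],
  frequency_penalty: [-2, 2, false],
  presence_penalty: [-2, 2, false],
  repetition_penalty: [0, 2, true],
  min_p: [0, 1, false],
  top_a: [0, 1, false],
};

// Returns the names of parameters that fall outside their documented range.
function rangeErrors(params: SamplingParams): string[] {
  const errors: string[] = [];
  for (const name of Object.keys(RANGES) as (keyof SamplingParams)[]) {
    const value = params[name];
    if (value === undefined) continue;
    const [min, max, minIsExclusive] = RANGES[name];
    const aboveMin = minIsExclusive ? value > min : value >= min;
    // NaN fails both comparisons and is therefore reported as an error.
    if (!(aboveMin && value <= max)) errors.push(name);
  }
  return errors;
}
```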

Message types

type TextContent = {
  type: 'text';
  text: string;
};

type ImageContentPart = {
  type: 'image_url';
  image_url: {
    url: string; // URL or base64 encoded image data
    detail?: string; // Optional, defaults to "auto"
  };
};

type ContentPart = TextContent | ImageContentPart;

type Message =
  | {
      role: 'user' | 'assistant' | 'system';
      // ContentParts are only for the "user" role:
      content: string | ContentPart[];
      // If "name" is included, it will be prepended like this
      // for non-OpenAI models: `{name}: {content}`
      name?: string;
    }
  | {
      role: 'tool';
      content: string;
      tool_call_id: string;
      name?: string;
    };
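As an illustration of the content-part form, the sketch below builds a user message that combines text with an image. The helper name `userMessageWithImage` and the example URL are hypothetical; the types repeat the relevant definitions above so the snippet stands alone.

```typescript
type TextContent = { type: 'text'; text: string };
type ImageContentPart = {
  type: 'image_url';
  image_url: { url: string; detail?: string };
};
type ContentPart = TextContent | ImageContentPart;
type UserMessage = { role: 'user'; content: string | ContentPart[]; name?: string };

// Hypothetical helper: one text part followed by one image part.
function userMessageWithImage(text: string, imageUrl: string): UserMessage {
  return {
    role: 'user',
    content: [
      { type: 'text', text },
      { type: 'image_url', image_url: { url: imageUrl } }, // detail omitted, so it defaults to "auto"
    ],
  };
}

const example = userMessageWithImage('What is in this picture?', 'https://example.com/cat.png');
```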

Tool types

type FunctionDescription = {
  description?: string;
  name: string;
  parameters: object; // JSON Schema object
};

type Tool = {
  type: 'function';
  function: FunctionDescription;
};

type ToolChoice =
  | 'none'
  | 'auto'
  | 'required'
  | {
      type: 'function';
      function: {
        name: string;
      };
    };
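The sketch below shows one way these types might be filled in, along with a client-side sanity check that a forced `tool_choice` names a tool that was actually declared. The `get_weather` tool and the `toolChoiceIsConsistent` helper are hypothetical and are not part of the ARouter API.

```typescript
type FunctionDescription = { description?: string; name: string; parameters: object };
type Tool = { type: 'function'; function: FunctionDescription };
type ToolChoice =
  | 'none'
  | 'auto'
  | 'required'
  | { type: 'function'; function: { name: string } };

// Hypothetical tool definition; "parameters" is a JSON Schema object.
const getWeather: Tool = {
  type: 'function',
  function: {
    name: 'get_weather',
    description: 'Look up the current weather for a city',
    parameters: {
      type: 'object',
      properties: { city: { type: 'string' } },
      required: ['city'],
    },
  },
};

// Hypothetical check: a named choice must refer to a declared tool, and
// 'required' only makes sense when at least one tool is declared.
function toolChoiceIsConsistent(tools: Tool[], choice: ToolChoice): boolean {
  if (typeof choice === 'string') {
    return choice !== 'required' || tools.length > 0;
  }
  return tools.some((tool) => tool.function.name === choice.function.name);
}
```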

Structured outputs

The response_format parameter forces the model to return a structured JSON response. ARouter supports two modes:

{ type: 'json_object' }: basic JSON mode; the model returns valid JSON
{ type: 'json_schema', json_schema: { ... } }: strict schema mode; the model returns JSON that conforms to the given schema

type ResponseFormat =
  | { type: 'json_object' }
  | {
      type: 'json_schema';
      json_schema: {
        name: string;
        strict?: boolean;
        schema: object; // JSON Schema object
      };
    };
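A request body using strict schema mode might look like the sketch below. The `weather_report` schema, the prompt text, and the `parseReport` helper are hypothetical and serve only as illustration; the sketch assumes the model's reply arrives as a JSON string in the message content.

```typescript
type ResponseFormat =
  | { type: 'json_object' }
  | {
      type: 'json_schema';
      json_schema: { name: string; strict?: boolean; schema: object };
    };

// Hypothetical schema describing the JSON we want back.
const responseFormat: ResponseFormat = {
  type: 'json_schema',
  json_schema: {
    name: 'weather_report',
    strict: true,
    schema: {
      type: 'object',
      properties: { city: { type: 'string' }, temp_c: { type: 'number' } },
      required: ['city', 'temp_c'],
      additionalProperties: false,
    },
  },
};

// Body for POST /v1/chat/completions.
const structuredBody = {
  model: 'openai/gpt-5.4',
  messages: [
    { role: 'system', content: 'Reply only with JSON that matches the provided schema.' },
    { role: 'user', content: 'What is the weather in Seoul?' },
  ],
  response_format: responseFormat,
};

// The reply content is a JSON string, so it still has to be parsed.
function parseReport(content: string): { city: string; temp_c: number } {
  return JSON.parse(content);
}
```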

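See Structured Outputs for detailed usage and examples. Even in JSON mode, you must still instruct the model to respond in JSON in a system or user message.

Optional request headers

ARouter supports the following optional headers for identifying your application:

HTTP-Referer: your app's URL, used for source tracking in the dashboard
X-Title: your app's display name, used in ARouter analytics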

fetch('https://api.arouter.ai/v1/chat/completions', {
  method: 'POST',
  headers: {
    Authorization: 'Bearer lr_live_xxxx',
    'HTTP-Referer': 'https://myapp.com', // Optional
    'X-Title': 'My AI App',             // Optional
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'openai/gpt-5.4',
    messages: [{ role: 'user', content: 'Hello!' }],
  }),
});

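See Request Attribution for details.

Model routing

Specify models in provider/model format. ARouter parses the provider prefix and routes the request to the correct upstream. If the model parameter is omitted, the tenant's configured default is used. See the Model Routing guide for the full routing logic, including ordered candidate model lists with models[] and route.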
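To make the provider/model format concrete, the sketch below splits a model ID at its first slash. This is an assumption about how the prefix could be parsed on the client; ARouter's actual server-side parsing is not specified here, and `splitModelId` is a hypothetical name.

```typescript
// Hypothetical client-side parser for "provider/model" identifiers.
// Assumes the provider is everything before the first "/".
function splitModelId(id: string): { provider: string; model: string } | null {
  const slash = id.indexOf('/');
  // Reject IDs with no slash, an empty provider, or an empty model name.
  if (slash <= 0 || slash === id.length - 1) return null;
  return { provider: id.slice(0, slash), model: id.slice(slash + 1) };
}

// A request body with an ordered candidate list. "route" is omitted
// because its accepted values are described in the Model Routing guide.
const routedBody = {
  models: ['openai/gpt-5.4', 'anthropic/claude-sonnet-4.6'],
  messages: [{ role: 'user', content: 'Hello!' }],
};
```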

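Streaming

Set stream: true to receive the response token by token over Server-Sent Events. In addition to data: payloads, the SSE stream may contain comment lines, which clients must ignore. See the Streaming guide for the SSE format, usage chunks, cancellation, and error handling.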
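The handling of comment lines can be sketched as follows. This is a minimal parser under stated assumptions: the input contains only complete lines (a real client must buffer partial lines between network reads), each `data:` payload is a single JSON object, and the stream ends with a `data: [DONE]` line. `parseSseText` is a hypothetical name, and the `: keep-alive` comment in the usage note is an invented example.

```typescript
// Parses a block of SSE text into JSON payloads.
// SSE comment lines start with ":" and carry no data, so they are skipped.
function parseSseText(text: string): { events: unknown[]; done: boolean } {
  const events: unknown[] = [];
  let done = false;
  for (const rawLine of text.split('\n')) {
    const line = rawLine.replace(/\r$/, ''); // tolerate CRLF line endings
    if (line === '' || line.charAt(0) === ':') continue; // separator or comment
    if (line.slice(0, 5) !== 'data:') continue; // ignore other SSE fields
    const payload = line.slice(5).trim();
    if (payload === '[DONE]') {
      done = true;
      break;
    }
    events.push(JSON.parse(payload));
  }
  return { events, done };
}
```

For example, given the text `: keep-alive`, a blank line, one `data:` line holding a chunk, and `data: [DONE]`, the function returns a single event with `done` set to `true`.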

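Non-standard parameters

If the selected model does not support a request parameter (for example, logit_bias on non-OpenAI models or top_k on OpenAI models), that parameter is silently ignored. The remaining parameters are forwarded to the upstream model API.

Assistant prefill

ARouter supports asking a model to complete a partial response. Include a message with role: "assistant" at the end of the messages array: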

fetch('https://api.arouter.ai/v1/chat/completions', {
  method: 'POST',
  headers: {
    Authorization: 'Bearer lr_live_xxxx',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'openai/gpt-5.4',
    messages: [
      { role: 'user', content: 'What is the meaning of life?' },
      { role: 'assistant', content: "I'm not sure, but my best guess is" },
    ],
  }),
});

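Responses

Response format

ARouter normalizes the schema across all models and providers to conform to the OpenAI Chat API. choices is always an array. If streaming was requested, each choice contains a delta property; otherwise it contains a message property.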

type Response = {
  id: string;
  choices: (NonStreamingChoice | StreamingChoice)[];
  created: number; // Unix timestamp
  model: string;   // The model that was actually used (e.g. "openai/gpt-5.4")
  object: 'chat.completion' | 'chat.completion.chunk';
  provider?: string; // The upstream provider that handled the request

  system_fingerprint?: string; // Only present if the provider supports it

  // Usage data is always returned for non-streaming.
  // When streaming, usage is returned exactly once in the final chunk,
  // before the [DONE] message, with an empty choices array.
  usage?: ResponseUsage;
};

Choice types

type NonStreamingChoice = {
  finish_reason: string | null; // Normalized finish reason
  native_finish_reason: string | null; // Raw finish reason from the provider
  message: {
    content: string | null;
    role: string;
    tool_calls?: ToolCall[];
  };
  error?: ErrorResponse;
};

type StreamingChoice = {
  finish_reason: string | null;
  native_finish_reason: string | null;
  delta: {
    content: string | null;
    role?: string;
    tool_calls?: ToolCall[];
  };
  error?: ErrorResponse;
};

type ToolCall = {
  id: string;
  type: 'function';
  function: {
    name: string;
    arguments: string; // JSON string
  };
};

type ErrorResponse = {
  code: number;
  message: string;
  metadata?: Record<string, unknown>;
};
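Because a choice carries either `message` or `delta`, client code can branch on which property is present. The sketch below uses trimmed copies of the types above (fields such as `native_finish_reason` and `error` are left out for brevity); `choiceText` and `toolArguments` are hypothetical helper names. `toolArguments` assumes the `arguments` string is complete JSON, as in a non-streaming response.

```typescript
type ToolCall = {
  id: string;
  type: 'function';
  function: { name: string; arguments: string };
};
type NonStreamingChoice = {
  finish_reason: string | null;
  message: { content: string | null; role: string; tool_calls?: ToolCall[] };
};
type StreamingChoice = {
  finish_reason: string | null;
  delta: { content: string | null; role?: string; tool_calls?: ToolCall[] };
};

// Returns the text carried by either kind of choice ('' when content is null).
function choiceText(choice: NonStreamingChoice | StreamingChoice): string {
  const part = 'delta' in choice ? choice.delta : choice.message;
  return part.content ?? '';
}

// "arguments" is a JSON string, so it must be parsed before use.
function toolArguments(call: ToolCall): unknown {
  return JSON.parse(call.function.arguments);
}
```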

Usage

type ResponseUsage = {
  /** Including images and tools if any */
  prompt_tokens: number;
  /** The tokens generated */
  completion_tokens: number;
  /** Sum of the above two fields */
  total_tokens: number;

  /** Breakdown of prompt tokens */
  prompt_tokens_details?: {
    cached_tokens: number;        // Tokens read from cache (cache hit)
    cache_write_tokens?: number;  // Tokens written to cache
    audio_tokens?: number;        // Tokens used for input audio
  };

  /** Breakdown of completion tokens */
  completion_tokens_details?: {
    reasoning_tokens?: number;              // Tokens generated for reasoning
    accepted_prediction_tokens?: number;    // Accepted predicted output tokens
    rejected_prediction_tokens?: number;    // Rejected predicted output tokens
    audio_tokens?: number;                  // Tokens generated for audio output
    image_tokens?: number;                  // Tokens generated for image output
  };
};
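As an illustration of how the breakdown fields relate to the totals, the sketch below derives a cache-hit ratio from a usage object and checks that `total_tokens` equals the sum of the other two counters, as the field comment states. `cacheHitRatio` and `totalsAreConsistent` are hypothetical helpers, and the type is a trimmed copy of `ResponseUsage`.

```typescript
type Usage = {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
  prompt_tokens_details?: { cached_tokens: number; cache_write_tokens?: number };
};

// Fraction of prompt tokens that were read from cache (0 when unknown).
function cacheHitRatio(usage: Usage): number {
  const cached = usage.prompt_tokens_details?.cached_tokens ?? 0;
  return usage.prompt_tokens > 0 ? cached / usage.prompt_tokens : 0;
}

// total_tokens is documented as the sum of prompt and completion tokens.
function totalsAreConsistent(usage: Usage): boolean {
  return usage.prompt_tokens + usage.completion_tokens === usage.total_tokens;
}
```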

Example response

{
  "id": "chatcmpl-xxxxxxxxxxxxxx",
  "choices": [
    {
      "finish_reason": "stop",
      "native_finish_reason": "stop",
      "message": {
        "role": "assistant",
        "content": "Hello there!"
      }
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 4,
    "total_tokens": 14,
    "prompt_tokens_details": {
      "cached_tokens": 0
    },
    "completion_tokens_details": {
      "reasoning_tokens": 0
    }
  },
  "model": "openai/gpt-5.4",
  "provider": "openai",
  "object": "chat.completion",
  "created": 1748000000
}

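Finish reasons

ARouter normalizes each model's finish_reason to one of the following values:

Value	Description
`stop`	The model finished naturally
`length`	The `max_tokens` limit was reached
`tool_calls`	The model wants to call a tool
`content_filter`	Content moderation was triggered
`error`	An error occurred during generation

The raw finish reason from the upstream provider is available in native_finish_reason.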
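Since the normalized values form a small closed set, a client can dispatch on them with a switch. The sketch below is one possible mapping; the `NextStep` labels are hypothetical, and treating a `null` finish reason as "still in progress" is an assumption based on the nullable type rather than something this page states.

```typescript
type NextStep =
  | 'done'
  | 'truncated'
  | 'run_tools'
  | 'blocked'
  | 'failed'
  | 'in_progress'
  | 'unknown';

// Maps a normalized finish_reason to a hypothetical client-side action label.
function nextStep(finishReason: string | null): NextStep {
  switch (finishReason) {
    case 'stop':
      return 'done';
    case 'length':
      return 'truncated'; // hit max_tokens; the caller may want a higher limit
    case 'tool_calls':
      return 'run_tools';
    case 'content_filter':
      return 'blocked';
    case 'error':
      return 'failed';
    case null:
      return 'in_progress'; // assumption: no finish reason reported yet
    default:
      return 'unknown'; // value outside the documented set
  }
}
```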

Endpoint groups

OpenAI-compatible

/v1/chat/completions, /v1/embeddings, /v1/models
For use with OpenAI-compatible SDKs. Supports provider/model routing.

Anthropic native

/v1/messages, /v1/messages/batches, /v1/messages/count_tokens
Drop-in compatible with the Anthropic SDK.

Gemini native

/v1beta/models/{model}:generateContent
Drop-in compatible with the Google Gemini SDK.

API key management

/api/v1/keys
Create, list, update, and delete API keys.

Billing

/api/v1/balance, /api/v1/transactions
Query account balance and transaction history.

Provider proxy

/{provider}/{path}
Proxy requests directly to a supported provider.

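Rate limits

Rate limits are applied per API key. The default limits can be customized per key through the dashboard or the management API.

Header	Description
`X-RateLimit-Limit`	Maximum number of requests per window
`X-RateLimit-Remaining`	Number of requests remaining
`X-RateLimit-Reset`	Time at which the window resets (Unix timestamp)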
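A client can use these headers to decide how long to pause before the next request. The sketch below is one way to do that; `readRateLimit` and `waitSeconds` are hypothetical helpers, and the code assumes `X-RateLimit-Reset` is expressed in seconds, which this page does not state explicitly.

```typescript
type HeaderGetter = (name: string) => string | null;

interface RateLimitInfo {
  limit: number;
  remaining: number;
  resetAt: number; // Unix timestamp, assumed to be in seconds
}

// Parses a header value into a finite number, or null if missing or invalid.
function toNumber(value: string | null): number | null {
  if (value === null || value.trim() === '') return null;
  const parsed = Number(value);
  return isFinite(parsed) ? parsed : null;
}

// Reads the three rate-limit headers; returns null if any is absent or malformed.
function readRateLimit(get: HeaderGetter): RateLimitInfo | null {
  const limit = toNumber(get('X-RateLimit-Limit'));
  const remaining = toNumber(get('X-RateLimit-Remaining'));
  const resetAt = toNumber(get('X-RateLimit-Reset'));
  if (limit === null || remaining === null || resetAt === null) return null;
  return { limit, remaining, resetAt };
}

// Seconds to wait before the next request: 0 while quota remains,
// otherwise the time left until the window resets.
function waitSeconds(info: RateLimitInfo, nowSeconds: number): number {
  if (info.remaining > 0) return 0;
  return Math.max(0, info.resetAt - nowSeconds);
}
```

With a `fetch` response, the getter can be written as `(name) => response.headers.get(name)`.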

Error responses

All errors follow a consistent JSON format:

{
  "error": {
    "message": "description of what went wrong",
    "type": "error_type"
  }
}

See the Error Handling guide for the full list of error codes and retry strategies.
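Because every error shares this shape, a response body can be narrowed with a type guard before its fields are read. The sketch below checks only the two fields shown above; `isApiError` is a hypothetical helper name.

```typescript
type ApiError = { error: { message: string; type: string } };

// Type guard: true when the parsed body matches the documented error shape.
function isApiError(body: unknown): body is ApiError {
  if (typeof body !== 'object' || body === null) return false;
  const err = (body as { error?: unknown }).error;
  if (typeof err !== 'object' || err === null) return false;
  const fields = err as { message?: unknown; type?: unknown };
  return typeof fields.message === 'string' && typeof fields.type === 'string';
}
```

After `const body: unknown = await response.json()`, a call to `isApiError(body)` lets the compiler treat `body.error.message` as a string.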
