Usage in Responses
Theusage object is returned in every non-streaming response (and in the final chunk of streaming responses):
Token Fields
| Field | Description |
|---|---|
prompt_tokens | Total input tokens (includes cached tokens) |
completion_tokens | Total output tokens (includes reasoning tokens) |
total_tokens | Sum of prompt + completion tokens |
prompt_tokens_details.cached_tokens | Tokens served from the provider’s prompt cache |
completion_tokens_details.reasoning_tokens | Tokens used for internal reasoning (thinking models) |
completion_tokens_details.accepted_prediction_tokens | Speculative decoding tokens accepted |
completion_tokens_details.rejected_prediction_tokens | Speculative decoding tokens rejected |
Cost Tracking
ARouter bills based on the actual token counts reported by the upstream provider. Pricing is passed through at cost — no inference markup. To calculate exact cost:Streaming Usage
In streaming mode, the final SSE chunk includes the full usage object with emptychoices:
stream_options: { include_usage: true } in your request.
Dashboard Reporting
All usage data is visible in the Activity page with filtering by:- Time period (1 hour → 1 year)
- Grouping (Model, API Key, Creator)