How It Works
ARouter tracks response times, error rates, and availability across all providers in real time. This data drives intelligent routing decisions and helps surface reliability information in your Activity feed. When a provider experiences degraded performance or an outage, ARouter automatically adjusts routing weights to deprioritize that provider — without any change required on your side.What ARouter Monitors
For each provider and model, ARouter continuously tracks:- Success rate: Percentage of requests that complete without error
- Time to first token (TTFT): Latency from request submission to first streaming token
- Total response time: End-to-end latency for non-streaming responses
- Error types: Distinguishes between transient errors (5xx, rate limits) and permanent errors (invalid model, bad request)
Automatic Routing Around Outages
When ARouter detects that a provider is degraded:- The provider’s routing weight is reduced or zeroed temporarily
- Subsequent requests are routed to other healthy providers serving the same model family
- The provider is re-evaluated periodically and reintroduced once health metrics recover
Customizing for Higher Availability
Use Ordered Candidate Model Lists
For critical workloads, specify an ordered list of models. ARouter tries each in sequence until one succeeds:Use Auto Routing
Setmodel: "auto" to let ARouter dynamically select the best available model based on current provider health, cost, and capability:
Use :floor for Cost-Stable Routing
The :floor suffix routes to the lowest-cost provider serving a model, which is often a different provider than the default — providing natural diversity: