service_tier parameter lets you express a preference for how ARouter and the upstream provider should balance cost and latency for a request.
Usage
| Value | Description |
|---|---|
"auto" (default) | Provider picks the appropriate tier based on availability |
"default" | Standard tier — best cost-performance balance |
"flex" | Reduced cost, best-effort latency — ideal for batch workloads |
Provider Support
Service tier is currently supported by:| Provider | Supported values | Notes |
|---|---|---|
| OpenAI | "auto", "default", "flex" | "flex" offers reduced pricing for batch-like workloads |
| Others | Ignored | Passed through but has no effect |
Response
Theservice_tier used is echoed back in the response:
Use Cases
Batch processing (cost-optimized):Related
- Provider Routing — Fine-grained provider selection and throughput preferences
- Latency and Performance — Best practices for low-latency applications