ARouter forwards these parameters to upstream providers as-is. Some parameters are provider-specific (such as top_k for non-OpenAI models). Refer to each provider’s documentation to confirm which parameters are supported. Model selection and routing behavior are documented separately in Model Routing and Provider Routing.
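For orientation, a request body combining several of the parameters described below might look like the following (the model name and parameter values are purely illustrative):

```json
{
  "model": "example/model-name",
  "messages": [
    {"role": "user", "content": "Write a haiku about rivers."}
  ],
  "temperature": 0.7,
  "top_p": 0.9,
  "max_tokens": 256
}
```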

Temperature

  • Key: temperature
  • Optional, float, 0.0 to 2.0
  • Default: 1.0
Influences the variety in the model’s responses. Lower values lead to more predictable and typical responses, while higher values encourage more diverse and less common responses. At 0, the model always gives the same response for a given input.
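As a concrete illustration, here is a minimal pure-Python sketch of how temperature rescales logits before sampling. Real implementations differ by provider; this only shows the core idea, with temperature 0 treated as greedy decoding:

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw logits to probabilities, dividing by temperature first.
    Lower temperature sharpens the distribution; higher flattens it.
    A temperature of 0 is treated as greedy (argmax) decoding."""
    if temperature == 0:
        # Greedy: all probability mass on the most likely token.
        probs = [0.0] * len(logits)
        probs[logits.index(max(logits))] = 1.0
        return probs
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
cool = softmax_with_temperature(logits, 0.2)  # sharper: top token dominates
warm = softmax_with_temperature(logits, 1.5)  # flatter: more variety
```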

Top P

  • Key: top_p
  • Optional, float, 0.0 to 1.0
  • Default: 1.0
Limits the model’s choices to a percentage of likely tokens: only the top tokens whose probabilities add up to P. A lower value makes the model’s responses more predictable, while the default setting allows for a full range of token choices. Think of it like a dynamic Top-K.
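A sketch of the nucleus (Top-P) cutoff, assuming probabilities are already sorted and filtered before sampling (provider implementations vary in detail):

```python
def top_p_filter(probs, top_p):
    """Keep the smallest set of tokens, in descending probability order,
    whose cumulative probability reaches top_p; return their indices."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    return kept

probs = [0.5, 0.25, 0.15, 0.10]
top_p_filter(probs, 0.7)  # -> [0, 1]: 0.5 + 0.25 reaches the 0.7 mass
top_p_filter(probs, 1.0)  # -> [0, 1, 2, 3]: the default keeps everything
```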

Top K

  • Key: top_k
  • Optional, integer, 0 or above
  • Default: 0
Limits the model’s choice of tokens at each step, making it choose from a smaller set. A value of 1 means the model always picks the most likely next token, leading to predictable results. By default this setting is disabled, making the model consider all choices.
top_k is not available for OpenAI models.
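The Top-K cutoff can be sketched in a few lines; note how 0 means "disabled" rather than "keep zero tokens":

```python
def top_k_filter(probs, top_k):
    """Keep only the indices of the top_k most probable tokens.
    top_k == 0 disables the filter (all tokens remain candidates)."""
    if top_k == 0:
        return list(range(len(probs)))
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return order[:top_k]

probs = [0.1, 0.6, 0.3]
top_k_filter(probs, 2)  # -> [1, 2]: the two most likely tokens
top_k_filter(probs, 0)  # -> [0, 1, 2]: filter disabled
```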

Frequency Penalty

  • Key: frequency_penalty
  • Optional, float, -2.0 to 2.0
  • Default: 0.0
Controls the repetition of tokens based on how often they appear in the input. Tokens are penalized proportionally to how frequently they have already occurred, so the penalty grows with each additional occurrence. Negative values instead encourage token reuse.

Presence Penalty

  • Key: presence_penalty
  • Optional, float, -2.0 to 2.0
  • Default: 0.0
Adjusts how often the model repeats tokens already present in the input. Higher values make such repetition less likely, while negative values encourage reuse. Unlike the frequency penalty, this is a flat, one-time penalty per token that does not scale with the number of occurrences.
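The frequency and presence penalties above can be sketched together using the OpenAI-style adjustment, where each token's logit is reduced by its occurrence count times the frequency penalty, plus a flat presence penalty if it appeared at all (actual provider implementations may differ):

```python
def apply_penalties(logits, counts, frequency_penalty, presence_penalty):
    """Adjust logits before sampling: the frequency penalty scales with
    how many times a token has appeared (counts[i]), while the presence
    penalty is a flat one-time deduction for any token seen at least once."""
    adjusted = []
    for logit, count in zip(logits, counts):
        logit -= count * frequency_penalty  # grows with each occurrence
        if count > 0:
            logit -= presence_penalty       # flat, does not scale
        adjusted.append(logit)
    return adjusted

logits = [2.0, 2.0, 2.0]
counts = [0, 1, 3]  # how often each token has already appeared
apply_penalties(logits, counts, frequency_penalty=0.5, presence_penalty=1.0)
# -> [2.0, 0.5, -0.5]
```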

Repetition Penalty

  • Key: repetition_penalty
  • Optional, float, 0.0 to 2.0
  • Default: 1.0
Helps to reduce the repetition of tokens from the input. A higher value makes the model less likely to repeat tokens, but too high a value can make the output less coherent. Token penalty scales based on the original token’s probability.
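One common formulation of the repetition penalty (popularized by the CTRL paper and used in many open-source samplers — the exact formula may vary by provider) divides positive logits of already-seen tokens by the penalty and multiplies negative ones by it, so the penalty's effect scales with the token's original logit:

```python
def apply_repetition_penalty(logits, seen, penalty):
    """For tokens already generated (indices in `seen`), divide positive
    logits by the penalty and multiply negative logits by it, moving
    both toward being less likely. penalty == 1.0 is a no-op."""
    out = list(logits)
    for i in seen:
        out[i] = out[i] / penalty if out[i] > 0 else out[i] * penalty
    return out

apply_repetition_penalty([2.0, -1.0, 3.0], seen={0, 1}, penalty=2.0)
# -> [1.0, -2.0, 3.0]
```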

Min P

  • Key: min_p
  • Optional, float, 0.0 to 1.0
  • Default: 0.0
Represents the minimum probability for a token to be considered, relative to the probability of the most likely token. If min_p is set to 0.1, only tokens that are at least 1/10th as probable as the best possible option are considered.
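The Min-P cutoff described above can be sketched directly, since the threshold is just min_p times the top token's probability:

```python
def min_p_filter(probs, min_p):
    """Keep tokens whose probability is at least min_p times the
    probability of the single most likely token."""
    threshold = min_p * max(probs)
    return [i for i, p in enumerate(probs) if p >= threshold]

probs = [0.6, 0.25, 0.1, 0.05]
min_p_filter(probs, 0.1)  # threshold 0.06 -> keeps [0, 1, 2]
min_p_filter(probs, 0.0)  # default: filter disabled, keeps all
```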

Top A

  • Key: top_a
  • Optional, float, 0.0 to 1.0
  • Default: 0.0
Considers only the top tokens with “sufficiently high” probabilities relative to the probability of the most likely token. Think of it like a dynamic Top-P. A higher Top-A value narrows the candidate set to tokens close in probability to the best option, and the cutoff tightens further when the model is confident; 0 disables the filter.
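The exact Top-A threshold formula may vary by provider; a commonly cited formulation sets the cutoff to top_a times the square of the highest token probability, which is sketched below:

```python
def top_a_filter(probs, top_a):
    """Keep tokens whose probability is at least top_a times the square
    of the highest token probability. Squaring means the cutoff tightens
    sharply when the model is confident in its top choice."""
    threshold = top_a * max(probs) ** 2
    return [i for i, p in enumerate(probs) if p >= threshold]

probs = [0.8, 0.15, 0.05]
top_a_filter(probs, 0.2)  # threshold 0.2 * 0.64 = 0.128 -> keeps [0, 1]
top_a_filter(probs, 0.0)  # default: filter disabled, keeps all
```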

Seed

  • Key: seed
  • Optional, integer
If specified, the inference will sample deterministically, such that repeated requests with the same seed and parameters should return the same result. Determinism is not guaranteed for all models.

Max Tokens

  • Key: max_tokens
  • Optional, integer, 1 or above
Sets the upper limit for the number of tokens the model can generate in response. The maximum value is the context length minus the prompt length.

Logit Bias

  • Key: logit_bias
  • Optional, map
Accepts a JSON object that maps token IDs to an associated bias value from -100 to 100. Mathematically, the bias is added to the logits generated by the model prior to sampling. Values between -1 and 1 decrease or increase likelihood of selection; values like -100 or 100 result in a ban or exclusive selection of the relevant token.
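The bias application itself is simple addition before sampling, as this sketch shows (token IDs here are small illustrative indices, not real vocabulary IDs):

```python
def apply_logit_bias(logits, logit_bias):
    """Add per-token bias values (keyed by token ID) to the raw logits
    before sampling. Biases near -100/+100 effectively ban a token or
    force its selection."""
    return [logit + logit_bias.get(token_id, 0.0)
            for token_id, logit in enumerate(logits)]

logits = [1.0, 2.0, 3.0]
apply_logit_bias(logits, {0: -100, 2: 1.5})  # -> [-99.0, 2.0, 4.5]
```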

Logprobs

  • Key: logprobs
  • Optional, boolean
Whether to return log probabilities of the output tokens. If true, the response includes the log probability of each generated token.

Top Logprobs

  • Key: top_logprobs
  • Optional, integer
An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to true if this parameter is used.

Response Format

  • Key: response_format
  • Optional, object
Forces the model to produce specific output format. Setting to { "type": "json_object" } enables JSON mode, which guarantees the message the model generates is valid JSON. For strict schema validation, use { "type": "json_schema", "json_schema": { ... } }. When using { "type": "json_object" }, you should still instruct the model to respond with JSON in your prompt. See Structured Outputs for detailed usage and examples.
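A request fragment using strict schema validation might look like the following (the field names follow OpenAI's structured-output request shape; the schema itself is purely illustrative):

```json
{
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "weather",
      "strict": true,
      "schema": {
        "type": "object",
        "properties": {
          "city": {"type": "string"},
          "temperature_c": {"type": "number"}
        },
        "required": ["city", "temperature_c"]
      }
    }
  }
}
```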

Stop

  • Key: stop
  • Optional, string or array
Stops generation immediately when the model produces any of the specified stop sequences. Accepts a single string or an array of strings.

Tools

  • Key: tools
  • Optional, array
Tool calling parameter, following OpenAI’s tool calling request shape. For providers with non-OpenAI interfaces, ARouter transforms the tools accordingly. See Tool Calling for detailed usage and examples.

Tool Choice

  • Key: tool_choice
  • Optional, string or object
Controls which (if any) tool is called by the model:
  • "none": The model will not call any tool and instead generates a message
  • "auto": The model can pick between generating a message or calling one or more tools
  • "required": The model must call one or more tools
  • {"type": "function", "function": {"name": "my_function"}}: Forces the model to call that specific tool
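Putting the two parameters together, a request fragment that defines a tool and forces the model to call it might look like this (the get_weather tool is a hypothetical example):

```json
{
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
          "type": "object",
          "properties": {"city": {"type": "string"}},
          "required": ["city"]
        }
      }
    }
  ],
  "tool_choice": {"type": "function", "function": {"name": "get_weather"}}
}
```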

Parallel Tool Calls

  • Key: parallel_tool_calls
  • Optional, boolean
  • Default: true
Whether to enable parallel function calling during tool use. If true, the model can call multiple functions simultaneously. If false, functions are called sequentially. Only applies when tools are provided.

Prediction

  • Key: prediction
  • Optional, object
Reduce latency by providing the model with a predicted output. Useful when you know most of the response content in advance.
{
  "prediction": {
    "type": "content",
    "content": "The predicted content here..."
  }
}
Accepted prediction tokens are reflected in completion_tokens_details.accepted_prediction_tokens in the response usage.