[Feature Request] Multi-tier Dynamic Model Routing Based on Token Thresholds

### Feature request

I propose adding native, multi-tier dynamic model routing to PR-Agent, driven by the token size of the pull request.
Currently, PR-Agent routes all requests to a single primary model. The proposed solution would let users define a list of routing rules (token thresholds) in the `configuration.toml` file. Once PR-Agent has calculated the token count of the PR diff, it would evaluate these rules and select the most appropriate model before sending the prompt.
Example configuration approach:

```toml
[config]
# Default/Fallback heavy model for the largest PRs
model = "anthropic/claude-3.5-sonnet"

enable_dynamic_routing = true

# A list of thresholds to route smaller PRs to cheaper/faster models
# Evaluated in ascending order of max_tokens
[[config.routing_rules]]
max_tokens = 1000
model = "gemini/gemini-2.5-flash"

[[config.routing_rules]]
max_tokens = 5000
model = "openai/gpt-4o-mini"
```
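
To make the "evaluated in ascending order" note concrete, here is a minimal sketch of how the rule tables might be validated and sorted once they have been loaded. It assumes the loader hands them over as a list of plain dicts with `max_tokens` and `model` keys; `RoutingRule` and `normalize_routing_rules` are hypothetical names for illustration, not existing PR-Agent code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoutingRule:
    """Hypothetical in-memory form of one [[config.routing_rules]] table."""
    max_tokens: int  # inclusive upper bound for this tier
    model: str


def normalize_routing_rules(raw_rules):
    """Validate rule dicts and return them sorted by ascending max_tokens."""
    rules = []
    seen_thresholds = set()
    for raw in raw_rules:
        max_tokens = raw.get("max_tokens")
        model = raw.get("model")
        # bool is a subclass of int in Python, so reject it explicitly.
        if (isinstance(max_tokens, bool)
                or not isinstance(max_tokens, int)
                or max_tokens <= 0):
            raise ValueError(
                f"max_tokens must be a positive integer, got {max_tokens!r}")
        if not isinstance(model, str) or not model.strip():
            raise ValueError(f"model must be a non-empty string, got {model!r}")
        # Two rules with the same threshold would make the choice ambiguous.
        if max_tokens in seen_thresholds:
            raise ValueError(f"duplicate max_tokens threshold: {max_tokens}")
        seen_thresholds.add(max_tokens)
        rules.append(RoutingRule(max_tokens, model))
    # Sorting here means users may list the rules in any order.
    return sorted(rules, key=lambda rule: rule.max_tokens)
```

Sorting at load time is one possible design choice: it keeps the evaluation order independent of how the user happens to order the tables in the file.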

Workflow logic: If the calculated token count is `<= 1000`, use `gemini-2.5-flash`. If it is `> 1000` but `<= 5000`, use `gpt-4o-mini`. If it exceeds `5000`, fall back to the main `model` (`claude-3.5-sonnet`).
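
The workflow above could be sketched roughly as follows. This is only an illustration of the intended selection logic: `select_model` and its parameters are hypothetical names, and the assumption that `enable_dynamic_routing = false` simply returns the default `model` is mine, not something PR-Agent implements today.

```python
def select_model(token_count, routing_rules, default_model, enabled=True):
    """Return the model of the first tier whose max_tokens covers token_count.

    routing_rules is a list of dicts with "max_tokens" and "model" keys.
    Falls back to default_model when routing is disabled or no tier matches.
    """
    if not enabled:
        return default_model
    # Evaluate tiers in ascending order of max_tokens, as the config comment says.
    for rule in sorted(routing_rules, key=lambda r: r["max_tokens"]):
        if token_count <= rule["max_tokens"]:  # thresholds are inclusive
            return rule["model"]
    # Larger than every threshold: use the main (heavy) model.
    return default_model


# Values mirror the example configuration in this issue.
RULES = [
    {"max_tokens": 1000, "model": "gemini/gemini-2.5-flash"},
    {"max_tokens": 5000, "model": "openai/gpt-4o-mini"},
]
DEFAULT_MODEL = "anthropic/claude-3.5-sonnet"

if __name__ == "__main__":
    for tokens in (800, 1000, 1001, 5000, 5001):
        print(tokens, select_model(tokens, RULES, DEFAULT_MODEL))
```

Because each threshold is inclusive, a PR of exactly 1000 tokens goes to the first tier and 1001 tokens moves to the second.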

### Motivation

- The Cost Problem: Using heavy models (GPT-4o, Claude 3.5) for trivial 10-line PRs wastes API credits, while cheap models can fall short on complex architectural changes.
- The Overhead: The current workaround (deploying a standalone LiteLLM Proxy) introduces unnecessary infrastructure complexity for self-hosted users.
- The Solution: Native threshold-based routing optimizes API budgets by automatically sending trivial updates to cost-effective models and reserving expensive models for changes that need deep reasoning.
