[Feature Request] Multi-tier Dynamic Model Routing Based on Token Thresholds #2434

@arsalanyavari

Feature request

I would like to propose adding a native, multi-tier dynamic model routing feature to PR-Agent, based on the token count of the pull request diff.
Currently, PR-Agent routes all requests to a single primary model. The proposed solution would let users define a list of routing rules (thresholds) in the configuration.toml file. Once PR-Agent has counted the tokens in the PR diff, it would evaluate these rules to select the most appropriate model before sending the prompt.
Example configuration approach:

[config]
# Default/Fallback heavy model for the largest PRs
model = "anthropic/claude-3.5-sonnet"

enable_dynamic_routing = true

# A list of thresholds to route smaller PRs to cheaper/faster models
# Evaluated in ascending order of max_tokens
[[config.routing_rules]]
max_tokens = 1000
model = "gemini/gemini-2.5-flash"

[[config.routing_rules]]
max_tokens = 5000
model = "openai/gpt-4o-mini"

Workflow logic: If the calculated tokens are <= 1000, use gemini-2.5-flash. If it's > 1000 but <= 5000, use gpt-4o-mini. If it exceeds 5000, fall back to the main model (claude-3.5-sonnet).
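
To make the rule evaluation concrete, here is a minimal Python sketch of the selection step. It is only an illustration of the proposal, not existing PR-Agent code: `RoutingRule`, `select_model`, `RULES`, and `DEFAULT_MODEL` are hypothetical names, and the sketch assumes the token count has already been computed elsewhere.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class RoutingRule:
    """One [[config.routing_rules]] entry (hypothetical structure)."""
    max_tokens: int
    model: str


def select_model(token_count: int, rules: Iterable[RoutingRule], default_model: str) -> str:
    """Return the model of the first rule whose max_tokens covers token_count.

    Rules are sorted by max_tokens so the result does not depend on the
    order in which they appear in the config. If no rule matches, the
    default (heavy) model is used.
    """
    for rule in sorted(rules, key=lambda r: r.max_tokens):
        if token_count <= rule.max_tokens:
            return rule.model
    return default_model


# Values taken from the example configuration above, deliberately listed
# out of order to show that sorting handles it.
RULES = [
    RoutingRule(max_tokens=5000, model="openai/gpt-4o-mini"),
    RoutingRule(max_tokens=1000, model="gemini/gemini-2.5-flash"),
]
DEFAULT_MODEL = "anthropic/claude-3.5-sonnet"

if __name__ == "__main__":
    for tokens in (800, 1000, 1001, 5000, 5001):
        print(tokens, "->", select_model(tokens, RULES, DEFAULT_MODEL))
```

Sorting inside the function is one possible design choice; it keeps the "ascending order of max_tokens" guarantee even if a user writes the rules in a different order. With `enable_dynamic_routing = false` or an empty rule list, the same function simply returns the default model.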

Motivation

  • The Cost Problem: Using heavy models (GPT-4o, Claude 3.5) for trivial 10-line PRs wastes API credits, while cheaper models can fail on complex architectural changes.
  • The Overhead: The current workaround (deploying a standalone LiteLLM Proxy) introduces unnecessary infrastructure complexity for self-hosted users.
  • The Solution: Native threshold-based routing optimizes API budgets by automatically handling trivial updates with cost-effective models, reserving expensive tokens only for deep reasoning.
