Agentlify
Build and optimize LLM agents with intelligent model routing
July 2025
- 50% average overall cost reduction
- ~5% approximate drop in eval score
- Multi-armed bandit routing: RL-based model selection
Problem
Building LLM-powered agents is expensive and complex. Teams often default to the most capable (and costly) model for every step, leading to ballooning costs without proportional quality gains. There's no intelligent way to route different agentic steps to the right model based on complexity and requirements.
Approach
1. Context distillation to extract key information from each agentic step
2. Skills retrieval into context to improve agent accuracy
3. GMM clustering to categorize task complexity
4. Multi-armed bandit RL with Bayesian clustering for model selection
5. Custom evaluation upload to guide routing decisions
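Steps 3 and 4 can be sketched with a per-tier Thompson-sampling bandit. This is a minimal illustration, not Agentlify's actual implementation: the model names, per-step costs, and the Bernoulli "step succeeded" reward are all assumptions, since the write-up does not specify the reward signal.

```python
import random

# Hypothetical model pool with illustrative per-step costs (assumption).
MODELS = {"small": 0.10, "medium": 0.45, "large": 1.00}

class ThompsonRouter:
    """One Beta-Bernoulli bandit; in the full system there would be one
    router per complexity tier produced by the GMM clustering step."""

    def __init__(self, models):
        # Beta(1, 1) uniform prior over each model's success rate.
        self.ab = {m: [1, 1] for m in models}

    def pick(self):
        # Thompson sampling: draw once from each posterior, route the
        # step to the model with the highest draw.
        return max(self.ab, key=lambda m: random.betavariate(*self.ab[m]))

    def update(self, model, success):
        # Bayesian update: increment alpha on success, beta on failure.
        self.ab[model][0 if success else 1] += 1

random.seed(0)
router = ThompsonRouter(MODELS)
# Simulated true success rates for one tier (assumed numbers).
true_p = {"small": 0.60, "medium": 0.75, "large": 0.95}
for _ in range(2000):
    m = router.pick()
    router.update(m, random.random() < true_p[m])
```

In practice the reward would need to blend quality with cost (e.g. penalizing expensive models) for the router to prefer cheaper models on simple tiers; this sketch optimizes success rate only.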
System Pipeline
Context Distillation → Skills Retrieval → Routing (Bandit RL) → Evaluation Upload
Evaluation
Testing showed a 50% average overall cost reduction while agent performance dropped by approximately 5%. The routing system learns to assign simpler steps to cheaper models while reserving powerful models for complex reasoning tasks.
Learnings
- Bayesian approaches outperform simple epsilon-greedy strategies for model routing
- Context distillation is critical for accurate complexity estimation
- Custom eval metrics allow domain-specific optimization beyond generic benchmarks
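The second learning, that distilled context drives complexity estimation, can be illustrated with a toy GMM over per-step features. The features here (distilled-context length and tool-call count) are assumptions for illustration; the real feature set is not documented.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic per-step features: [distilled context length, tool calls].
rng = np.random.default_rng(0)
simple_steps = rng.normal(loc=[50, 1], scale=[10, 0.5], size=(100, 2))
complex_steps = rng.normal(loc=[400, 6], scale=[40, 1.0], size=(100, 2))
X = np.vstack([simple_steps, complex_steps])

# Fit a 2-component GMM; each component becomes a complexity tier.
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
labels = gmm.predict(X)
```

At inference time, each new step's distilled features would be passed through `gmm.predict` to pick a tier, which in turn selects the bandit that routes the step to a model.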
Python · React · LLMs · RL · GMM · Bayesian