Agentlify
Build and optimize LLM agents with intelligent model routing
July 2025
- 50% average overall cost reduction
- ~5% approximate drop in eval score
- Multi-armed bandit routing: RL-based model selection
Problem
Building LLM-powered agents is expensive and complex. Teams often default to the most capable (and costly) model for every step, leading to ballooning costs without proportional quality gains. There's no intelligent way to route different agentic steps to the right model based on complexity and requirements.
Approach
1. Context distillation to extract key information from each agentic step
2. Skills retrieval into context to improve agent accuracy
3. GMM clustering to categorize task complexity
4. Multi-armed bandit RL with Bayesian clustering for model selection
5. Custom evaluation upload to guide routing decisions
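Steps 3 and 4 can be sketched with a per-tier Thompson-sampling bandit. This is a minimal illustration, not Agentlify's actual implementation: the model names, per-step costs, and the Bernoulli "step succeeded" reward are all assumptions, since the write-up does not specify the reward signal.

```python
import random

# Hypothetical model pool with illustrative per-step costs (assumption).
MODELS = {"small": 0.10, "medium": 0.45, "large": 1.00}

class ThompsonRouter:
    """One Beta-Bernoulli bandit; in the full system there would be one
    router per complexity tier produced by the GMM clustering step."""

    def __init__(self, models):
        # Beta(1, 1) uniform prior over each model's success rate.
        self.ab = {m: [1, 1] for m in models}

    def pick(self):
        # Thompson sampling: draw once from each posterior, route the
        # step to the model with the highest draw.
        return max(self.ab, key=lambda m: random.betavariate(*self.ab[m]))

    def update(self, model, success):
        # Bayesian update: increment alpha on success, beta on failure.
        self.ab[model][0 if success else 1] += 1

random.seed(0)
router = ThompsonRouter(MODELS)
# Simulated true success rates for one tier (assumed numbers).
true_p = {"small": 0.60, "medium": 0.75, "large": 0.95}
for _ in range(2000):
    m = router.pick()
    router.update(m, random.random() < true_p[m])
```

In practice the reward would need to blend quality with cost (e.g. penalizing expensive models) for the router to prefer cheaper models on simple tiers; this sketch optimizes success rate only.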
System Pipeline
Context Distillation → Skills Retrieval → Routing (Bandit RL) → Evaluation Upload
Evaluation
Testing showed a 50% average overall cost reduction while agent performance dropped by approximately 5%. The routing system learns to assign simpler steps to cheaper models while reserving powerful models for complex reasoning tasks.
Learnings
- Bayesian approaches outperform simple epsilon-greedy strategies for model routing
- Context distillation is critical for accurate complexity estimation
- Custom eval metrics allow domain-specific optimization beyond generic benchmarks
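The second learning, that distilled context drives complexity estimation, can be illustrated with a toy GMM over per-step features. The features here (distilled-context length and tool-call count) are assumptions for illustration; the real feature set is not documented.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic per-step features: [distilled context length, tool calls].
rng = np.random.default_rng(0)
simple_steps = rng.normal(loc=[50, 1], scale=[10, 0.5], size=(100, 2))
complex_steps = rng.normal(loc=[400, 6], scale=[40, 1.0], size=(100, 2))
X = np.vstack([simple_steps, complex_steps])

# Fit a 2-component GMM; each component becomes a complexity tier.
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
labels = gmm.predict(X)
```

At inference time, each new step's distilled features would be passed through `gmm.predict` to pick a tier, which in turn selects the bandit that routes the step to a model.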
Python · React · LLMs · RL · GMM · Bayesian