Agentlify

Build and optimize LLM agents with intelligent model routing

July 2025
  • 50% average overall cost reduction
  • ~5% approximate drop in eval score
  • Multi-armed bandit routing: RL-based model selection

Problem

Building LLM-powered agents is expensive and complex. Teams often default to the most capable (and costly) model for every step, leading to ballooning costs without proportional quality gains. There's no intelligent way to route different agentic steps to the right model based on complexity and requirements.

Approach

  1. Context distillation to extract key information from each agentic step
  2. Skills retrieval into context to improve agent accuracy
  3. GMM clustering to categorize task complexity
  4. Multi-armed bandit RL with Bayesian clustering for model selection
  5. Custom evaluation upload to guide routing decisions
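Step 3 can be sketched with a Gaussian mixture model over per-step features. This is a minimal illustration, not the Agentlify implementation: the feature choices (prompt length, tool calls, reasoning depth) and the two-tier split are assumptions, and the data is synthetic.

```python
# Hypothetical sketch: GMM clustering of agentic steps into complexity tiers.
# Features and tier count are illustrative assumptions, not from Agentlify.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Toy features per step: [prompt length, tool calls, reasoning depth]
simple_steps = rng.normal([200, 1, 1], [50, 0.5, 0.5], size=(100, 3))
complex_steps = rng.normal([2000, 5, 8], [300, 1.0, 2.0], size=(100, 3))
X = np.vstack([simple_steps, complex_steps])

gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
labels = gmm.predict(X)

# Treat the cluster with the larger mean prompt length as the "complex" tier,
# and route steps in that tier to a more capable model.
complex_tier = int(np.argmax(gmm.means_[:, 0]))
print("complex tier id:", complex_tier)
```

New steps can then be assigned a tier with `gmm.predict`, giving the router a discrete complexity signal.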

System Pipeline

Context Distillation → Skills Retrieval → Routing (Bandit RL) → Evaluation Upload
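The pipeline stages above compose naturally as a chain of step transforms. A minimal sketch, with stand-in stage bodies (the truncation length, skill snippet, and length-based routing threshold are all placeholder assumptions):

```python
# Hypothetical pipeline sketch; each stage body is a stub, not real logic.
from dataclasses import dataclass

@dataclass
class Step:
    prompt: str
    context: str = ""
    model: str = ""

def distill_context(step):
    # Stage 1: keep only the key information (stubbed as truncation)
    step.context = step.prompt[:100]
    return step

def retrieve_skills(step):
    # Stage 2: prepend relevant skill snippets to the context
    step.context = "skill: summarize\n" + step.context
    return step

def route(step):
    # Stage 3: pick a model (stubbed; the real system uses a bandit)
    step.model = "small-model" if len(step.prompt) < 500 else "large-model"
    return step

PIPELINE = [distill_context, retrieve_skills, route]

def run(step):
    for stage in PIPELINE:
        step = stage(step)
    return step

result = run(Step(prompt="Translate 'hello' to French"))
print(result.model)  # prints "small-model" for this short prompt
```

Keeping each stage as a pure step-to-step function makes it easy to swap the stubbed router for the bandit described below.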

Evaluation

Testing showed a 50% average overall cost reduction while agent performance dropped by approximately 5%. The routing system learns to assign simpler steps to cheaper models while reserving powerful models for complex reasoning tasks.

Learnings

  • Bayesian approaches outperform simple epsilon-greedy strategies for model routing
  • Context distillation is critical for accurate complexity estimation
  • Custom eval metrics allow domain-specific optimization beyond generic benchmarks
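The first learning can be illustrated with Thompson sampling, a standard Bayesian bandit strategy. This sketch is not the Agentlify router: the two model names, their success rates, and the binary reward signal are made up for illustration.

```python
# Hypothetical Thompson-sampling sketch for choosing between two models.
# True success rates are unknown to the bandit; it learns them from rewards.
import random

random.seed(0)
models = {"cheap": 0.6, "premium": 0.9}   # true (hidden) success rates
posterior = {m: [1, 1] for m in models}   # Beta(alpha, beta) prior per model

def choose():
    # Sample a plausible success rate from each posterior, pick the best
    draws = {m: random.betavariate(a, b) for m, (a, b) in posterior.items()}
    return max(draws, key=draws.get)

for _ in range(2000):
    m = choose()
    reward = random.random() < models[m]
    posterior[m][0 if reward else 1] += 1  # update alpha on success, beta on failure

pulls = {m: sum(ab) - 2 for m, ab in posterior.items()}
print(pulls)  # the higher-reward model accumulates most of the pulls
```

Unlike epsilon-greedy, which explores at a fixed rate forever, the posterior here narrows as evidence accumulates, so exploration fades out on its own.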
Python · React · LLMs · RL · GMM · Bayesian