The Pulse: a new trend, smart model routing
A new trend in AI engineering is smart model routing: a router picks a model for each task, with the goal of reducing AI spend. Offerings in this space include Factory Router, Not Diamond, and Vercel AI Gateway, and the cost savings claimed range from 20% to 30%. These routers select a model automatically, weighing factors such as cost, latency, and availability. For engineers building AI systems, the practical implication is that model selection becomes a cost lever that a routing layer can handle.
⚡ Key Takeaways
- Factory Router claims 20-25% cost savings by automatically selecting the right model per session.
- Not Diamond offers auto-selection of coding models, claiming around 30% cost savings.
- Vercel AI Gateway provides smart routing and billing for hundreds of AI models in one place.
- OpenRouter uses Not Diamond under the hood for auto-routing functionality.
- Requestly.ai automatically routes requests to the right model based on cost, latency, and availability.
For engineers building AI systems, the appeal is lower infrastructure spend without manually choosing a model for every request. The percentages quoted are the vendors' own figures, so they are worth checking against your own workloads before you rely on them.
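To make the idea concrete, here is a minimal sketch of a router that weighs cost, latency, and availability. It does not reproduce how any of the vendors above implement routing. The model names, prices, latencies, and the `capability` tier are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ModelOption:
    name: str                   # hypothetical model identifier
    cost_per_1k_tokens: float   # USD, illustrative number only
    p50_latency_ms: float       # illustrative number only
    available: bool             # e.g. result of a health check
    capability: int             # hypothetical tier: 1 (basic) to 3 (strongest)


def pick_model(
    options: Sequence[ModelOption],
    min_capability: int,
    max_latency_ms: float,
) -> Optional[ModelOption]:
    """Return the cheapest available model that meets the task's needs."""
    candidates = [
        m for m in options
        if m.available
        and m.capability >= min_capability
        and m.p50_latency_ms <= max_latency_ms
    ]
    if not candidates:
        return None  # caller decides on a fallback
    # Cheapest first; latency breaks ties between equally priced models.
    return min(candidates, key=lambda m: (m.cost_per_1k_tokens, m.p50_latency_ms))


# Hypothetical catalog, not real pricing.
CATALOG = [
    ModelOption("small-model", 0.0005, 300, True, 1),
    ModelOption("medium-model", 0.003, 600, True, 2),
    ModelOption("large-model", 0.015, 1200, True, 3),
]

simple_task = pick_model(CATALOG, min_capability=1, max_latency_ms=1000)
harder_task = pick_model(CATALOG, min_capability=2, max_latency_ms=2000)
```

In this sketch a simple task is routed to the cheapest model, and a harder task is routed to the cheapest model that clears the capability bar. Real routers presumably use richer signals to judge task difficulty. The hard-coded tier here only stands in for that step.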
✅ Practical Steps
- Evaluate the cost savings potential of smart model routing solutions such as Factory Router, Not Diamond, and Vercel AI Gateway against your own workloads.
- Consider integrating OpenRouter or Requestly.ai into your AI infrastructure to leverage auto-routing functionality.
- Explore the routing configuration options offered by Envoy AI Gateway and LiteLLM to optimize model selection.
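One way to approach the evaluation step is to replay a sample of logged requests and compare what they cost on a single default model with what they would cost under the model a router picks. The sketch below assumes you have a token count per request and the routed model's name. The function name, prices, and request data are hypothetical.

```python
from typing import Dict, List, Tuple


def estimate_savings(
    requests: List[Tuple[int, str]],          # (token_count, routed_model_name)
    baseline_price_per_1k: float,             # price if everything used one model
    routed_prices_per_1k: Dict[str, float],   # price per model the router may pick
) -> Tuple[float, float, float]:
    """Return (baseline_cost, routed_cost, fractional_savings)."""
    baseline_cost = sum(tokens / 1000 * baseline_price_per_1k for tokens, _ in requests)
    routed_cost = sum(
        tokens / 1000 * routed_prices_per_1k[model] for tokens, model in requests
    )
    if baseline_cost == 0:
        return baseline_cost, routed_cost, 0.0
    return baseline_cost, routed_cost, (baseline_cost - routed_cost) / baseline_cost


# Hypothetical sample: three requests stay on the large model, one is routed down.
sample = [(1000, "large"), (1000, "large"), (1000, "large"), (1000, "small")]
prices = {"large": 0.01, "small": 0.002}  # illustrative prices only

baseline, routed, savings = estimate_savings(sample, 0.01, prices)
print(f"baseline=${baseline:.4f} routed=${routed:.4f} savings={savings:.0%}")
```

A cost comparison like this says nothing about output quality, so it is worth pairing with a check that the cheaper model's answers are still acceptable for the requests routed to it.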
Want the full story? Read the original article.
Read on Pragmatic Engineer ↗