VentureBeat AI

How Shopify built an AI stack that doesn't care which models survive

June 24, 2026 • 4 min read

Level: Advanced

For: AI Engineers

✦TL;DR

Shopify has built an LLM proxy that gives engineers access to multiple AI providers with automatic failover, so workflows keep running when a model is shut down or updated. The proxy also provides reporting. Alongside it, the company has adopted a distillation strategy that uses smaller language models (SLMs) to improve performance and reduce costs. In some cases these SLMs have proven 2x cheaper and faster, and in more extreme cases up to 30x cheaper and faster. For engineers building AI systems, the approach offers more flexibility and resilience as the AI landscape changes.
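To put the 2x and 30x figures in more familiar terms, the arithmetic below converts a "times cheaper" factor into a fractional saving. This is a minimal sketch: the function names and the baseline spend are hypothetical and chosen only for illustration; the article reports the factors but no absolute costs.

```python
def savings_fraction(factor: float) -> float:
    """Fraction of cost saved when something becomes `factor` times cheaper."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    return 1.0 - 1.0 / factor


def cost_after(baseline_cost: float, factor: float) -> float:
    """New cost after a `factor`-times reduction from `baseline_cost`."""
    return baseline_cost / factor


# Hypothetical baseline spend, not a figure from the article.
baseline = 1200.0

for factor in (2, 30):
    saved = savings_fraction(factor)
    print(f"{factor}x cheaper: {saved:.1%} saved, new cost {cost_after(baseline, factor):.2f}")
```

The takeaway is that a 2x reduction halves the bill, while a 30x reduction removes roughly 97% of it, so the two figures describe very different regimes.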

⚡ Key Takeaways

Shopify's LLM proxy provides automatic failover to alternative models, such as Claude Opus or GPT 5.5, in the event of a model shutdown or update.
The company uses distillation to create smaller language models (SLMs) that can be a better fit than generalized, off-the-shelf models in certain circumstances.
SLMs can be 2x cheaper and faster than more generalized models, and in extreme cases up to 30x cheaper and faster.
Shopify's internal platform, Tangle, lets engineers visualize the pipeline and deploy fine-tuned models without requiring approval.
The company gives engineers access to several harnesses, such as Claude Code, Codex, and GitHub Copilot, so they can choose the best tool for their workflow.
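The failover behavior described in the first takeaway can be sketched as an ordered list of providers tried in turn. This is an illustrative sketch only, not Shopify's implementation: `Provider`, `FailoverProxy`, `ProviderError`, and the two stub model functions are hypothetical names, and a real proxy would call actual provider SDKs rather than local stubs.

```python
from dataclasses import dataclass, field
from typing import Callable, List


class ProviderError(Exception):
    """Raised by a provider callable when its model is unavailable."""


@dataclass
class Provider:
    name: str                    # hypothetical label, e.g. "primary"
    call: Callable[[str], str]   # takes a prompt, returns a completion


@dataclass
class FailoverProxy:
    providers: List[Provider]
    log: List[str] = field(default_factory=list)  # minimal reporting hook

    def complete(self, prompt: str) -> str:
        """Try each provider in order and return the first successful result."""
        for provider in self.providers:
            try:
                result = provider.call(prompt)
            except ProviderError as exc:
                self.log.append(f"{provider.name}: failed ({exc})")
                continue
            self.log.append(f"{provider.name}: ok")
            return result
        raise ProviderError(f"all {len(self.providers)} providers failed")


# Stub providers standing in for real model endpoints (assumptions).
def retired_model(prompt: str) -> str:
    raise ProviderError("model retired")


def backup_model(prompt: str) -> str:
    return f"backup answer to: {prompt}"


proxy = FailoverProxy([
    Provider("primary", retired_model),
    Provider("backup", backup_model),
])
answer = proxy.complete("Summarize this order")
print(answer)
print(proxy.log)
```

Callers only ever talk to `proxy.complete`, so swapping or reordering providers does not touch application code. The `log` list is a stand-in for the reporting the article attributes to the proxy.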

💡 Why It Matters

Shopify's approach matters to engineers building production AI systems because it shows how much flexibility and resilience count when the AI landscape keeps changing. With an LLM proxy and a distillation strategy, engineers can keep their workflows running through model shutdowns or updates, and can take advantage of smaller, more specialized models.

✅ Practical Steps

Implement an LLM proxy that fails over automatically to alternative models when a model is shut down or updated.
Use distillation to create smaller language models (SLMs) where they are a better fit than generalized, off-the-shelf models.
Provide an internal platform, as Shopify does with Tangle, where engineers can visualize pipelines and deploy fine-tuned models without requiring approval.
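A common first step in distillation is to collect a larger "teacher" model's outputs as training data for a smaller "student" model. The sketch below shows only that data-collection step, under stated assumptions: `build_distillation_set`, `to_jsonl`, and `toy_teacher` are hypothetical names, the teacher is a local stub rather than a real model call, and the article does not describe how Shopify builds its datasets.

```python
import json
from typing import Callable, Iterable, List


def build_distillation_set(
    prompts: Iterable[str],
    teacher: Callable[[str], str],
) -> List[dict]:
    """Label each unique prompt with the teacher's output."""
    records = []
    seen = set()
    for prompt in prompts:
        if prompt in seen:       # skip duplicate prompts
            continue
        seen.add(prompt)
        completion = teacher(prompt).strip()
        if not completion:       # drop empty teacher outputs
            continue
        records.append({"prompt": prompt, "completion": completion})
    return records


def to_jsonl(records: List[dict]) -> str:
    """Serialize records as JSON Lines, one training example per line."""
    return "\n".join(json.dumps(record, sort_keys=True) for record in records)


# Stub teacher standing in for a large general-purpose model (assumption).
def toy_teacher(prompt: str) -> str:
    return "refund" if "refund" in prompt.lower() else "other"


records = build_distillation_set(
    ["I want a refund", "Where is my order?", "I want a refund"],
    toy_teacher,
)
jsonl = to_jsonl(records)
print(jsonl)
```

The resulting JSON Lines text is the kind of input a fine-tuning job for a smaller model would consume; the fine-tuning step itself depends on the training stack and is left out here.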

