Context Engineering for RAG: The Four Typed Inputs Behind Every RAG Answer
Context Engineering for Retrieval-Augmented Generation (RAG) treats every answer as the product of four typed inputs that converge on a single Large Language Model (LLM) call. The practice, named by Tobi Lütke and Andrej Karpathy in 2025, is presented as a foundation for effective document intelligence. For a single document, each component of the system emits typed pieces, and together those pieces form the context from which the model generates its response. For engineers building AI systems, the practical implication is that making these typed inputs explicit helps produce more accurate and informative outputs.
⚡ Key Takeaways
- Four typed inputs sit behind every RAG answer; this summary does not name the individual types.
- Context Engineering means each component emits typed pieces for a single document.
- The approach converges on a single LLM call, which is where the typed inputs are combined.
- Corpus, conversation, and tool extensions are left as potential follow-up work.
For engineers shipping production AI today, the value of Context Engineering lies in knowing what actually reaches the model: understanding the four typed inputs behind every RAG answer informs the design of more effective RAG systems.
✅ Practical Steps
- Apply these concepts to your own system design by considering the role typed inputs play in your RAG pipeline.
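As a starting point for that design exercise, here is a minimal Python sketch of typed pieces converging on one LLM call. It is an illustration under stated assumptions, not the original article's implementation: this summary does not name the four input types, so the labels `type_a` through `type_d` are placeholders, and `TypedPiece`, `assemble_context`, and `answer` are hypothetical names. The `llm` argument is any callable that maps a prompt string to a response string.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class TypedPiece:
    """One piece of context emitted by a component, tagged with its input type."""
    kind: str  # placeholder label; the source does not name the four types
    text: str


# Hypothetical labels standing in for the four typed inputs.
EXPECTED_KINDS = ("type_a", "type_b", "type_c", "type_d")


def assemble_context(pieces: list[TypedPiece]) -> str:
    """Group pieces by kind, in a fixed order, into a single prompt string."""
    present = {p.kind for p in pieces}
    missing = [k for k in EXPECTED_KINDS if k not in present]
    if missing:
        raise ValueError(f"missing typed inputs: {missing}")

    sections = []
    for kind in EXPECTED_KINDS:
        body = "\n".join(p.text for p in pieces if p.kind == kind)
        sections.append(f"[{kind}]\n{body}")
    return "\n\n".join(sections)


def answer(pieces: list[TypedPiece], llm: Callable[[str], str]) -> str:
    """All typed inputs converge on exactly one model call."""
    return llm(assemble_context(pieces))


if __name__ == "__main__":
    # Stand-in for a real model client: it only reports the prompt size.
    def fake_llm(prompt: str) -> str:
        return f"received {len(prompt)} characters of context"

    demo = [TypedPiece(kind, f"example text for {kind}") for kind in EXPECTED_KINDS]
    print(answer(demo, fake_llm))
```

Tagging each piece with its type makes it possible to check, before the call, that every expected input is present, and to inspect exactly what the model was given when an answer goes wrong.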
Want the full story? Read the original article.
Read on Towards Data Science ↗