40 curated questions on LLMs, RAG, agents, fine-tuning, embeddings, and system design — with detailed answers.
What is the difference between RAG and fine-tuning? When would you choose one over the other?
Explain the difference between dense retrieval and sparse retrieval. What are BM25 and bi-encoder models?
What is chunking in RAG and what are the common strategies?
What is a reranker and why is it useful in RAG pipelines?
What is attention and how does it work in a transformer?
What is the KV cache and why is it important for LLM inference?
What is the difference between temperature, top-p, and top-k sampling?
What is positional encoding and why is it needed in transformers?
What is LoRA and how does it reduce the cost of fine-tuning?
What is catastrophic forgetting and how do you mitigate it during fine-tuning?
What is RLHF and what are its components?
What are embeddings and why are they central to semantic search?
What is the difference between cosine similarity and dot product for comparing embeddings?
What is an AI agent and how is it different from a standard LLM call?
What is the ReAct pattern in LLM agents?