Ground models in external knowledge

Retrieval-Augmented Generation

RAG combines a generator with a retriever so an AI system can look up relevant information instead of relying only on weights learned during training.

RAG system loop

User question

Embed query

Retrieve documents

Rerank evidence

Answer with citations
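
To make the loop concrete, here is a minimal sketch of the five stages using only the Python standard library. Everything in it is an assumption for illustration: the three-document corpus and its ids are invented, the "embedding" is a plain bag-of-words count rather than a learned model, and the reranker is a simple term-coverage score. The final step returns the cited evidence itself; in a real system that evidence would be passed to a generator.

```python
import math
import re
from collections import Counter

# Hypothetical mini-corpus; the ids and text are invented for illustration.
DOCS = {
    "reset-guide": "To reset your password open Settings and choose Reset password.",
    "billing-faq": "Invoices are emailed on the first day of each month.",
    "install-notes": "Install the desktop client, then sign in with your password.",
}


def embed(text: str) -> Counter:
    """Toy 'embedding': bag-of-words counts, standing in for a real model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, k: int = 2) -> list[str]:
    """Return up to k document ids with non-zero similarity to the query."""
    q = embed(query)
    scored = sorted(((cosine(q, embed(t)), d) for d, t in DOCS.items()), reverse=True)
    return [doc_id for score, doc_id in scored[:k] if score > 0]


def rerank(query: str, doc_ids: list[str]) -> list[str]:
    """Toy reranker: order candidates by the share of query terms they contain."""
    terms = set(embed(query))

    def coverage(doc_id: str) -> float:
        return len(terms & set(embed(DOCS[doc_id]))) / len(terms)

    return sorted(doc_ids, key=coverage, reverse=True)


def answer(query: str) -> str:
    """Return cited evidence, or a refusal when nothing relevant was retrieved."""
    evidence = rerank(query, retrieve(query))
    if not evidence:
        return "No supporting documents found."
    # A real system would send this evidence to a generator; here we just cite it.
    return " ".join(f"{DOCS[d]} [{d}]" for d in evidence)


print(answer("How do I reset my password?"))
```

Each function maps to one box in the loop, so any stage can be swapped for a stronger component without touching the others.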

Professional outcome

What you should be able to do

Design a production retrieval pipeline, choose chunking and ranking strategies, cite sources, and debug hallucinations caused by poor retrieval.

Capstone: Build a cited technical-support assistant over a document library.

Essentials

Concepts to master

Embeddings and vector search
Chunking and metadata
Hybrid retrieval and reranking
Grounded generation and citation
Evaluation for answer faithfulness
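
As one illustration of hybrid retrieval, the sketch below merges a keyword ranking and a vector ranking with reciprocal rank fusion (RRF), a common way to combine ranked lists without comparing their raw scores. The document ids are hypothetical, and the constant k=60 is a conventional default, not a tuned value.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists: each document scores sum(1 / (k + rank))."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results from a keyword search and a vector search.
keyword_hits = ["doc-a", "doc-b", "doc-c"]
vector_hits = ["doc-b", "doc-d", "doc-a"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)  # ['doc-b', 'doc-a', 'doc-d', 'doc-c']
```

Documents that appear in both lists rise to the top, which is the behavior a reranker can then refine on a smaller candidate set.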

Create a small docs folder in VS Code and write an ingestion script that loads each file with its URL and title.
Split the documents into chunks, attach metadata, and store embeddings in a local or managed vector index.
Build a query route that retrieves evidence, assembles a citation prompt, and refuses when evidence is missing.
Write retrieval and answer-evaluation cases before changing chunk size, embedding model, reranker, or prompt rules.
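
The chunking and citation-prompt steps above can be sketched as follows. This is a minimal example under stated assumptions: the Chunk fields, the word-window sizes, the refusal wording, and the prompt template are all hypothetical choices, and embedding storage and the model call are left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Chunk:
    text: str
    url: str       # source URL captured at ingestion time
    title: str     # source title captured at ingestion time
    position: int  # index of the chunk within its document


def chunk_document(text: str, url: str, title: str,
                   size: int = 40, overlap: int = 10) -> list[Chunk]:
    """Split text into overlapping word windows that carry source metadata."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for position, start in enumerate(range(0, len(words), step)):
        window = " ".join(words[start:start + size])
        chunks.append(Chunk(window, url, title, position))
        if start + size >= len(words):
            break  # the last window already reaches the end of the document
    return chunks


# Hypothetical refusal message used when no evidence supports an answer.
REFUSAL = "I could not find this in the documentation."


def build_prompt(question: str, evidence: list[Chunk]) -> Optional[str]:
    """Assemble a citation prompt, or return None so the route can refuse."""
    if not evidence:
        return None
    sources = "\n".join(
        f"[{i}] {c.title} ({c.url}): {c.text}"
        for i, c in enumerate(evidence, start=1)
    )
    return (
        "Answer using only the numbered sources and cite them like [1].\n"
        f"If the sources do not contain the answer, reply: {REFUSAL}\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}"
    )
```

A query route would call build_prompt with the retrieved chunks and return REFUSAL directly whenever the result is None, so the model is never asked to answer without evidence.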

Primary sources

Start from authoritative material.

Lewis et al., Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
LlamaIndex introduction to RAG
LangChain retrieval documentation
