Retrieval-Augmented Generation (RAG)

Lecture 20

Dr. Benjamin Soltoff

Cornell University
INFO 4940/5940 - Fall 2025

November 6, 2025

Announcements

Project 01 due today
Confidential peer feedback surveys open tomorrow at noon on Canvas

Learning objectives

Define Retrieval-Augmented Generation (RAG)
Explain how RAG improves LLM outputs
Implement a simple RAG system in R or Python

Application exercise

`ae-19`

Instructions

Go to the course GitHub org and find your `ae-19` repo (the repo name is suffixed with your GitHub username).
Clone the repo in Positron and install the required packages using `renv::restore()` (R) or `uv sync` (Python). Then open the Quarto document in the repo, follow along, and complete the exercises.
Render, commit, and push your edits by the AE deadline (end of the day).

⌨️ `15_coding-assistant`

Instructions

Use Claude 3.7 Sonnet to write a function that gets the weather. The first time, use Claude on its own.
Then do some basic research on Claude's behalf about how to use a specific package to get the weather, and include what you find in your prompt.
How does Claude do with the same task now?

06:00

Augmented Generation

Retrieval-Augmented Generation (RAG)

How do we find relevant documents?

Answer: word vector embeddings → turn words into vectors

🤴 - 🧔‍♂️ + 💁‍♀️ = ❓

🤴 - 🧔‍♂️ = 👑
👑 + 💁‍♀️ = 👸
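
The emoji arithmetic above can be reproduced with ordinary vectors. The following is a minimal sketch, assuming hand-made 2-dimensional vectors in which one dimension loosely means "royalty" and the other loosely means "gender". The numbers and the names `toy_vectors` and `nearest` are invented for illustration; real embeddings are learned from text and have far more dimensions.

```python
import math

# Toy 2-dimensional "embeddings", invented for illustration only.
# Dimension 0 loosely encodes royalty, dimension 1 loosely encodes gender.
toy_vectors = {
    "king": [1.0, 1.0],
    "queen": [1.0, -1.0],
    "man": [0.0, 1.0],
    "woman": [0.0, -1.0],
    "crown": [1.0, 0.0],
}


def add(a, b):
    """Element-wise sum of two vectors."""
    return [x + y for x, y in zip(a, b)]


def sub(a, b):
    """Element-wise difference of two vectors."""
    return [x - y for x, y in zip(a, b)]


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(vector, exclude=()):
    """Return the word whose toy vector points most nearly the same way."""
    candidates = {w: v for w, v in toy_vectors.items() if w not in exclude}
    return max(candidates, key=lambda w: cosine_similarity(vector, candidates[w]))


# king - man leaves only the "royalty" direction
royalty = sub(toy_vectors["king"], toy_vectors["man"])
# adding woman back gives royalty plus the opposite gender direction
result = add(royalty, toy_vectors["woman"])

print(nearest(royalty, exclude=("king", "man")))  # crown
print(nearest(result, exclude=("king", "man", "woman")))  # queen
```

The input words are excluded from the search because, with real embeddings too, the closest vector to the result is often one of the words that went into the arithmetic.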

OpenAI: text-embedding-3-small

text_embedding_3_small("dplyr::left_join")
#> [-0.0384574,  0.00796838,  0.04896307, ..., -0.01687562, 0.00051399,  0.01020856]

text_embedding_3_small("LEFT JOIN")
#> [-0.0114895,  0.01873610,  0.04436858, ...,  0.0055124, 0.01100459, -0.00588281]

text_embedding_3_small("suitcase")
#> [ 0.01323017, -0.00844115, -0.02530578, ..., -0.00054488, -0.0285338, -0.02933492]
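
Once text has been turned into vectors, "relevant" usually means "points in a similar direction", which is commonly measured with cosine similarity. The sketch below assumes made-up 3-dimensional stand-ins for the three embeddings above (the real vectors are much longer and are truncated on the slide, so they are not reused here). `toy_embeddings` and `rank_by_similarity` are hypothetical names.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up 3-dimensional stand-ins, NOT output from text-embedding-3-small.
toy_embeddings = {
    "dplyr::left_join": [0.9, 0.4, 0.1],
    "LEFT JOIN": [0.8, 0.5, 0.2],
    "suitcase": [-0.1, 0.2, 0.9],
}


def rank_by_similarity(query, embeddings):
    """Rank every other text by cosine similarity to the query text."""
    query_vec = embeddings[query]
    scores = {
        text: cosine_similarity(query_vec, vec)
        for text, vec in embeddings.items()
        if text != query
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranking = rank_by_similarity("dplyr::left_join", toy_embeddings)
for text, score in ranking:
    print(f"{text}: {score:.2f}")
```

With these toy numbers, "LEFT JOIN" ranks well above "suitcase", which is the behavior the slide is pointing at: related concepts land near each other even when the strings differ.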

Two ways that users encounter RAG

Every prompt you send gets passed through a RAG system and is augmented
The LLM can decide when to call the RAG system
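
The difference between the two modes can be sketched without any LLM library. This is an illustrative outline only: `retrieve` is a hypothetical word-overlap retriever standing in for a real vector search, the passages are invented, and how a tool gets registered with a chat client depends on the library and is not shown.

```python
def retrieve(query, knowledge_base, top_k=1):
    """Hypothetical retriever: rank passages by words shared with the query."""
    query_words = set(query.lower().split())
    ranked = sorted(
        knowledge_base,
        key=lambda passage: len(query_words & set(passage.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]


def always_augment(prompt, knowledge_base):
    """Mode 1: every prompt is augmented before it reaches the model."""
    context = "\n".join(retrieve(prompt, knowledge_base))
    return f"Context:\n{context}\n\nQuestion: {prompt}"


def make_retrieval_tool(knowledge_base):
    """Mode 2: wrap retrieval as a function the model may choose to call."""

    def search_docs(query: str) -> str:
        """Search the knowledge base and return the best-matching passage."""
        return "\n".join(retrieve(query, knowledge_base))

    return search_docs


# Invented passages, for illustration only.
knowledge_base = [
    "left_join keeps every row of the first table",
    "a suitcase holds clothes for travel",
]

augmented = always_augment("how does left_join work", knowledge_base)
search_docs = make_retrieval_tool(knowledge_base)
print(augmented)
print(search_docs("what goes in a suitcase"))
```

In the first mode retrieval runs on every prompt, whether or not it helps. In the second, the model sees only the tool's name and description and decides when a lookup is worth making.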

In R…

In Python…

⌨️ `16_rag`

Instructions

Follow the steps in the 16_rag exercise, which are roughly:

First, you’ll create a vector database from a set of documents:
- In R: R for Data Science (R4DS)
- In Python: The Polars Cookbook
Test out the vector database with a simple query.
Attach a retrieval tool to a chat client and try it in a Shiny app.
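
The three steps above can be sketched end to end in plain Python. This is not how ragnar or LlamaIndex work internally: `embed` is a word-count stand-in for a real embedding model, `ToyVectorStore` is a hypothetical in-memory list rather than a real vector database, and the passages are invented placeholders, not text from R4DS or The Polars Cookbook.

```python
import math
import re
from collections import Counter


def embed(text):
    """Stand-in embedding: lowercase word counts (NOT a real embedding model)."""
    return Counter(re.findall(r"[a-z_]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToyVectorStore:
    """Hypothetical in-memory store of (chunk, vector) pairs."""

    def __init__(self):
        self.records = []

    def add(self, chunk):
        self.records.append((chunk, embed(chunk)))

    def query(self, text, top_k=1):
        query_vec = embed(text)
        ranked = sorted(
            self.records,
            key=lambda record: cosine_similarity(query_vec, record[1]),
            reverse=True,
        )
        return [chunk for chunk, _ in ranked[:top_k]]


# Step 1: build the store from documents (invented placeholder chunks).
store = ToyVectorStore()
for chunk in [
    "filter keeps rows that match a condition",
    "select picks columns by name",
    "a join combines two tables using a key",
]:
    store.add(chunk)

# Step 2: test the store with a simple query.
hits = store.query("how do I filter rows")
print(hits)


# Step 3: wrap the store as a tool function a chat client could call.
def retrieve_docs(query: str) -> str:
    """Return the most relevant chunk for the query."""
    return "\n".join(store.query(query))


print(retrieve_docs("join two tables"))
```

A real pipeline would also split long documents into chunks before embedding them and would persist the vectors to disk; both are omitted here to keep the sketch short.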

15:00

Wrap-up

Recap

RAG improves LLM outputs by adding relevant context
RAG systems use vector embeddings to find relevant documents
You can implement RAG in R with ragnar or in Python with LlamaIndex

Acknowledgments

Materials derived in part from Programming with LLMs and licensed under a Creative Commons Attribution 4.0 International (CC BY) License.
