Retrieval-Augmented Generation (RAG)

Lecture 20

Dr. Benjamin Soltoff

Cornell University
INFO 4940/5940 - Fall 2025

November 6, 2025

Announcements

TODO

Learning objectives

TODO

⌨️ 15_coding-assistant

Instructions

  1. Use Claude 3.5 Sonnet to write a function that gets the weather. The first time, use Claude on its own.

  2. Do some basic research for Claude about how to use a specific package to get the weather.

  3. How does Claude do with the same task now?

06:00

Augmented Generation

Retrieval-Augmented Generation (RAG)

How do we find relevant documents?

Answer: word vector embeddings → turn words into vectors

🤴 - 🧔‍♂️ + 💁‍♀️ = ❓

🤴 - 🧔‍♂️ = 👑
👑 + 💁‍♀️ = ❓

🤴 - 🧔‍♂️ = 👑
👑 + 💁‍♀️ = 👸
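The emoji arithmetic above can be sketched with ordinary vectors. This is a minimal illustration, assuming hand-made 3-dimensional vectors whose axes loosely mean (royalty, masculine, feminine). The vectors and the `analogy` helper are invented for this slide; real embedding models learn hundreds or thousands of dimensions from data.

```python
import math

# Hand-made toy "embeddings": (royalty, masculine, feminine).
# These numbers are invented for illustration, not taken from any model.
toy_vectors = {
    "prince":   [1.0, 1.0, 0.0],
    "man":      [0.0, 1.0, 0.0],
    "woman":    [0.0, 0.0, 1.0],
    "princess": [1.0, 0.0, 1.0],
    "crown":    [1.0, 0.0, 0.0],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def analogy(start, minus, plus, vectors):
    """Return the word closest to vectors[start] - vectors[minus] + vectors[plus]."""
    target = [
        s - m + p
        for s, m, p in zip(vectors[start], vectors[minus], vectors[plus])
    ]
    # Exclude the three input words so the answer is a new word.
    candidates = {
        word: vec
        for word, vec in vectors.items()
        if word not in (start, minus, plus)
    }
    return max(candidates, key=lambda w: cosine_similarity(target, candidates[w]))


result = analogy("prince", "man", "woman", toy_vectors)
print(result)  # prints "princess"
```

Subtracting "man" from "prince" leaves only the royalty axis (the 👑 step), and adding "woman" lands exactly on the toy vector for "princess".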

OpenAI: text-embedding-3-small

text_embedding_3_small("dplyr::left_join")
#> [-0.0384574,  0.00796838,  0.04896307, ..., -0.01687562, 0.00051399,  0.01020856]
text_embedding_3_small("LEFT JOIN")
#> [-0.0114895,  0.01873610,  0.04436858, ...,  0.0055124, 0.01100459, -0.00588281]
text_embedding_3_small("suitcase")
#> [ 0.01323017, -0.00844115, -0.02530578, ..., -0.00054488, -0.0285338, -0.02933492]
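Once text is turned into vectors, "relevant" can be read as "pointing in a similar direction". The sketch below ranks texts by cosine similarity. It assumes made-up 4-dimensional vectors standing in for the three strings above; they are not real `text-embedding-3-small` output, and `rank_by_similarity` is a hypothetical helper.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Invented stand-in vectors (NOT real model output): the two join-related
# strings are placed close together, the unrelated one far away.
toy_embeddings = {
    "dplyr::left_join": [0.9, 0.8, 0.1, 0.0],
    "LEFT JOIN":        [0.8, 0.9, 0.0, 0.1],
    "suitcase":         [0.0, 0.1, 0.9, 0.8],
}


def rank_by_similarity(query, embeddings):
    """Rank every other text by similarity to the query text, best first."""
    query_vec = embeddings[query]
    scored = [
        (text, cosine_similarity(query_vec, vec))
        for text, vec in embeddings.items()
        if text != query
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranking = rank_by_similarity("dplyr::left_join", toy_embeddings)
for text, score in ranking:
    print(f"{text}: {score:.3f}")
```

With these toy numbers, "LEFT JOIN" ranks above "suitcase", which is the behavior we hope a real embedding model shows for related and unrelated text.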

Two ways that users encounter RAG

  1. Every prompt you send passes through a RAG system, which augments it with retrieved documents before the LLM sees it

  2. The RAG system is exposed as a tool, and the LLM decides when to call it
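The two patterns differ only in who triggers retrieval. The sketch below is a rough illustration under heavy assumptions: `KNOWLEDGE_BASE`, `retrieve`, `always_augment`, `fake_model_turn`, and `on_demand` are all hypothetical names, the retriever is a keyword match rather than a vector search, and a question-mark check stands in for the decision a real LLM would make.

```python
# Hypothetical mini knowledge base; a real system would use a vector database.
KNOWLEDGE_BASE = {
    "rag": "RAG adds retrieved documents to the prompt before generation.",
    "embedding": "Embeddings turn text into vectors.",
}


def retrieve(query):
    """Stand-in retriever: return snippets whose key appears in the query."""
    lowered = query.lower()
    return [text for key, text in KNOWLEDGE_BASE.items() if key in lowered]


# Pattern 1: every prompt is augmented before it reaches the model.
def always_augment(prompt):
    snippets = retrieve(prompt)
    if not snippets:
        return prompt
    context = "\n".join(f"- {snippet}" for snippet in snippets)
    return f"Use this context to answer:\n{context}\n\nQuestion: {prompt}"


# Pattern 2: retrieval is a tool the model may or may not request.
TOOLS = {"retrieve": retrieve}


def fake_model_turn(prompt):
    """Stand-in for an LLM turn. A real model decides for itself whether to
    call a tool; here a question mark fakes that decision."""
    if "?" in prompt:
        return {"tool": "retrieve", "arguments": {"query": prompt}}
    return {"tool": None, "arguments": {}}


def on_demand(prompt):
    """Run the tool only if the (fake) model asked for it."""
    request = fake_model_turn(prompt)
    if request["tool"] is None:
        return []
    return TOOLS[request["tool"]](**request["arguments"])


print(always_augment("What is RAG?"))
print(on_demand("Thanks!"))
print(on_demand("What is RAG?"))
```

In the first pattern the retrieval cost is paid on every prompt; in the second it is paid only when the model asks for it.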

In R…

In Python…

⌨️ 16_rag

Instructions

Follow the steps in the 16_rag exercise, which are roughly:

  1. Create a vector database from a set of documents.

  2. Test out the vector database with a simple query.

  3. Attach a retrieval tool to a chat client and try it in a Shiny app.
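The three steps above can be sketched end to end in plain Python. This is not the code from the 16_rag exercise: `ToyVectorStore`, `embed`, and `retrieve_tool` are hypothetical names, the "embedding" is a simple word count rather than a learned model, and registering the tool with a chat client or Shiny app is not shown.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (a stand-in for a real model)."""
    return Counter(re.findall(r"[a-z_]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToyVectorStore:
    """Step 1: store each document next to its vector."""

    def __init__(self, documents):
        self.entries = [(doc, embed(doc)) for doc in documents]

    def query(self, text, top_k=1):
        """Step 2: return the top_k documents most similar to the query."""
        query_vec = embed(text)
        ranked = sorted(
            self.entries,
            key=lambda entry: cosine_similarity(query_vec, entry[1]),
            reverse=True,
        )
        return [doc for doc, _ in ranked[:top_k]]


documents = [
    "left_join keeps every row of the first table",
    "a suitcase holds clothes for a trip",
    "embeddings turn words into vectors",
]
store = ToyVectorStore(documents)


def retrieve_tool(query):
    """Step 3: a plain function a chat client could register as a tool."""
    return store.query(query, top_k=1)


top = retrieve_tool("how do embeddings turn text into vectors")
print(top)
```

Word counts only match exact words, so a query for "LEFT JOIN" would miss the `left_join` document here. Closing that gap is the motivation for learned embeddings.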

15:00

Wrap-up

Recap

TODO