Retrieval-Augmented Generation (RAG)
How do we find relevant documents?
Answer: vector embeddings → turn words (or longer text) into vectors of numbers, so that text with similar meaning lands close together
 
🤴 - 🧔‍♂️ + 💁‍♀️ = ❓
🤴 - 🧔‍♂️ = 👑
👑 + 💁‍♀️ = 👸
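 
The emoji arithmetic above can be sketched with ordinary vectors. This is a toy illustration only: the 2-D vectors below are hand-made for the example (axis 0 loosely means "royalty", axis 1 loosely means "gender") and are not output from any real embedding model. The helper names `cosine` and `analogy` are hypothetical.

```python
import math

# Hand-made 2-D toy vectors (hypothetical, NOT real embeddings).
# Axis 0 ~ "royalty", axis 1 ~ "gender".
toy = {
    "king":  [1.0,  1.0],
    "queen": [1.0, -1.0],
    "man":   [0.0,  1.0],
    "woman": [0.0, -1.0],
    "crown": [1.0,  0.0],
}

def cosine(a, b):
    """Cosine similarity: 1.0 means same direction, -1.0 means opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def analogy(a, b, c, vectors):
    """Return the word whose vector is closest to vectors[a] - vectors[b] + vectors[c]."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    # Exclude the three input words, as is conventional for analogy lookups.
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))

# king - man leaves only the "royalty" component, i.e. the crown vector.
king_minus_man = [k - m for k, m in zip(toy["king"], toy["man"])]

result = analogy("king", "man", "woman", toy)
print(king_minus_man, result)
```

Real embeddings have hundreds or thousands of dimensions with no hand-assigned meaning, so analogies like this hold only approximately there.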
 
OpenAI: text-embedding-3-small
text_embedding_3_small("dplyr::left_join")
#> [-0.0384574,  0.00796838,  0.04896307, ..., -0.01687562, 0.00051399,  0.01020856]
 
text_embedding_3_small("LEFT JOIN")
#> [-0.0114895,  0.01873610,  0.04436858, ...,  0.0055124, 0.01100459, -0.00588281]
 
 
text_embedding_3_small("suitcase")
#> [ 0.01323017, -0.00844115, -0.02530578, ..., -0.00054488, -0.0285338, -0.02933492]
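 
One common way to use vectors like these for retrieval is to score every document by cosine similarity to the query and keep the top few. The sketch below assumes that approach. Its 3-D vectors are made-up stand-ins, chosen so that the two join phrases point in a similar direction and "suitcase" does not. They are not the truncated values shown above, and the query text and the `top_k` helper are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up 3-D stand-ins for document embeddings (hypothetical values).
# In practice each vector would come from an embedding model.
doc_vectors = {
    "dplyr::left_join": [0.9, 0.1, 0.2],
    "LEFT JOIN":        [0.8, 0.2, 0.1],
    "suitcase":         [-0.1, 0.9, -0.3],
}

def top_k(query_vec, vectors, k=2):
    """Return the k document names most similar to query_vec, best first."""
    scored = [(cosine_similarity(query_vec, vec), name) for name, vec in vectors.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]

# Made-up embedding for a query such as "how do I join two tables?"
query_vector = [0.85, 0.15, 0.15]

ranked = top_k(query_vector, doc_vectors, k=3)
print(ranked)
```

In a RAG pipeline the top-ranked documents would then be pasted into the prompt alongside the user's question.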