An introduction to LLMs

Lecture 16

Dr. Benjamin Soltoff

Cornell University
INFO 4940/5940 - Fall 2024

October 29, 2024

Announcements

  • Project proposals
  • Homework 03

Learning objectives

  • Define a language model
  • Describe how large language models are created
  • Review the most popular LLMs
  • Interact with an LLM via API
  • Utilize prompt engineering to improve the quality of responses

Language models

Language model

A language model estimates the probability of a token or sequence of tokens occurring within a longer sequence of tokens.

When I hear rain on my roof, I _______ in my kitchen.
Probability   Token(s)
9.4%          cook soup
5.2%          warm up a kettle
3.6%          cower
2.5%          nap
2.2%          relax
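The table above can be read as a lookup from context to a probability distribution over candidate continuations. A toy sketch (probabilities copied from the slide; a real model computes a distribution over its entire vocabulary):

```python
# Illustrative sketch of what a language model computes: a probability
# distribution over candidate next tokens, given the context.

context = "When I hear rain on my roof, I _______ in my kitchen."

next_token_probs = {
    "cook soup": 0.094,
    "warm up a kettle": 0.052,
    "cower": 0.036,
    "nap": 0.025,
    "relax": 0.022,
}

# Rank candidates by probability, most likely first
ranked = sorted(next_token_probs.items(), key=lambda kv: kv[1], reverse=True)
print(ranked[0])  # ('cook soup', 0.094)
```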

Context matters

Context is helpful information before or after the target token.

  • \(n\)-grams
  • Recurrent neural networks (RNNs)
  • Transformer
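The simplest of the three approaches above, an \(n\)-gram model, estimates the probability of the next word from counts of adjacent word pairs. A minimal bigram sketch (toy corpus; real models train on vastly more text):

```python
from collections import Counter, defaultdict

# Bigram (2-gram) model: estimate P(next word | current word)
# by counting adjacent word pairs in a tiny illustrative corpus.
corpus = "the cat sat on the mat the cat ran".split()

counts = defaultdict(Counter)
for cur, nxt in zip(corpus, corpus[1:]):
    counts[cur][nxt] += 1

def next_word_probs(word):
    """Conditional distribution over words that follow `word`."""
    total = sum(counts[word].values())
    return {nxt: c / total for nxt, c in counts[word].items()}

print(next_word_probs("the"))  # {'cat': 0.666..., 'mat': 0.333...}
```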

Large language models (LLMs)

What’s a transformer?

A transformer is a deep learning model that uses attention to weigh the influence of different parts of the input sequence on each other.

  • Encoder
  • Decoder

Self-attention

Self-attention is a mechanism that allows each position in the input sequence to attend to all positions in the input sequence.

How much does each other token of input affect the interpretation of this token?

Self-attention is learned through the training of the encoder and decoder. These models typically contain billions to trillions of parameters (weights).
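The "how much does each token affect this token" question can be sketched as scaled dot-product attention. This is a pure-Python illustration with tiny 2-dimensional token vectors; real transformers use learned query/key/value projections and high-dimensional embeddings across many attention heads:

```python
import math

def softmax(xs):
    # Numerically stable softmax: shift by the max before exponentiating
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def self_attention(queries, keys, values):
    d = len(keys[0])
    out = []
    for q in queries:
        # How much does each other token affect this token?
        scores = [dot(q, k) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Each output is a weighted mix of every token's value vector
        out.append([sum(w * v[i] for w, v in zip(weights, values))
                    for i in range(len(values[0]))])
    return out

# Three "tokens"; in *self*-attention the queries, keys, and values
# all come from the same input sequence
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = self_attention(x, x, x)
print(attended)
```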

Generating output

LLMs are functionally similar to auto-complete mechanisms.

Given the tokens so far, what is the most likely next token?

My dog, Max, knows how to perform many traditional dog tricks.
___ (masked sentence)

Probability   Token(s)
3.1%          For example, he can sit, stay, and roll over.
2.9%          For example, he knows how to sit, stay, and roll over.
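The auto-complete analogy can be sketched as a loop that repeatedly asks for the most likely next token and appends it. Here a hard-coded lookup table stands in for the model; a real LLM scores every token in its vocabulary at each step:

```python
# Stand-in "model": maps the last two tokens to a predicted next token.
# Entirely illustrative; a real model conditions on the full context window.
model = {
    ("my", "dog"): "knows",
    ("dog", "knows"): "tricks",
    ("knows", "tricks"): "<end>",
}

def generate(prompt, max_tokens=10):
    tokens = prompt.split()
    for _ in range(max_tokens):
        # Predict the next token from the recent context
        nxt = model.get(tuple(tokens[-2:]), "<end>")
        if nxt == "<end>":
            break
        tokens.append(nxt)
    return " ".join(tokens)

print(generate("my dog"))  # my dog knows tricks
```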

Generating output

  • Sufficiently large LLMs can generate probabilities for paragraphs/essays/code blocks.
  • Responses are probabilistic: the model generates a distribution over possible outputs and samples from it.
  • That sampling introduces randomness, so LLMs do not always generate the same output for the same input.
  • Size of context window matters
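The probabilistic draw described above is commonly controlled with a temperature parameter: lower temperature makes output more deterministic, higher temperature more varied. A sketch (token probabilities are illustrative):

```python
import math
import random

def sample(probs, temperature=1.0, rng=random):
    """Draw one token from `probs` after rescaling by temperature."""
    # Divide log-probabilities by temperature, then renormalize
    logits = {tok: math.log(p) / temperature for tok, p in probs.items()}
    mx = max(logits.values())
    exps = {tok: math.exp(l - mx) for tok, l in logits.items()}
    total = sum(exps.values())
    toks = list(exps)
    weights = [exps[t] / total for t in toks]
    return rng.choices(toks, weights=weights, k=1)[0]

probs = {"sit": 0.5, "stay": 0.3, "roll over": 0.2}
# Near-zero temperature: almost always the most likely token;
# temperature > 1 flattens the distribution and increases variety
print(sample(probs, temperature=0.1))
```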

Types of outputs LLMs can generate

  • Human language (i.e. text)
  • Code
  • Images
  • Audio
  • Video

Challenges with LLMs

Training

  • Requires an enormous volume of data
  • Extremely time-intensive
  • Consumes enormous computational resources and electricity

Inference/prediction

  • Hallucinations
  • Biases
  • Unethical usage

Foundational LLMs

LLMs trained on enough inputs to generate a wide range of outputs across many domains.

Also known as base LLMs or pre-trained LLMs.

Examples of foundational LLMs

LLM       | Developer | Inputs                    | Outputs                | Access
GPT       | OpenAI    | Text, image, data         | Text                   | Proprietary
DALL·E    | OpenAI    | Text                      | Image                  | Proprietary
Gemini    | Google    | Text, image, audio, video | Text, image            | Proprietary
Gemma     | Google    | Text                      | Text                   | Open
Llama     | Meta      | Text                      | Text                   | Open
Claude    | Anthropic | Text, audio, image, data  | Text, computer control | Proprietary
Ministral | Mistral   | Text, image               | Text                   | Proprietary/open
Phi       | Microsoft | Text                      | Text                   | Open
BERT      | Google    | Text                      | Text                   | Open

Accessing foundational LLMs

Application programming interface!
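Under the hood, an API call to an LLM is a JSON payload sent over HTTPS. The application exercise uses R packages for this; the Python sketch below shows the underlying structure, following OpenAI's chat completions format (the model name is illustrative, and other providers differ in details):

```python
import json
import os
import urllib.request

def build_request(prompt, model="gpt-4o-mini"):
    """Construct an HTTP request for a chat-style LLM API."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        "https://api.openai.com/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            # API key is read from an environment variable, never hard-coded
            "Authorization": f"Bearer {os.environ.get('OPENAI_API_KEY', '')}",
        },
    )

req = build_request("Define a language model in one sentence.")
print(req.full_url)
# Actually sending the request requires a valid API key:
# with urllib.request.urlopen(req) as resp:
#     reply = json.load(resp)["choices"][0]["message"]["content"]
```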

Application exercise

ae-15

  • Go to the course GitHub org and find your ae-15 (repo name will be suffixed with your GitHub name).
  • Clone the repo in RStudio, run renv::restore() to install the required packages, open the Quarto document in the repo, and follow along and complete the exercises.
  • Render, commit, and push your edits by the AE deadline (end of the day).

Improving on foundational LLMs

Methods to improve LLM performance

  • Prompt engineering
  • Fine-tuning

Building prompts

  • User messages
  • System messages
  • Assistant messages
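These three roles are combined into a single list of messages sent with each request; keeping earlier assistant replies in the list is how the model "remembers" the conversation. A sketch (content strings are illustrative):

```python
conversation = [
    # System message: sets the model's persona and ground rules
    {"role": "system", "content": "You are a helpful assistant."},
    # User message: what the person actually asks
    {"role": "user", "content": "What is a token?"},
    # Assistant message: a prior model reply, kept in the list so the
    # model has the conversation history on the next turn
    {"role": "assistant", "content": "A token is a chunk of text..."},
    {"role": "user", "content": "How are tokens created?"},
]

roles = [m["role"] for m in conversation]
print(roles)  # ['system', 'user', 'assistant', 'user']
```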

Personas in system prompts

You are a helpful assistant.

You are a curious student.

You are a 20 year old Gen Z assistant.

You are an AI assistant specialized in helping users with Shiny for R. Your tasks include explaining concepts in Shiny, explaining how to do things with Shiny, or creating a complete, functional Shiny for R app code as an artifact based on the user’s description. Only answer questions related to Shiny, or R or Python. Don’t answer any questions related to anything else.

If the user asks for explanations about concepts or code in Shiny for R, then you should provide detailed and accurate information about the topic. This may include descriptions, examples, use cases, and best practices related to Shiny for R. If your answer includes examples of Shiny apps, you should provide the code of each one within the designated tags, and otherwise adhere to the guidelines below for creating applications.

If the user asks for an application, you should provide a Shiny for R app code that meets the requirements specified in the user prompt. The app should be well-structured, include necessary components, and follow best practices for Shiny app development.

Prompt engineering

  • Write clear instructions
  • Provide reference text
  • Split complex tasks into simpler subtasks
  • Give the model time to “think”
  • Use external tools
  • Test changes systematically
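Two of the tactics above, writing clear instructions and providing reference text, can be combined in a single prompt template. A sketch (the delimiters and wording are illustrative conventions, not required syntax):

```python
# Reference text the model should ground its answer on
reference_text = "renv::restore() installs the packages recorded in renv.lock."

prompt = f"""You are answering questions about an R course.
Use ONLY the reference text between triple quotes to answer.
If the answer is not in the reference text, say "I don't know."

Reference text:
\"\"\"{reference_text}\"\"\"

Question: What does renv::restore() do?"""

print(prompt)
```

Constraining the model to the reference text, and giving it an explicit "I don't know" escape hatch, reduces the chance of hallucinated answers.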

⏱️ Your turn

Write a system prompt for an R tutor chatbot. The chatbot will be deployed for INFO 2950 or INFO 5001 to assist students in meeting the learning objectives for the courses. It should behave like a human TA, supporting students without providing direct answers to assignments or exams.

Test it on the provided user prompts.


Wrap-up

Recap

  • Language models and LLMs are extremely complicated models that can generate text, code, images, audio, and video.
  • Foundational LLMs are pre-trained models that can be used to generate a wide range of outputs.
  • They do not engage in reasoning like humans and are not capable of understanding things the way a human would.
  • Improving LLM performance can be done through prompt engineering and fine-tuning.

It’s Halloween! 🎃