Wrap-up: Where to go from here

Lecture 23

Dr. Benjamin Soltoff

Cornell University
INFO 4940/5940 - Fall 2025

December 4, 2025

End-of-semester logistics

Remaining assignments

  • Project 02 due December 18th at noon
  • Complete confidential peer feedback survey by December 20th at 11:59pm

Build a simple data science stack

Posit Workbench

  • Access to Posit Workbench will end at some point after December 26th
  • All INFO 4940/5940 materials remain available in your repos on GitHub as long as you are an active student
  • Any other work you have done on the server will not be accessible after the end of the semester
  • Where will you go from here?

Software installation

Programming language
Reproducibility

Maintaining reproducible computing environments

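# R, using renv
renv::snapshot(type = "all")  # record the versions of all installed R packages in renv.lock
renv::restore()               # reinstall the package versions recorded in renv.lock

# Python, using uv (run in a terminal)
uv add <package>              # add a package to the project's dependencies
uv sync                       # update the project environment to match the lockfile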

Do I need reproducible environments?

  • For small, solo projects, maybe not
  • For larger, collaborative projects, probably yes

Create a GitHub account

Configure Git

usethis::use_git_config(
  user.name = "Your name",
  user.email = "Email associated with your GitHub account"
)
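
The same two settings can also be made and checked from a terminal, which may help on a machine where R is not installed yet. The following is a minimal sketch, not part of the course materials: it writes to a throwaway repository so that no existing configuration is changed, and the name, email address, and the `demo_dir` variable are placeholders chosen for illustration.

```shell
# Create a throwaway repository so nothing outside it is modified
demo_dir="$(mktemp -d)"
git init --quiet "$demo_dir"

# Set the identity Git attaches to commits made in this repository
# (placeholder values; use the email tied to your GitHub account)
git -C "$demo_dir" config user.name "Your name"
git -C "$demo_dir" config user.email "you@example.com"

# Read the values back to confirm they were stored
git -C "$demo_dir" config --get user.name
git -C "$demo_dir" config --get user.email
```

Replacing the per-repository calls with `git config --global user.name "Your name"` (and likewise for `user.email`) applies the settings to every repository on the machine, which is also the default scope of `usethis::use_git_config()`.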

Painless authentication with PAT

Personal Access Token

Set up PAT authentication

Create PAT

usethis::create_github_token(
  scopes = c("repo", "user", "gist", "workflow"),
  description = "<DESCRIBE YOUR DEVICE>"
)

Store PAT

gitcreds::gitcreds_set()

#> ? Enter password or token: ghp_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
#> -> Adding new credentials...
#> -> Removing credentials from cache...
#> -> Done.

Create PAT (Cornell GitHub at github.coecis.cornell.edu)

usethis::create_github_token(
  scopes = c("repo", "user", "gist", "workflow"),
  description = "<DESCRIBE YOUR DEVICE>",
  host = "https://github.coecis.cornell.edu/"
)

Store PAT (Cornell GitHub at github.coecis.cornell.edu)

gitcreds::gitcreds_set(url = "https://github.coecis.cornell.edu/")

#> ? Enter password or token: ghp_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
#> -> Adding new credentials...
#> -> Removing credentials from cache...
#> -> Done.

What have you learned?

Learning objectives for INFO 4940/5940

  • Train and evaluate predictive machine learning models
  • Collect and wrangle data for machine learning
  • Deploy machine learning models in a production environment
  • Utilize large language models (LLMs) to solve contemporary problems
  • Implement reproducible machine learning workflows using version control and literate programming

Where to go from here

Courses

Foundational (much more theory and math)
  • CS 3780: Machine Learning for Intelligent Systems
  • ECE 3200: Fundamentals of Machine Learning
  • ORIE 3741: Learning with Big Messy Data
  • STSCI 3740: Data Mining and Machine Learning
  • INFO 3950: Data Analytics for Information Science
Domain applications
  • INFO 1340: Books as Data
  • INFO 3350/6350: Text Mining History and Literature
  • INFO 4100/5101: Learning Analytics
  • INFO 4300: Language and Information
  • INFO 4940: Advanced NLP for Humanities Research
  • INFO 4940/6940: How LLMs work, their potential and limitations
  • INFO 5940: AI Chatbots, RAG, AI Agents
  • INFO 5940: Solving challenges at the enterprise with AI

Find a community

Online communities

Keep your skills fresh

Project 02 peer reviews

Instructions

  • Go to the Project 02 peer review instructions and find your assigned teams.
  • Open their repos on GitHub and find the information on their projects and how to access the draft product:
    • docs/report.pdf
    • README.md
  • Open a GitHub issue in each repo using the peer review template, test the product, and complete the issue.
  • Your team will be evaluated on the quality of the feedback you provide to your peers.

Good luck on your projects!