Lecture 1
Cornell University
INFO 4940/5940 - Fall 2024
August 27, 2024
Dr. Benjamin Soltoff
Lecturer in Information Science
Gates Hall 216
Illustration credit: https://vas3k.com/blog/machine_learning/
How are statistics and machine learning related?
How are they similar? Different?
Illustration credit: workshops.tidymodels.org
Illustration credit: Posit
Aspect | R | Python |
---|---|---|
Syntax | Functional language | Object-oriented language |
Statistical learning | Developed by statisticians for statistical analysis | Meh |
Machine learning | | |
Deep learning | | |
Visualization | {ggplot2} | {matplotlib} + others |
Package management | CRAN | pip/virtualenv/PyPI/Anaconda |
Speed | Somewhat slower | Somewhat faster |
Community | Academia and industry | Larger (general-purpose programming language) |
Generated by DALL·E
https://info4940.infosci.cornell.edu/
All linked from the course website:
GitHub organization: github.coecis.cornell.edu/info4940-fa24
RStudio
Use the Workbench: rstudio-workbench.infosci.cornell.edu
Communication: GitHub Discussions
Assignment submission and feedback: Gradescope
Important
Make sure you can access RStudio before class on Thursday.
Prepare: Introduce new content and prepare for class by completing the readings
Participate: Attend and actively participate in class, office hours, and team meetings
Practice: Practice applying ML techniques and computing with application exercises during class, graded for completion
Perform: Put together what you’ve learned to analyze real-world data
Category | Percentage |
---|---|
Homework | 50% |
Project | 40% |
Application Exercises | 10% |
See course syllabus for how the final letter grade will be determined.
I want this course to be accessible to students of all abilities. Please feel free to let me know if there are circumstances affecting your ability to participate in class.
Only work that is clearly assigned as team work should be completed collaboratively.
Homework must be completed individually. You may not directly share answers or code with others; however, you are welcome to discuss the problems in general and ask for advice.
We are aware that a huge volume of code is available on the web, and many tasks may have solutions posted
Any recycled code that is discovered and is not explicitly cited will be treated as plagiarism, regardless of source
All code must be written by you, the human being
Use generative AI to facilitate, rather than hinder, learning
✅ GAI tools for reference purposes
🤔 GAI tools for writing my code/analysis
You may use GAI tools to assist in writing code in this class
You may not make use of the technology as a substitute for critical thinking
I reserve the right to orally assess any student on their submissions to verify they meet the learning objectives for the assignment
❌ GAI tools for narrative
You are ultimately responsible for the work you turn in; it should reflect your understanding of the course content
Source: Code of Academic Integrity
Ask if you’re not sure if something violates a policy!
Discuss with your peers, then submit your individual responses.