Extra credit

Modified

December 4, 2024

Important

Extra credit submissions are due no later than Monday, ~~December 9th~~ December 16th at 11:59pm ET.

Getting started

Go to the info4940-fa24 organization on GitHub. Click on the repo with the prefix ec. It contains the starter documents you need to complete the assignment.
Clone the repo and start a new project in RStudio.

Tidy Tuesday

Tidy Tuesday is a weekly data project to promote wrangling and visualization skills. It is hosted by the Data Science Learning Community which aims to “create a supportive and responsive online space for learners” to improve their programming and data analysis skills.

Every week they post a raw dataset on GitHub and ask people to explore the data. The ultimate goal is to apply R skills, get feedback, explore others’ work, and connect with the greater #RStats community. Contributors frequently publish their work on social media under the #TidyTuesday hashtag. Datasets are posted on Mondays.

You are expected to solve a predictive problem using a Tidy Tuesday dataset published during 2024. Your submission should include information on all aspects of the ML workflow, including exploratory analysis, data preprocessing, model selection, and evaluation. You should also include a brief written description of your findings and the rationale behind your choices.
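As a rough illustration of the workflow stages listed above, here is a minimal sketch in base R. It is not a required template. Everything specific in it is an assumption made for illustration: the built-in `mtcars` data stands in for a Tidy Tuesday dataset, `mpg` is the outcome, the two candidate formulas are arbitrary, and RMSE is one reasonable metric for a regression problem.

```r
# Stand-in data: mtcars ships with R. Swap in your Tidy Tuesday dataset.
set.seed(4940)
dat <- mtcars

# 1. Exploratory analysis: summarize the outcome and check for missing values.
summary(dat$mpg)
colSums(is.na(dat))

# 2. Hold out a test set before making any modeling decisions.
n <- nrow(dat)
test_idx <- sample(n, size = round(0.25 * n))
train_set <- dat[-test_idx, ]
test_set <- dat[test_idx, ]

# 3. Preprocessing: center and scale predictors using training statistics only.
predictors <- c("wt", "hp", "disp")
ctr <- colMeans(train_set[, predictors])
scl <- apply(train_set[, predictors], 2, sd)
prep <- function(df) {
  out <- as.data.frame(scale(df[, predictors], center = ctr, scale = scl))
  out$mpg <- df$mpg
  out
}
train_p <- prep(train_set)
test_p <- prep(test_set)

# 4. Model selection: compare candidate models with k-fold cross-validation
#    on the training set. (Strictly, preprocessing should be re-estimated
#    inside each fold; for a linear model, centering and scaling do not
#    change the predictions, so it is skipped here.)
rmse <- function(truth, estimate) sqrt(mean((truth - estimate)^2))
candidates <- list(
  simple = mpg ~ wt,
  full = mpg ~ wt + hp + disp
)
k <- 4
folds <- sample(rep(seq_len(k), length.out = nrow(train_p)))
cv_rmse <- sapply(candidates, function(f) {
  mean(sapply(seq_len(k), function(i) {
    fit <- lm(f, data = train_p[folds != i, ])
    held_out <- train_p[folds == i, ]
    rmse(held_out$mpg, predict(fit, newdata = held_out))
  }))
})
best <- names(which.min(cv_rmse))

# 5. Evaluation: refit the chosen model on all training data and score it
#    once on the untouched test set.
final_fit <- lm(candidates[[best]], data = train_p)
test_rmse <- rmse(test_p$mpg, predict(final_fit, newdata = test_p))
cv_rmse
test_rmse
```

The test set is touched only once, in the final step, so the reported error is not influenced by the model comparison. Your written description can then explain why you chose each preprocessing step, candidate model, and metric.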

Submission

Once you are finished with the assignment, upload your final PDF document to Gradescope. You may only submit one extra credit assignment for the semester. Once it has been evaluated, you may not submit another attempt.

To submit your assignment:

Go to http://www.gradescope.com and click Log in in the top right corner.
Click School Credentials → Cornell University NetID and log in using your NetID credentials.
Click on your INFO 4940 course.
Click on the assignment, and you’ll be prompted to submit it.
Mark every page to be associated with exercise #1. There will be only one exercise listed.

Grading

Students can earn up to 1 percentage point towards their final grade. Evaluations are based on the nebulous Difficulty + Execution scoring system.

[Image: Olympian Simone Biles performing a round-off.] Inspired by Olympic gymnastics scoring methods.

Component	Points
Difficulty	5
Execution	5

The more challenging the work, the more points you will earn. Likewise, the higher the quality of your execution of the ML workflow, the more points you will earn. Partial credit may be awarded.