AE 21: Monitor models

Application exercise
Modified

December 3, 2024

Load the data

library(tidyverse)
library(vetiver)

housing_train <- read_csv("data/housing_train.csv")
housing_val <- read_csv("data/housing_val.csv")
housing_monitor <- read_csv("data/housing_monitor.csv")

housing <- bind_rows(
  housing_train |> mutate(monitor = "Training/testing"),
  housing_val |> mutate(monitor = "Training/testing"),
  housing_monitor |> mutate(monitor = "Monitoring")
) |>
  mutate(monitor = fct(monitor))

url <- "http://appliedml.infosci.cornell.edu:2300/predict"
endpoint <- vetiver_endpoint(url)

Monitor your model’s inputs

Your turn: Create a plot or table comparing the development vs. monitoring distributions of a model input/feature. How might you make this comparison if you didn’t have all the model development data available when monitoring? What summary statistics might you record during model development, to prepare for monitoring?

# add code here
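One possible sketch: overlay the development and monitoring distributions of a single numeric feature. The column name `area` below is an assumption; substitute whichever input your model actually uses. The second block records a handful of summary statistics (mean, SD, tail quantiles) that you could save at development time so the full training data need not be retained for later comparisons.

```r
# Sketch: compare one feature's distribution across development vs.
# monitoring data. `area` is an assumed column name -- swap in a real
# model input from the housing data.
housing |>
  ggplot(aes(x = area, color = monitor)) +
  geom_density() +
  labs(x = "Living area", color = NULL)

# Summary statistics worth recording during model development, so the
# same comparison is possible without the original training data:
housing |>
  group_by(monitor) |>
  summarize(across(
    area,
    list(mean = mean,
         sd = sd,
         p05 = \(x) quantile(x, 0.05),
         p95 = \(x) quantile(x, 0.95))
  ))
```

If you didn't keep the raw development data, comparing the monitoring data against these stored summaries (or stored histogram bin counts) is a lightweight alternative to the full density comparison.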

Monitor your model’s outputs

Your turn: Use the functions for metrics monitoring from {vetiver} to create a monitoring visualization. Choose a different set of metrics or time aggregation.

# add code here
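A sketch using {vetiver}'s metrics-monitoring helpers, `vetiver_compute_metrics()` and `vetiver_plot_metrics()`. It assumes the monitoring data has a `date` column and a `price` outcome, and that predictions come back from the deployed endpoint in a `.pred` column; adjust the column names, metric set, and `period` to your data and your choices.

```r
library(yardstick)

# Score the monitoring data with the deployed model. Assumes `predict()`
# on the endpoint returns a tibble with a `.pred` column.
housing_monitor_preds <- housing_monitor |>
  mutate(.pred = predict(endpoint, housing_monitor)$.pred)

# Compute metrics aggregated by month (try "week" or a different
# metric_set() for the "choose a different set" part of the prompt):
housing_metrics <- housing_monitor_preds |>
  vetiver_compute_metrics(
    date_var = date,
    period = "month",
    truth = price,
    estimate = .pred,
    metric_set = metric_set(rmse, rsq, mae)
  )

vetiver_plot_metrics(housing_metrics)
```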

Compute ML metrics

Your turn: Compute the mean absolute percentage error with the monitoring data, and aggregate by week/month, number of bedrooms/bathrooms, or town location.

If you have time, make a visualization showing your results.

# add code here
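One way to sketch this with {yardstick}: `mape()` respects `group_by()`, so the same pipeline handles any of the suggested aggregations. The `price` outcome and `beds` grouping column are assumptions; swap in a week/month variable or a town column as you prefer.

```r
library(yardstick)

# Sketch: mean absolute percentage error on the monitoring data,
# aggregated by number of bedrooms. `price` and `beds` are assumed
# column names; adjust to match the housing data.
housing_monitor |>
  mutate(.pred = predict(endpoint, housing_monitor)$.pred) |>
  group_by(beds) |>
  mape(truth = price, estimate = .pred) |>
  ggplot(aes(x = factor(beds), y = .estimate)) +
  geom_col() +
  labs(x = "Bedrooms", y = "MAPE (%)")
```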

Acknowledgments