# AE 21: Monitor models

Application exercise

## Load the data
```r
library(tidyverse)
library(vetiver)

# Read the development and monitoring splits
housing_train <- read_csv("data/housing_train.csv")
housing_val <- read_csv("data/housing_val.csv")
housing_monitor <- read_csv("data/housing_monitor.csv")

# Stack all splits, labeling each row by its role
housing <- bind_rows(
  housing_train |> mutate(monitor = "Training/testing"),
  housing_val |> mutate(monitor = "Training/testing"),
  housing_monitor |> mutate(monitor = "Monitoring")
) |>
  mutate(monitor = fct(monitor))

# Connect to the deployed model's prediction endpoint
url <- "http://appliedml.infosci.cornell.edu:2300/predict"
endpoint <- vetiver_endpoint(url)
```
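The endpoint object works much like a fitted model: {vetiver} supplies a `predict()` method for `vetiver_endpoint` objects that posts new data to the API and returns its predictions. A minimal check, assuming the service is up and accepts the columns in the monitoring data (you may need to select just the predictor columns):

```r
# Send a few monitoring rows to the deployed model; the endpoint must be
# reachable, and may require only the predictors the model was trained on
predict(endpoint, slice_head(housing_monitor, n = 5))
```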
## Monitor your model’s inputs
**Your turn:** Create a plot or table comparing the development vs. monitoring distributions of a model input/feature. How might you make this comparison if you didn’t have all the model development data available when monitoring? What summary statistics might you record during model development, to prepare for monitoring?
```r
# add code here
```
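One possible sketch with {ggplot2}: overlay the development and monitoring distributions of a single input. The column name `bedrooms` here is an assumption; substitute any numeric feature that actually appears in the housing data.

```r
# Overlay development vs. monitoring distributions of one input;
# `bedrooms` is an assumed column name -- swap in a real feature
housing |>
  ggplot(aes(x = bedrooms, fill = monitor)) +
  geom_histogram(aes(y = after_stat(density)),
                 position = "identity", alpha = 0.5, binwidth = 1) +
  labs(x = "Bedrooms", y = "Density", fill = NULL)
```

If the full development data weren’t available at monitoring time, comparing against stored summaries (means, standard deviations, quantiles, or histogram bin counts recorded during development) would serve the same purpose.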
## Monitor your model’s outputs
**Your turn:** Use the functions for metrics monitoring from {vetiver} to create a monitoring visualization. Choose a different set of metrics or time aggregation.
```r
# add code here
```
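A sketch using `vetiver_compute_metrics()` and `vetiver_plot_metrics()` from {vetiver}, which aggregate {yardstick} metrics over a date variable and plot them over time. The column names `date` (sale date) and `price` (outcome) are assumptions, as is the `.pred` column returned by the endpoint; adjust them to match the housing data.

```r
library(yardstick)

# Score the monitoring data with the deployed model; `.pred` is the
# (assumed) name of the prediction column the endpoint returns
housing_preds <- housing_monitor |>
  mutate(.pred = predict(endpoint, housing_monitor)$.pred)

# Aggregate RMSE and MAE by week; `date` and `price` are assumed names
housing_metrics <- housing_preds |>
  arrange(date) |>
  vetiver_compute_metrics(
    date_var = date,
    period = "week",
    truth = price,
    estimate = .pred,
    metric_set = metric_set(rmse, mae)
  )

vetiver_plot_metrics(housing_metrics)
```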
## Compute ML metrics
**Your turn:** Compute the mean absolute percentage error with the monitoring data, and aggregate by week/month, number of bedrooms/bathrooms, or town location.

If you have time, make a visualization showing your results.
```r
# add code here
```
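A sketch with {yardstick}’s `mape()`, reusing the scored `housing_preds` data and the assumed `date`, `price`, and `.pred` names from above; grouping by an assumed `bedrooms` or town column would work the same way.

```r
library(lubridate)

# MAPE by month; yardstick metric functions respect dplyr groups
monthly_mape <- housing_preds |>
  mutate(month = floor_date(date, "month")) |>
  group_by(month) |>
  mape(truth = price, estimate = .pred)

# Plot how the error moves over time
monthly_mape |>
  ggplot(aes(x = month, y = .estimate)) +
  geom_line() +
  geom_point() +
  labs(x = NULL, y = "MAPE (%)")
```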
## Acknowledgments
- Materials derived in part from *Intro to MLOps with {vetiver}* and licensed under a Creative Commons Attribution 4.0 International (CC BY) License.