AE 19: Version your housing model

Application exercise
Modified: November 14, 2024

Load the data

library(tidyverse)

housing <- read_csv(file = "data/tompkins-home-sales-geocoded.csv")
glimpse(housing)

Build a model

  • Log-transform the price variable (base 10)
  • Split the data into training and test sets

library(tidymodels)

set.seed(123)
housing_split <- housing |>
  mutate(price = log10(price)) |>
  initial_split(prop = 0.8)

housing_train <- training(housing_split)
housing_test <- testing(housing_split)
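Because the transform is applied before the split, price in both sets is on the log10 scale, and so is any prediction from a model trained on them. A minimal sketch of converting back to dollars, using made-up prices rather than the housing data (all object names here are hypothetical):

```r
# Hypothetical sale prices in dollars (illustrative values, not from the data)
price_dollars <- c(100000, 250000, 1000000)

# Same transform as in the split above
price_log10 <- log10(price_dollars)

# Undo the transform: raise 10 to the logged value
price_back <- 10^price_log10

# Compare the round trip with the original values, up to floating point error
all.equal(price_back, price_dollars)
```

The same `10^` step would apply to the predictions you get back from the model later in this exercise.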

Train a linear regression model:

housing_fit <- workflow() |>
  add_formula(price ~ ______) |>
  add_model(linear_reg()) |>
  fit(data = housing_train)

Create a deployable model object

library(vetiver)
v <- vetiver_model(
  model = ______,
  model_name = ______
)
v
# create a vetiver model with a custom description
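As a rough illustration of the custom description, the sketch below wraps a toy `lm()` fit on the built-in `mtcars` data instead of the housing workflow. The model, its name, and the description text are assumptions made up for the example; `description` is an argument of `vetiver_model()`.

```r
library(vetiver)

# Toy model on a built-in dataset; stands in for the housing workflow
toy_fit <- lm(mpg ~ wt + hp, data = mtcars)

v_toy <- vetiver_model(
  model = toy_fit,
  model_name = "toy-mtcars-lm", # hypothetical name
  description = "Linear model of fuel economy on weight and horsepower"
)

# The description is stored on the vetiver model object
v_toy$description
```

If `description` is left out, vetiver generates a default one from the model type.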

Pin your model

library(pins)

board <- ______
board |> ______(v)
# retrieve your model metadata
board |> pin_meta(______)
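One way to try the pin-and-retrieve cycle without touching a real board is a temporary board. This is a sketch under assumptions: `board_temp()` is used only for the demo (it is deleted when the R session ends), and the toy model and its name are hypothetical stand-ins for your housing model.

```r
library(pins)
library(vetiver)

# Temporary board for the demo; choose a persistent board for real work
toy_board <- board_temp(versioned = TRUE)

# Toy model standing in for the housing workflow
v_toy <- vetiver_model(lm(mpg ~ wt, data = mtcars), model_name = "toy-mtcars-lm")
toy_board |> vetiver_pin_write(v_toy)

# Metadata is looked up by pin name, which is the model_name given above
toy_meta <- toy_board |> pin_meta("toy-mtcars-lm")
toy_meta$description
```

The pin is stored under the `model_name`, so that is the string to pass to `pin_meta()`.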

Store a new version

Train a new version of your model, for example with a different set of predictors or a different algorithm:

housing_fit <- workflow() |>
  add_formula(price ~ ______) |>
  add_model(linear_reg()) |>
  fit(data = housing_train)

Store this new model as a new version of the same pin:

v <- vetiver_model(model = ______, model_name = ______, versioned = TRUE)
board |> ______(v)

What versions do you have?

board |> pin_versions(______)
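To see how versions accumulate, the sketch below writes two different toy models to the same pin on a temporary board and then lists the versions. The board, models, and pin name are assumptions for illustration only.

```r
library(pins)
library(vetiver)

# Temporary, versioned board for the demo
toy_board <- board_temp(versioned = TRUE)

# First version of the pin
v1 <- vetiver_model(lm(mpg ~ wt, data = mtcars), "toy-mtcars-lm", versioned = TRUE)
toy_board |> vetiver_pin_write(v1)

# Second version: same pin name, different predictors
v2 <- vetiver_model(lm(mpg ~ wt + hp, data = mtcars), "toy-mtcars-lm", versioned = TRUE)
toy_board |> vetiver_pin_write(v2)

# One row per stored version
toy_versions <- toy_board |> pin_versions("toy-mtcars-lm")
toy_versions
```

Reusing the same `model_name` is what makes the second write a new version rather than a new pin.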

Create a new {vetiver} model

Fit a random forest model:

rf_rec <- recipe(price ~ beds + baths + area + year_built + town, data = housing_train) |>
  step_impute_mean(all_numeric_predictors()) |>
  step_impute_mode(all_nominal_predictors())

housing_fit <- workflow() |>
  add_recipe(rf_rec) |>
  add_model(rand_forest(trees = 200, mode = "regression")) |>
  fit(data = housing_train)

Store your model:

v <- vetiver_model(housing_fit, model_name = "random forest", versioned = TRUE)
board |> vetiver_pin_write(v)

Model Card

Open the Model Card template in RStudio by choosing “File” ➡️ “New File” ➡️ “R Markdown” ➡️ “From Template” ➡️ “Vetiver Model Card”.

Create a vetiver REST API

library(plumber)

pr() |>
  ______(v) |>
  pr_run()

Call your new API endpoints

Run your API in the background

Create a new plumber API R script:

vetiver_write_plumber(board = ______, name = ______)

Then run plumber-run.R as a background job. This will allow you to run the API locally and still access the console.

Return predictions from your model API:

url <- ______
endpoint <- ______(url)
predict(endpoint, slice_sample(housing_test, n = 10))

Optional: try /metadata or /ping here:

library(httr2)

url <- ______

request(url) |>
  req_perform() |>
  resp_body_json()
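If it helps to see how the endpoint URL is assembled, here is a sketch that builds a request for /ping without sending it. The host and port are assumptions; use whatever address your background job reports.

```r
library(httr2)

# Assumed base URL; replace host and port with the ones your API prints
base_url <- "http://127.0.0.1:8080"

# Build (but do not send) a request for the /ping endpoint
ping_req <- request(base_url) |>
  req_url_path_append("ping")

# Inspect the URL the request would be sent to
ping_req$url
```

Once the API is running, piping `ping_req` into `req_perform()` and `resp_body_json()` sends the request and parses the JSON response, as in the template above.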

Acknowledgments