AE 15: Deploying models to the cloud using Docker

Application exercise

Modified

October 23, 2025

Load the data

library(tidyverse)
library(pins)
library(vetiver)
library(googleCloudStorageR)

housing <- read_csv(file = "data/tompkins-home-sales.csv")
glimpse(housing)

Build a model

Log-transform the price variable (base 10)
Split the data into training (80%) and test (20%) sets

library(tidymodels)
set.seed(123)

housing_split <- housing |>
  mutate(price = log10(price)) |>
  initial_split(prop = 0.8)

housing_train <- training(housing_split)
housing_test <- testing(housing_split)
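
Because the model is trained on log10(price), anything it predicts is also on the log10 scale. A minimal sketch of converting such values back to dollars, using made-up prices rather than values from the data set:

```r
# Hypothetical sale prices in dollars (illustrative values, not from the data set)
price_dollars <- c(150000, 275000, 420000)

# Same transformation the exercise applies to the price column
log_price <- log10(price_dollars)

# Raising 10 to the log10 value undoes the transformation
back_transformed <- 10^log_price

# Compare with a tolerance, since floating-point round trips are not exact
all.equal(back_transformed, price_dollars)
```

The same `10^x` step would apply to a column of predictions if you want to report errors in dollars rather than in log10 units.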

Train a random forest model:

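# Recipe: predict (log10) price from home characteristics and impute missing values
rf_rec <- recipe(
  price ~ beds + baths + area + year_built + town,
  data = housing_train
) |>
  step_impute_mean(all_numeric_predictors()) |>
  step_impute_mode(all_nominal_predictors())

# Workflow: bundle the recipe with a 200-tree random forest and fit it on the training set
housing_fit <- workflow() |>
  add_recipe(rf_rec) |>
  add_model(rand_forest(trees = 200, mode = "regression")) |>
  fit(data = housing_train)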

Create a Docker container using a board

Pin model to a board

v <- vetiver_model(model = ______, model_name = ______)
v

board <- ______(versioned = TRUE)

board |>
  ______(v)
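
To see what `versioned = TRUE` does, here is a small sketch that uses only {pins} and a built-in data set. It assumes a temporary board from `board_temp()` as a stand-in, which is not necessarily the board this exercise intends, and the pin name `"toy-data"` is made up for illustration.

```r
library(pins)

# Temporary board used purely for illustration; it is removed when the R session ends
toy_board <- board_temp(versioned = TRUE)

# Write two different objects under the same (hypothetical) pin name
pin_write(toy_board, head(mtcars, 5), name = "toy-data", type = "rds")
Sys.sleep(1) # pause so the two versions get distinct timestamps
pin_write(toy_board, head(mtcars, 10), name = "toy-data", type = "rds")

# A versioned board keeps every write instead of overwriting the previous one
toy_versions <- pin_versions(toy_board, "toy-data")
toy_versions

# Reading without a version argument returns the most recent write
latest <- pin_read(toy_board, "toy-data")
```

A vetiver model pinned to a versioned board behaves the same way: each write adds a version, so an earlier model can still be retrieved later.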

Create Docker artifacts

Choosing a port number

Students using Posit Workbench are on a shared server where everyone is building and running containers from the same device. We need to ensure the port number where you are broadcasting the API is unique. Choose a random four-digit number to use as your port number and use it for the rest of the application exercise.

If you are working on your own machine rather than Posit Workbench, use port 8080 for the rest of the application exercise.
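
For the shared-server case, one possible way to pick a port is to let R draw it. This is only a sketch: the lower bound of 1024 is an assumption made here to stay above the range usually reserved for system services, and nothing guarantees that the drawn port is not already in use.

```r
# Draw one candidate port from the four-digit range above the privileged ports (< 1024)
port <- sample(1024:9999, size = 1)
port
```

Note that the exercise calls `set.seed(123)` earlier. If everyone runs the same code in the same order after that seed, the draw may be identical for every student, so run this in a fresh R session or simply choose a number by hand.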

vetiver_prepare_docker(
  ______,
  ______,
  docker_args = list(port = ____)
)
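
`vetiver_prepare_docker()` writes its output into the working directory. Assuming the default file names used by {vetiver} (`Dockerfile`, `plumber.R`, and `vetiver_renv.lock`; check your own project if these differ), a quick sketch for confirming the files exist before building the image:

```r
# Default artifact names assumed here; adjust if your project uses different paths
expected_files <- c("Dockerfile", "plumber.R", "vetiver_renv.lock")

# Named logical vector: TRUE where the file is present in the working directory
artifacts <- file.exists(expected_files)
names(artifacts) <- expected_files
artifacts
```

If any entry is `FALSE`, the `docker build` step is likely to fail, so it is worth rerunning the preparation step first.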

Build and test Docker container

Run these commands in the Terminal tab of Positron or your local terminal, replacing <NETID> with your actual NetID and <PORT> with your chosen port number:

docker build -t housing-<NETID> .
docker run -p <PORT>:<PORT> housing-<NETID>

Use your own NetID

Students using Posit Workbench are on a shared server where everyone is building and running containers from the same device. You need to ensure your container has a unique name to avoid conflicts with other users.

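If you are working on your own machine, where names and ports cannot collide with other students, run these commands in the Terminal tab of Positron or your local terminal instead: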

docker build -t housing .
docker run -p 8080:8080 housing

Test the API

endpoint <- vetiver_endpoint(url = ______)
predict(endpoint, housing_test)

Compute model metrics and store in pin

housing_test_metrics <- augment(housing_fit, housing_test) |>
  metrics(truth = price, estimate = .pred)

v <- vetiver_model(
  model = housing_fit,
  model_name = "tompkins-housing",
  metadata = ______
)
v

board |> vetiver_pin_write(v)
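
To get a feel for the object produced by `metrics()`, here is a small sketch on made-up values. The numbers are invented log10 prices, not results from the housing model; only the shape of the output matters.

```r
library(yardstick)
library(tibble)

# Hypothetical observed and predicted log10 prices (illustrative values only)
toy <- tibble(
  price = c(5.1, 5.4, 5.6, 5.9),
  .pred = c(5.0, 5.5, 5.6, 6.0)
)

# For a numeric outcome, metrics() reports RMSE, R-squared, and MAE,
# one row per metric, in the columns .metric, .estimator, and .estimate
toy_metrics <- metrics(toy, truth = price, estimate = .pred)
toy_metrics
```

Because the outcome is log-transformed, the RMSE and MAE here are in log10 units rather than dollars.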

Retrieve model metrics

extracted_metrics <- board |>
  pin_meta("tompkins-housing") |>
  pluck(______, ______) |>
  as_tibble()

extracted_metrics
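
`pluck()` extracts an element from a nested list by walking down one level per argument. The sketch below uses an invented list as a stand-in for pin metadata; the field names `extra` and `scores` are hypothetical and are not the names needed in the exercise.

```r
library(purrr)
library(tibble)

# Hypothetical nested list standing in for pin metadata; the field names are invented
toy_meta <- list(
  title = "toy pin",
  extra = list(
    scores = list(.metric = c("rmse", "mae"), .estimate = c(0.12, 0.09))
  )
)

# pluck() walks the list one level per argument: toy_meta[["extra"]][["scores"]]
toy_scores <- toy_meta |>
  pluck("extra", "scores") |>
  as_tibble()

toy_scores
```

The final `as_tibble()` matters because pin metadata is stored as plain text, so a tibble saved there typically comes back as a list of columns rather than as a tibble.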

What else might you want to store as model metadata? How or when might you use model metadata?

Add response here.

Acknowledgments

Materials derived in part from Intro to MLOps with {vetiver} and licensed under a Creative Commons Attribution 4.0 International (CC BY) License.