AE 15: Deploying models to the cloud using Docker

Application exercise
R
Modified: October 23, 2025

Load the data

library(tidyverse)
library(pins)
library(vetiver)
library(googleCloudStorageR)

housing <- read_csv(file = "data/tompkins-home-sales.csv")
glimpse(housing)

Build a model

  • Log-transform the price variable
  • Split into training and test sets

library(tidymodels)
set.seed(123)

housing_split <- housing |>
  mutate(price = log10(price)) |>
  initial_split(prop = 0.8)

housing_train <- training(housing_split)
housing_test <- testing(housing_split)
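Because `price` is transformed with `log10()` before the split, anything the model predicts will also be on the log10 scale. The round trip can be sketched in base R as follows. The prices are made-up illustrative values, not rows from the data set, and the object names are hypothetical.

```r
# Hypothetical sale prices in dollars (illustrative values only)
price_dollars <- c(100000, 250000, 1000000)

# Forward transform, matching the mutate() call above
price_log10 <- log10(price_dollars)
price_log10

# Back-transform to dollars by raising 10 to the logged value
price_back <- 10^price_log10
all.equal(price_back, price_dollars)
```

The same `10^x` step would apply to predictions returned by the model or the API if you want to report them in dollars.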

Train a random forest model:

rf_rec <- recipe(
  price ~ beds + baths + area + year_built + town,
  data = housing_train
) |>
  step_impute_mean(all_numeric_predictors()) |>
  step_impute_mode(all_nominal_predictors())

housing_fit <- workflow() |>
  add_recipe(rf_rec) |>
  add_model(rand_forest(trees = 200, mode = "regression")) |>
  fit(data = housing_train)

Create a Docker container using a board

Pin model to a board

v <- vetiver_model(model = ______, model_name = ______)
v

board <- ______(versioned = TRUE)

board |>
  ______(v)

Create Docker artifacts

Choosing a port number

If you are using Posit Workbench: you are on a shared server where everyone builds and runs containers on the same machine, so the port on which your API is served must be unique. Choose a random four-digit number as your port number and use it for the rest of the application exercise.

If you are working on your own computer: use port 8080 for the rest of the application exercise.
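For those on the shared server, one way to pick a port is to draw it at random in R. This is only a sketch: the name `port` is hypothetical, and a random draw does not guarantee that nobody else is using the same number.

```r
# Draw one four-digit port number at random
port <- sample(1000:9999, size = 1)
port
```

Run this before any call to `set.seed()`, or in a fresh R session. Otherwise everyone who sets the same seed draws the same number. Write the result down and reuse it, rather than re-running the code at each step.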

vetiver_prepare_docker(
  ______,
  ______,
  docker_args = list(port = ____)
)

Build and test Docker container

If you are using Posit Workbench, run these commands in the Terminal tab of Positron, replacing <NETID> with your actual NetID and <PORT> with your chosen port number:

docker build -t housing-<NETID> .
docker run -p <PORT>:<PORT> housing-<NETID>

Use your own NetID

Students using Posit Workbench are on a shared server where everyone builds and runs containers on the same machine. Give your container a unique name to avoid conflicts with other users.

If you are working on your own computer, run these commands in the Terminal tab of Positron or your local terminal:

docker build -t housing .
docker run -p 8080:8080 housing

Test the API

endpoint <- vetiver_endpoint(url = ______)
predict(endpoint, housing_test)
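The endpoint URL combines the host, the port published by `docker run`, and the path of the prediction route. Here is a minimal sketch of assembling such a string in base R. It assumes the container runs on the same machine as your R session and serves predictions at `/predict`. The names `port` and `api_url` are hypothetical.

```r
# Hypothetical port; substitute the one chosen earlier in the exercise
port <- 8080

# Assumes the API is reachable on this machine at the /predict route
api_url <- sprintf("http://127.0.0.1:%d/predict", port)
api_url
```

If the request fails, first check that the container is still running and that the port matches the one passed to `docker run -p`.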

Compute model metrics and store in pin

housing_test_metrics <- augment(housing_fit, housing_test) |>
  metrics(truth = price, estimate = .pred)

v <- vetiver_model(
  model = housing_fit,
  model_name = "tompkins-housing",
  metadata = ______
)
v

board |> vetiver_pin_write(v)

Retrieve model metrics

extracted_metrics <- board |>
  pin_meta("tompkins-housing") |>
  pluck(______, ______) |>
  as_tibble()

extracted_metrics
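The retrieval step walks down a nested list by name and converts what it finds into a table. The base R sketch below shows the same idea on a made-up list. The element names `outer` and `inner` and all numbers are hypothetical; they are not the names pins uses and not real results.

```r
# Hypothetical nested list standing in for pin metadata
meta <- list(
  title = "an example pin",
  outer = list(
    inner = list(
      .metric = c("rmse", "rsq"),
      .estimate = c(0.2, 0.6)  # made-up numbers
    )
  )
)

# Walk down the list one name at a time;
# purrr::pluck(meta, "outer", "inner") expresses the same lookup
stored <- meta[["outer"]][["inner"]]

# Convert the stored list back into a rectangular object
stored_df <- as.data.frame(stored)
stored_df
```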

What else might you want to store as model metadata? How or when might you use model metadata?

Add response here.

Acknowledgments