Versioning and deploying models

Lecture 20

Dr. Benjamin Soltoff

Cornell University
INFO 4940/5940 - Fall 2024

November 14, 2024



Learning objectives

  • Create bundled model objects that can be saved to disk
  • Implement versioning for model objects
  • Review application programming interfaces
  • Generate a REST API for a model using {vetiver} and {plumber}

MLOps with {vetiver}

Vetiver, the oil of tranquility, is used as a stabilizing ingredient in perfumery to preserve more volatile fragrances.

If you develop a model…

you can operationalize that model!

If you develop a model…

you likely should be the one to operationalize that model!

Tompkins County housing data

  • Home sale prices for Tompkins County, NY, from 2022 to 2023
  • Can certain measurements be used to predict the sale price?
  • Data collected from Redfin

  • N = 1,270
  • A numeric outcome, price
  • Other variables to use for prediction:
    • beds, baths, area, and year_built are numeric predictors
    • town and municipality could be nominal predictors
    • sold_date could be a date predictor

Home prices in Tompkins County, NY

sold_date    price  beds  baths  area   lot_size  year_built  hoa_month  town     municipality              long       lat
2022-06-03   33000     2    1.0   980         NA        1979         NA  Ithaca   Ithaca city          -76.51334  42.43245
2022-08-28  270000     3    2.0  1420  0.6030073        1955         NA  Ithaca   Unincorporated       -76.45334  42.41719
2022-08-24  500000     3    1.5  2742  0.1399908        1900         NA  Ulysses  Trumansburg village  -76.65999  42.54152
2022-09-20  400000     5    2.0  2066  0.3856749        1965         NA  Ithaca   Unincorporated       -76.46803  42.47407
2022-08-08  469000     3    3.0  3015  0.7100092        1932         NA  Ithaca   Unincorporated       -76.49558  42.45912
2023-04-19  205000     3    2.0  1200  0.8000000        1996         NA  Ulysses  Unincorporated       -76.66433  42.49964
2023-03-22  350000     5    3.0  3080  2.5000000        1830         NA  Enfield  Unincorporated       -76.59211  42.42251
2023-06-16  499000     3    2.5  2008  1.2000000        1935         NA  Ithaca   Unincorporated       -76.46770  42.46063
2023-07-27  390000     4    3.0  2513  0.5699954        1987         NA  Ithaca   Unincorporated       -76.48377  42.42092
2022-12-19  375000     3    2.5  1976  1.0000000        2004         NA  Dryden   Unincorporated       -76.41183  42.43847

Time for building a model!

Spend your data budget


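library(tidymodels)

# `housing` is assumed to be loaded already (the Tompkins County sales data)
housing_split <- housing |>
  mutate(price = log10(price)) |>  # model price on the log10 scale
  initial_split(prop = 0.8)        # 80% training, 20% testing

housing_train <- training(housing_split)
housing_test <- testing(housing_split)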

Fit a linear regression model 🚀

Or your model of choice!

housing_fit <-
  workflow(
    price ~ beds + baths + area + year_built,
    linear_reg()
  ) |>
  fit(data = housing_train)

⏱️ Your turn


Split your data into training and testing sets.

Fit a model to your training data.


Create a deployable bundle

Deploy preprocessors and models together

Create a deployable model object

v <- vetiver_model(housing_fit, "tompkins-housing")

── tompkins-housing ─ <bundled_workflow> model for deployment 
A lm regression modeling workflow using 4 features

{vetiver} butchers your model object (strips out components that are not needed for prediction) and bundles it with the information needed for publishing.
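A custom description can be passed through the description argument of vetiver_model(). The sketch below is only an illustration: it uses a base R lm() fit on the built-in mtcars data as a stand-in for the housing workflow, and the model name and description text are made up for the example.

```r
library(vetiver)

# Stand-in model: mtcars ships with R, so this runs without the housing data
cars_fit <- lm(mpg ~ wt + hp, data = mtcars)

# Illustrative name and description (not from the lecture)
v_cars <- vetiver_model(
  cars_fit,
  "cars-example",
  description = "Linear model of fuel economy from weight and horsepower"
)

v_cars$model_name
v_cars$description
```

Leaving description unset lets {vetiver} generate a default one from the model type.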

⏱️ Your turn


Create your {vetiver} model object.

Check out the default description that is created, and try out using a custom description.

Show your custom description to your neighbor.


Version your model

How could you share your resources?

Data, models, R objects, etc.

❌ Email
❌ GitHub

🫤 Shared network drive
🫤 Dropbox, Google Drive, etc.

✅ Amazon S3
✅ Azure
✅ Google Cloud
✅ Microsoft 365

{pins} 📌

The {pins} package publishes data, models, and other R objects, making it easy to share them across projects and with your colleagues.

You can pin objects to a variety of pin boards, including:

  • a local folder (like a network drive or even a temporary directory)
  • Amazon S3
  • Azure Storage
  • Google Cloud

Pin your model


board <- board_temp()
board |> vetiver_pin_write(v)
Creating new version '20241114T182506Z-02bc5'
Writing to pin 'tompkins-housing'

Create a Model Card for your published model
• Model Cards provide a framework for transparent, responsible reporting
• Use the vetiver `.Rmd` template as a place to start

⏱️ Your turn


Pin your {vetiver} model object to a temporary board.

Retrieve the model metadata with pin_meta().
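One possible shape for this exercise, sketched with a stand-in lm() model on mtcars so that it runs without the housing data (the pin name "cars-example" is an assumption made for the example):

```r
library(vetiver)
library(pins)

# Stand-in model and pin name, for illustration only
cars_fit <- lm(mpg ~ wt + hp, data = mtcars)
v_cars <- vetiver_model(cars_fit, "cars-example")

# A temporary board is deleted when the R session ends
board <- board_temp()
vetiver_pin_write(board, v_cars)

# pin_meta() returns a list describing the stored pin
meta <- pin_meta(board, "cars-example")
meta$type
meta$description
meta$created
```

The description stored with the pin is the one carried by the {vetiver} model object.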


Version your model

Fit a random forest

rf_rec <- recipe(price ~ beds + baths + area + year_built + town, data = housing_train) |>
  step_impute_mean(all_numeric_predictors())

housing_fit <- workflow() |>
  add_recipe(rf_rec) |>
  add_model(rand_forest(trees = 200, mode = "regression")) |>
  fit(data = housing_train)

Version your model


board <- board_temp()
v <- vetiver_model(housing_fit, "tompkins-housing", versioned = TRUE)
board |> vetiver_pin_write(v)

⏱️ Your turn


Create a new {vetiver} model object from your linear regression model, explicitly setting versioned = TRUE, and pin it to your temporary board.

Then train a random forest model and create a new {vetiver} model object with the same name, also with versioned = TRUE.

Write this new version of your model to the same pin, and see what versions you have with pin_versions().
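A rough sketch of the versioning workflow, again using stand-in models: two different lm() fits on mtcars take the place of the linear regression and random forest, and "cars-example" is an assumed pin name.

```r
library(vetiver)
library(pins)

board <- board_temp()

# First version: a smaller stand-in model
fit_1 <- lm(mpg ~ wt, data = mtcars)
v_1 <- vetiver_model(fit_1, "cars-example", versioned = TRUE)
vetiver_pin_write(board, v_1)

# Second version: a different model written under the same pin name
fit_2 <- lm(mpg ~ wt + hp, data = mtcars)
v_2 <- vetiver_model(fit_2, "cars-example", versioned = TRUE)
vetiver_pin_write(board, v_2)

# One row per stored version of the pin
versions <- pin_versions(board, "cars-example")
versions
```

Because both objects share a name and set versioned = TRUE, the second write adds a version instead of overwriting the first.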


Make it easy to do the right thing

  • Robust and human-friendly checking of new data
  • Track and document software dependencies of models
  • Model cards for transparent, responsible reporting

⏱️ Your turn


Open the Model Card template, and spend a few minutes exploring how you might create a Model Card for this housing model.

Discuss something you notice about the Model Card with your neighbor.


You can deploy your model as a…


Application programming interface (API)

An interface that can connect applications in a standard way

  • Representational State Transfer (REST)
  • Uniform Resource Locator (URL)

RESTful queries

  1. Submit request to server via URL
  2. Return result in a structured format
  3. Parse results into a local format
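The three steps above can be sketched with {httr2}. The helper query_api() is hypothetical (it is not part of any package), and the local address is an assumed placeholder, not a real deployment.

```r
library(httr2)

# Hypothetical helper: send new data to a JSON API and return the parsed result
query_api <- function(url, new_data) {
  request(url) |>                           # target URL
    req_body_json(new_data) |>              # attach data as JSON (makes it a POST)
    req_perform() |>                        # steps 1-2: submit, receive the response
    resp_body_json(simplifyVector = TRUE)   # step 3: parse JSON into R objects
}

# Building a request does not contact the server, so this part runs offline.
# The address is an assumed local API.
req <- request("http://127.0.0.1:8080") |>
  req_url_path_append("predict")
req$url
```

Calling query_api() only works once an API is actually listening at the URL you pass in.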

Create a {vetiver} REST API


library(plumber)

pr() |>
  vetiver_api(v) |>
  pr_run()

⏱️ Your turn


Create a {vetiver} API for your model and run it locally.

Explore the visual documentation.

How many endpoints are there?

Discuss what you notice with your neighbor.


What does “deploy” mean?

Where can {vetiver} deploy?

  • Posit Connect
  • AWS SageMaker
  • A public or private cloud, using Docker
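For the Docker route, {vetiver} can generate the files a container needs. The sketch below shows only the first of them, a plumber script written by vetiver_write_plumber(). It assumes a stand-in lm() model on mtcars and a folder board in a temporary directory instead of a cloud board.

```r
library(vetiver)
library(pins)

# A folder board in a temporary directory, standing in for S3, Azure, etc.
board <- board_folder(tempfile("board-"), versioned = TRUE)

# Stand-in model and pin name, for illustration only
cars_fit <- lm(mpg ~ wt + hp, data = mtcars)
vetiver_pin_write(board, vetiver_model(cars_fit, "cars-example"))

# Write a plumber script that reads the pinned model and serves it
plumber_file <- tempfile(fileext = ".R")
vetiver_write_plumber(board, "cars-example", file = plumber_file)

plumber_lines <- readLines(plumber_file)
writeLines(plumber_lines)
```

vetiver_prepare_docker() builds on this step by also writing a Dockerfile and an renv lockfile alongside the plumber script.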

How do you make a request of your new API?

library(httr2)

url <- ""  # URL of your running API goes here

request(url) |>
  req_perform() |>
  resp_body_json()

How do you make a request of your new API?

  • R packages like {httr2}
  • curl
  • There is special support in {vetiver} for the /predict endpoint

Any tool that can make an HTTP request can be used to interact with your model API!

Create a {vetiver} endpoint

You can treat your model API much like it is a local model in memory!


url <- ""  # URL of your API's /predict endpoint goes here
endpoint <- vetiver_endpoint(url)
predict(endpoint, slice_sample(housing_test, n = 5))

⏱️ Your turn


Create a {vetiver} endpoint object for your API.

Predict with your endpoint for new data.

Optional: call another endpoint like /ping or /metadata.




  • ML models can be deployed as APIs
  • Use {pins} to share your models
  • {vetiver} can help you bundle, version, and deploy your models
