AE 05: Predicting hotel price (with numeric engineering!)

Application exercise

September 17, 2024


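# tidymodels attaches rsample, recipes, yardstick, dplyr, and the modeldata
# package that ships the hotel_rates data
library(tidymodels)

reg_metrics <- metric_set(mae, rmse, rsq)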

hotel_rates <- hotel_rates |>
  sample_n(5000) |>
  arrange(arrival_date) |>
  select(-arrival_date) |>
  # rebuild the factors so only levels present in the sample remain
  mutate(
    company = factor(as.character(company)),
    country = factor(as.character(country)),
    agent = factor(as.character(agent))
  )

hotel_split <- initial_split(hotel_rates, strata = avg_price_per_room)

hotel_train <- training(hotel_split)
hotel_test <- testing(hotel_split)
hotel_folds <- vfold_cv(hotel_train, strata = avg_price_per_room)

Adjust for skewness

Your turn: Examine hotel_train and identify a numeric predictor that is skewed. Incorporate an appropriate transformation into the recipe below and estimate a linear regression model using 10-fold cross-validation. How does the model perform with and without the transformation?

hotel_rec <- recipe(avg_price_per_room ~ ., data = hotel_train) |>
  # add the transformation step for the skewed predictor here
  step_dummy(all_nominal_predictors())

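Use GGally::ggpairs() to generate bivariate comparisons of the variables in hotel_train. Factors with many levels, such as country, exceed the function's default cardinality limit of 15, so restrict the plot to numeric columns or a small set of variables.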
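A minimal sketch of such a comparison, assuming hotel_train exists from the setup code and that the GGally package is installed. The three columns are an illustrative selection taken from names used elsewhere in this exercise, and pairs_plot is a hypothetical object name. The density plots on the diagonal are one place to look for skewness.

```r
library(tidymodels)
library(GGally)

# Assumes hotel_train was created by the setup code above.
# Restricting the plot to three numeric columns is an illustrative choice.
pairs_plot <- hotel_train |>
  select(avg_price_per_room, lead_time, historical_adr) |>
  ggpairs()

pairs_plot
```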

# add code here
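
One possible approach to the skewness exercise, offered as a sketch rather than a model answer. It assumes hotel_train, hotel_folds, and reg_metrics exist from the setup code. Treating lead_time as the skewed predictor, using a Yeo-Johnson transformation, and adding step_zv() to drop constant dummy columns are all assumptions; rec_base, rec_yj, and fit_cv are hypothetical names.

```r
library(tidymodels)

# Assumes hotel_train, hotel_folds, and reg_metrics from the setup code above.
# lead_time as the skewed predictor is an assumption; check it yourself first.

# Baseline recipe without a transformation
rec_base <- recipe(avg_price_per_room ~ ., data = hotel_train) |>
  step_dummy(all_nominal_predictors()) |>
  step_zv(all_predictors())

# Same recipe with a Yeo-Johnson transformation applied before dummy coding
rec_yj <- recipe(avg_price_per_room ~ ., data = hotel_train) |>
  step_YeoJohnson(lead_time) |>
  step_dummy(all_nominal_predictors()) |>
  step_zv(all_predictors())

# Fit a linear regression with a given recipe on the shared folds
fit_cv <- function(rec) {
  workflow(rec, linear_reg()) |>
    fit_resamples(resamples = hotel_folds, metrics = reg_metrics)
}

res_base <- fit_cv(rec_base)
res_yj <- fit_cv(rec_yj)

# Stack the resampled metrics so the two recipes can be compared row by row
bind_rows(
  none = collect_metrics(res_base),
  yeo_johnson = collect_metrics(res_yj),
  .id = "transformation"
)
```

Because both recipes are evaluated on the same folds, differences in MAE, RMSE, and R-squared reflect the transformation rather than the resampling.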

Spline functions

Your turn: Implement a natural spline for lead_time and historical_adr. Use grid tuning to determine the optimal value for deg_free. Evaluate the model’s performance.

# add code here
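
A sketch of how the spline exercise could be set up, assuming hotel_train, hotel_folds, and reg_metrics exist from the setup code. It uses step_ns() from recipes as one way to build natural splines. The candidate values for deg_free, the step_zv() step, and the object names are illustrative assumptions. A single tune() placeholder means both predictors share the same degrees of freedom.

```r
library(tidymodels)

# Assumes hotel_train, hotel_folds, and reg_metrics from the setup code above.

# Natural splines for both predictors, with deg_free left open for tuning
spline_rec <- recipe(avg_price_per_room ~ ., data = hotel_train) |>
  step_ns(lead_time, historical_adr, deg_free = tune()) |>
  step_dummy(all_nominal_predictors()) |>
  step_zv(all_predictors())

spline_wf <- workflow(spline_rec, linear_reg())

# Candidate values for deg_free; this range is an arbitrary assumption
spline_grid <- tibble(deg_free = seq(2, 10, by = 2))

spline_res <- tune_grid(
  spline_wf,
  resamples = hotel_folds,
  grid = spline_grid,
  metrics = reg_metrics
)

# Plot performance against deg_free, then list the best candidates by RMSE
autoplot(spline_res)
show_best(spline_res, metric = "rmse")
```

To tune the two predictors separately, give each its own step_ns() call and a distinct id inside tune(), for example tune("lead_df").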

MARS model

Your turn: Implement a MARS model. Use grid tuning to determine the optimal value for num_terms and prod_degree. Evaluate the model’s performance.

# add code here
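
A sketch of one way to tune a MARS model, assuming hotel_train, hotel_folds, and reg_metrics exist from the setup code and that the earth package is installed. The grid values, the step_zv() step, and the object names are illustrative assumptions rather than recommended settings.

```r
library(tidymodels)

# Assumes hotel_train, hotel_folds, and reg_metrics from the setup code above.
# The "earth" engine requires the earth package to be installed.

mars_rec <- recipe(avg_price_per_room ~ ., data = hotel_train) |>
  step_dummy(all_nominal_predictors()) |>
  step_zv(all_predictors())

# Leave both hyperparameters open for tuning
mars_spec <- mars(num_terms = tune(), prod_degree = tune()) |>
  set_engine("earth") |>
  set_mode("regression")

mars_wf <- workflow(mars_rec, mars_spec)

# Every combination of the candidate values; the ranges are assumptions
mars_grid <- crossing(
  num_terms = seq(2, 30, by = 4),
  prod_degree = 1:2
)

mars_res <- tune_grid(
  mars_wf,
  resamples = hotel_folds,
  grid = mars_grid,
  metrics = reg_metrics
)

# Plot the tuning results, then list the best candidates by RMSE
autoplot(mars_res)
show_best(mars_res, metric = "rmse")
```

Setting prod_degree to 2 lets the model include two-way interactions between hinge terms, so comparing the two settings shows whether interactions improve the resampled error.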