AE 15: Model Comparison

Tips Data

Author

Driver: ____, Reporter: _____, Gopher: _____

Important
  • Open RStudio and create a subfolder in your AE folder called “AE-15”.

  • Go to the Canvas and locate your AE-15 assignment to get started.

  • Upload the ae-15.qmd and tip-data.csv files into the folder you just created. The .qmd and PDF responses are due in Canvas. You can check the due date on the Canvas assignment.

Packages + data

library(tidyverse)
library(broom)
library(yardstick)
library(ggformula)
library(patchwork)
library(knitr)
library(kableExtra)

tips <- read_csv("tip-data.csv") |> 
  drop_na(Party)

What factors are associated with the amount customers tip at a restaurant? To answer this question, we will use data collected in 2011 by a student at St. Olaf who worked at a local restaurant.1

The variables we’ll focus on for this analysis are

  • Tip: amount of the tip
  • Meal: which meal this was (Lunch, Dinner, Late Night)
  • Party: number of people in the party
  • Age: Age category of person paying the bill (Yadult, Middle, SenCit)

Analysis goal

The goals of this activity are to comparing models with: - \(R^2\) vs. \(R^2_{Adj}\) - AIC and BIC

Exercise 1

Fit two models:

  1. tip_fit_1: predict Tip from Party, Age, and Meal
  2. tip_fit_2: predict Tip from Party, Age, Meal, and Day.

Exercise 2

Apply glance to both models above to find the \(R^2\) and \(R^2_{adj}\) values?

Exercise 3

  1. Which model would we choose based on \(R^2\)?
  2. Which model would we choose based on Adjusted \(R^2\)?
  3. Which statistic should we use to choose the final model - \(R^2\) or Adjusted \(R^2\)? Why?

Exercise 4

Reference the output from Exercise 2.

  1. Which model would we choose based on AIC?

  2. Which model would we choose based on BIC?

Exercise 5 (Time Permitting)

Find the best model you can! You have until the end of class! Team with the best model gets a seriously great high-five.

To submit the AE

Important
  • Render the document to produce the PDF with all of your work from today’s class.
  • Upload your QMD and PDF files to the Canvas assignment.

Footnotes

  1. Dahlquist, Samantha, and Jin Dong. 2011. “The Effects of Credit Cards on Tipping.” Project for Statistics 212-Statistics for the Sciences, St. Olaf College.↩︎