library(tidyverse)
library(broom)
library(yardstick)
library(ggformula)
library(patchwork)
library(knitr)
library(kableExtra)
tips <- read_csv("data/tip-data.csv") |>
drop_na(Party)AE 15: Model Comparison
Tips Data
Open RStudio and create a subfolder in your AE folder called “AE-15”.
Go to the Canvas and locate your
AE-15assignment to get started.Upload the
ae-15.qmdandtip-data.csvfiles into the folder you just created. The.qmdand PDF responses are due in Canvas. You can check the due date on the Canvas assignment.
Packages + data
What factors are associated with the amount customers tip at a restaurant? To answer this question, we will use data collected in 2011 by a student at St. Olaf who worked at a local restaurant.1
The variables we’ll focus on for this analysis are
Tip: amount of the tipMeal: which meal this was (Lunch,Dinner,Late Night)Party: number of people in the partyAge: Age category of person paying the bill (Yadult,Middle,SenCit)
Analysis goal
The goals of this activity are to comparing models with:
- \(R^2\) vs. \(R^2_{Adj}\)
- AIC and BIC
Exercise 1
Fit two models:
tip_fit_1: predictTipfromParty,Age, andMealtip_fit_2: predictTipfromParty,Age,Meal, andDay.
Exercise 2
Apply glance to both models above to find the \(R^2\) and \(R^2_{adj}\) values?
Exercise 3
- Which model would we choose based on \(R^2\)?
- Which model would we choose based on Adjusted \(R^2\)?
- Which statistic should we use to choose the final model - \(R^2\) or Adjusted \(R^2\)? Why?
Exercise 4
Reference the output from Exercise 2.
Which model would we choose based on AIC?
Which model would we choose based on BIC?
Exercise 5 (Time Permitting)
Find the best model you can! You have until the end of class! Team with the best model gets a seriously great high-five.
To submit the AE
- Render the document to produce the PDF with all of your work from today’s class.
- Upload your QMD and PDF files to the Canvas assignment.
Footnotes
Dahlquist, Samantha, and Jin Dong. 2011. “The Effects of Credit Cards on Tipping.” Project for Statistics 212-Statistics for the Sciences, St. Olaf College.↩︎