AE 08: Transformations

County Health

Author

Driver: _______, Reporter: _______, Gopher: ________

Important
  • Open RStudio and create a subfolder in your AE folder called “AE-08”.

  • Go to Canvas and locate your AE 08 assignment to get started.

  • Upload the ae-08.qmd file into the folder you just created. The .qmd and PDF responses are due in Canvas. You can check the due date on the Canvas assignment.

library(tidyverse)
library(ggformula)
library(yardstick)
library(Stat2Data)
library(mosaic)
library(broom)
library(knitr)
library(patchwork) # arrange plots in a grid

Data

The data set for this assignment comes from the Stat2Data R package, the companion package for this course’s textbook. The data were originally generated by the American Medical Association and concern the availability of health care in counties in the United States. You can find information here by searching for the County Health Resources dataset.

data("CountyHealth") # Loads the data from the package

It is relatively easy to count the number of hospitals in a county, whereas counting the number of doctors is much more difficult. We’d like to build a linear model to predict the number of doctors, contained in the variable MDs, from the number of hospitals, contained in the variable Hospitals.

Exercise 1

Use a visualization to argue why a simple linear regression model would not be appropriate for the data if we do not make any transformations.

Exercise 2

Use visualizations to find the best model that involves at least one log transformation.

Exercise 3

Fit that model and write the equation that describes it.
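
As an illustration of the notation only: if a model had been specified in R as `lm(log(y) ~ log(x))`, with `y` and `x` as placeholder variables and $\hat{\beta}_0$, $\hat{\beta}_1$ standing for the estimated coefficients, its equation could be written as shown here. Note that `log` in R is the natural logarithm. Your own equation should use the transformations you chose and your estimated values.

$$
\widehat{\log(y)} = \hat{\beta}_0 + \hat{\beta}_1 \log(x)
$$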

Exercise 4

Use visualizations to find the best model that involves at least one square root transformation. Hint: you can create square root transformations in the same way you make log transformations, by replacing log with sqrt.
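
The hint above can be sketched on made-up data. The data frame `toy` and its columns `x` and `y` are hypothetical stand-ins, not the CountyHealth variables. The only point is that the transformation is written directly inside the model formula, so switching from a log to a square root means swapping the function name.

```r
# Hypothetical toy data, not the CountyHealth values
toy <- data.frame(
  x = c(1, 4, 9, 16, 25),
  y = c(2.1, 3.9, 6.2, 7.8, 10.1)
)

# Log transformation of the predictor, written inside the formula
fit_log <- lm(y ~ log(x), data = toy)

# Same call with log replaced by sqrt
fit_sqrt <- lm(y ~ sqrt(x), data = toy)

# The coefficient names show which transformation each model used
names(coef(fit_log))
names(coef(fit_sqrt))
```

The same swap works on the response side of the formula, or on both sides at once.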

Exercise 5

Fit that model and write the equation that describes it.

Exercise 6 (Time Permitting)

Is the model you created in Exercise 3 or the one from Exercise 5 the “better” model? Use residuals and evaluation metrics to make your argument. Hint: look at the residuals of the models on the transformed scale, but compute the evaluation metrics after transforming your predictions back to the original scale.
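
A minimal sketch of the workflow in the hint, using simulated data rather than CountyHealth. The data frame `toy`, the two candidate models, and the helper `rmse_original` are all hypothetical and chosen only for illustration; the models you compare should be the ones you built in Exercises 3 and 5. The yardstick package loaded above also provides RMSE functions that could be used in place of the helper.

```r
# Hypothetical simulated data; not the CountyHealth values
set.seed(8)
toy <- data.frame(x = runif(50, min = 1, max = 20))
toy$y <- exp(0.5 + 0.8 * log(toy$x) + rnorm(50, sd = 0.3))

# Two candidate models, each fit on a transformed response
fit_log  <- lm(log(y) ~ log(x), data = toy)
fit_sqrt <- lm(sqrt(y) ~ sqrt(x), data = toy)

# Residuals are examined on the transformed scale the model was fit on
resid_log  <- resid(fit_log)
resid_sqrt <- resid(fit_sqrt)

# Undo each transformation so predictions are in the original units of y
pred_log  <- exp(fitted(fit_log))  # exp() undoes log()
pred_sqrt <- fitted(fit_sqrt)^2    # squaring undoes sqrt()

# Hypothetical helper: root mean squared error on the original scale
rmse_original <- function(truth, estimate) {
  sqrt(mean((truth - estimate)^2))
}

rmse_log  <- rmse_original(toy$y, pred_log)
rmse_sqrt <- rmse_original(toy$y, pred_sqrt)
c(log = rmse_log, sqrt = rmse_sqrt)
```

Because both error values are in the original units of the response, they can be compared directly, which would not be true of errors computed on the log and square root scales.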

To submit the AE:

Important
  • Render the document to produce the PDF with all of your work from today’s class.
  • Upload your QMD and PDF files to the Canvas assignment.