Oct 21, 2024
📋 AE 13 - Rail Trails
rail_trail
# A tibble: 90 × 7
volume hightemp avgtemp season cloudcover precip day_type
<dbl> <dbl> <dbl> <chr> <dbl> <dbl> <chr>
1 501 83 66.5 Summer 7.60 0 Weekday
2 419 73 61 Summer 6.30 0.290 Weekday
3 397 74 63 Spring 7.5 0.320 Weekday
4 385 95 78 Summer 2.60 0 Weekend
5 200 44 48 Spring 10 0.140 Weekday
6 375 69 61.5 Spring 6.60 0.0200 Weekday
7 417 66 52.5 Spring 2.40 0 Weekday
8 629 66 52 Spring 0 0 Weekend
9 533 80 67.5 Summer 3.80 0 Weekend
10 547 79 62 Summer 4.10 0 Weekday
# ℹ 80 more rows
Source: Pioneer Valley Planning Commission via the mosaicData package.
Complete Exercise 0 to fit the so-called "full model", which uses every available predictor.
| term | estimate | std.error | statistic | p.value |
|---|---|---|---|---|
| (Intercept) | 17.622161 | 76.582860 | 0.2301058 | 0.8185826 |
| hightemp | 7.070528 | 2.420523 | 2.9210743 | 0.0045045 |
| avgtemp | -2.036685 | 3.142113 | -0.6481896 | 0.5186733 |
| seasonSpring | 35.914983 | 32.992762 | 1.0885716 | 0.2795319 |
| seasonSummer | 24.153571 | 52.810486 | 0.4573632 | 0.6486195 |
| cloudcover | -7.251776 | 3.843071 | -1.8869743 | 0.0627025 |
| precip | -95.696525 | 42.573359 | -2.2478030 | 0.0272735 |
| day_typeWeekend | 35.903750 | 22.429056 | 1.6007696 | 0.1132738 |
Our model conditions are the same as they were with SLR:
Linearity: There is a linear relationship between the response and predictor variables.
Constant Variance: The variability about the least squares line is generally constant.
Normality: The distribution of the residuals is approximately normal.
Independence: The residuals are independent from each other.
Look at a plot of the residuals vs. predicted values
Look at a plot of the residuals vs. each predictor
Linearity is met if there is no discernible pattern in each of these plots
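The two kinds of plots described above can be sketched in base R. This is only an illustration under stated assumptions: it uses the ten rows printed earlier rather than all 90, so its estimates will not match the coefficient table, and it uses `lm()` with a formula inferred from the terms in that table, which may differ from how the exercise asks you to fit the model. The names `full_fit` and `full_aug` are made up for this sketch.

```r
# Ten rows copied from the printed tibble; the full data set has 90 rows.
rail_trail <- data.frame(
  volume     = c(501, 419, 397, 385, 200, 375, 417, 629, 533, 547),
  hightemp   = c(83, 73, 74, 95, 44, 69, 66, 66, 80, 79),
  avgtemp    = c(66.5, 61, 63, 78, 48, 61.5, 52.5, 52, 67.5, 62),
  season     = c("Summer", "Summer", "Spring", "Summer", "Spring",
                 "Spring", "Spring", "Spring", "Summer", "Summer"),
  cloudcover = c(7.6, 6.3, 7.5, 2.6, 10, 6.6, 2.4, 0, 3.8, 4.1),
  precip     = c(0, 0.29, 0.32, 0, 0.14, 0.02, 0, 0, 0, 0),
  day_type   = c("Weekday", "Weekday", "Weekday", "Weekend", "Weekday",
                 "Weekday", "Weekday", "Weekend", "Weekend", "Weekday")
)

# Full model: formula assumed from the terms in the coefficient table.
full_fit <- lm(
  volume ~ hightemp + avgtemp + season + cloudcover + precip + day_type,
  data = rail_trail
)

# Attach predicted values and residuals to the data.
full_aug <- data.frame(
  rail_trail,
  pred  = fitted(full_fit),
  resid = residuals(full_fit)
)

# Residuals vs. predicted values.
plot(full_aug$pred, full_aug$resid,
     xlab = "Predicted volume", ylab = "Residual")
abline(h = 0, lty = 2)

# Residuals vs. each quantitative predictor.
for (v in c("hightemp", "avgtemp", "cloudcover", "precip")) {
  plot(full_aug[[v]], full_aug$resid, xlab = v, ylab = "Residual")
  abline(h = 0, lty = 2)
}

# Residuals vs. each categorical predictor.
boxplot(resid ~ season, data = full_aug, ylab = "Residual")
boxplot(resid ~ day_type, data = full_aug, ylab = "Residual")
```

With only ten rows and seven coefficients the fit is far too flexible to judge any pattern; the point is the mechanics, which carry over unchanged to the full 90-row `rail_trail` data.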
Complete Exercises 1-4
The plot of the residuals vs. predicted values shows no discernible pattern.
The plots of residuals vs. hightemp and avgtemp appear to have a parabolic pattern.
The linearity condition does not appear to be satisfied given these plots.
Given this conclusion, what might be a next step in the analysis?
Does the constant variance condition appear to be satisfied?
The vertical spread of the residuals is not constant across the plot.
The constant variance condition is not satisfied.
Given this conclusion, what might be a next step in the analysis?
Complete Exercises 5-6.
The distribution of the residuals is approximately unimodal and symmetric, so the normality condition is satisfied. The sample size of 90 is large enough that this condition could be relaxed even if it were not satisfied.
We can often check the independence condition based on the context of the data and how the observations were collected.
If the data were collected in a particular order, examine a scatter plot of the residuals versus the order in which the data were collected.
If there is a grouping variable lurking in the background, check the residuals based on that grouping variable.
Why might the independence condition be violated here?
Residuals vs. order of data collection:
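A plot of this kind could be drawn roughly as follows. The sketch rests on an assumption the source does not confirm: that the row order of the data matches the order of collection. It also uses only the ten rows printed earlier, an `lm()` formula inferred from the coefficient table, and made-up names (`resid_full`, `order_index`).

```r
# Ten rows copied from the printed tibble; the full data set has 90 rows.
rail_trail <- data.frame(
  volume     = c(501, 419, 397, 385, 200, 375, 417, 629, 533, 547),
  hightemp   = c(83, 73, 74, 95, 44, 69, 66, 66, 80, 79),
  avgtemp    = c(66.5, 61, 63, 78, 48, 61.5, 52.5, 52, 67.5, 62),
  season     = c("Summer", "Summer", "Spring", "Summer", "Spring",
                 "Spring", "Spring", "Spring", "Summer", "Summer"),
  cloudcover = c(7.6, 6.3, 7.5, 2.6, 10, 6.6, 2.4, 0, 3.8, 4.1),
  precip     = c(0, 0.29, 0.32, 0, 0.14, 0.02, 0, 0, 0, 0),
  day_type   = c("Weekday", "Weekday", "Weekday", "Weekend", "Weekday",
                 "Weekday", "Weekday", "Weekend", "Weekend", "Weekday")
)

full_fit <- lm(
  volume ~ hightemp + avgtemp + season + cloudcover + precip + day_type,
  data = rail_trail
)
resid_full <- residuals(full_fit)

# ASSUMPTION: row order stands in for the order of data collection.
order_index <- seq_along(resid_full)

# Residuals vs. order; a trend or long runs above/below zero
# would suggest that neighboring observations are related.
plot(order_index, resid_full, type = "b",
     xlab = "Row order (assumed collection order)", ylab = "Residual")
abline(h = 0, lty = 2)

# Residuals by a possible grouping variable, here season.
boxplot(resid_full ~ rail_trail$season,
        xlab = "season", ylab = "Residual")
```

If the rows are not stored in collection order, the first plot is not informative about independence, and a date or sequence variable would be needed instead.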
