Introduction
Nov 01, 2024
📋 AE 18 - Intro to Logistic Regression
Introduction to modeling categorical data
Logistic regression for binary response variable
Relationship between odds and probabilities
Quantitative outcome variable:
Categorical outcome variable:
Logistic regression: 2 outcomes
Multinomial logistic regression: 3+ outcomes
This data set is from an ongoing cardiovascular study of residents of Framingham, Massachusetts. We want to use total cholesterol to predict whether a randomly selected adult is at high risk for heart disease in the next 10 years.
high_risk: 1 = high risk of heart disease in the next 10 years, 0 = not high risk
totChol: total cholesterol (mg/dL)
Complete Exercises 1-2.
Outcome: \(Y\) = 1: high risk, 0: not high risk
🛑 This model produces predictions outside the range from 0 to 1.
✅ This model (called a logistic regression model) only produces predictions between 0 and 1.
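A minimal Python sketch of the contrast between the two models. The coefficients below are made up for illustration and are not estimates from the Framingham data; `linear_pred` and `logistic_pred` are hypothetical helper names, and the same made-up linear predictor is reused in both so that only the functional form differs.

```python
import math

# Hypothetical coefficients, chosen only for illustration
# (NOT fitted to the Framingham data).
B0, B1 = -0.9, 0.005

def linear_pred(tot_chol):
    """Straight-line prediction: nothing keeps it between 0 and 1."""
    return B0 + B1 * tot_chol

def logistic_pred(tot_chol):
    """Logistic prediction: exp(z) / (1 + exp(z)) lies strictly between 0 and 1."""
    z = B0 + B1 * tot_chol
    return math.exp(z) / (1 + math.exp(z))

# Compare the two predictions at a low, middle, and high cholesterol value.
for chol in (100, 250, 400):
    print(chol, round(linear_pred(chol), 3), round(logistic_pred(chol), 3))
```

With these assumed coefficients, the straight line dips below 0 at a total cholesterol of 100 and rises above 1 at 400, while the logistic version stays inside the interval at every value.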
| Method | Outcome | Model |
|---|---|---|
| Linear regression | Quantitative | \(Y = \beta_0 + \beta_1~ X\) |
| Linear regression (transform Y) | Quantitative | \(\log(Y) = \beta_0 + \beta_1~ X\) |
| Logistic regression | Binary | \(\log\big(\frac{\pi}{1-\pi}\big) = \beta_0 + \beta_1 ~ X\) |
Note: In this class (and in most college-level math classes, and in R), \(\log\) means log base \(e\) (i.e., the natural log).
Complete Exercise 3.
State whether a linear regression model or logistic regression model is more appropriate for each scenario.
Use age and education to predict if a randomly selected person will vote in the next election.
Use budget and run time (in minutes) to predict a movie’s total revenue.
Use age and sex to calculate the probability a randomly selected adult will visit St. Lukes in the next year.
Suppose there is a 70% chance it will rain tomorrow
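The rain example can be worked through numerically. This is a small Python sketch; the variable names are illustrative, and the only input is the 70% chance stated above.

```python
p_rain = 0.7                      # probability it rains tomorrow (from the example)
p_no_rain = 1 - p_rain            # probability it does not rain
odds_rain = p_rain / p_no_rain    # odds = probability / (1 - probability)

print(f"P(rain)      = {p_rain:.2f}")
print(f"P(no rain)   = {p_no_rain:.2f}")
print(f"odds of rain = {odds_rain:.2f}")
```

The odds come out to about 2.33, which is the same as saying "7 to 3" in favor of rain.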
Complete Exercise 4.
log-odds
\[\omega = \log \frac{\pi}{1-\pi}\]
odds
\[e^\omega = \frac{\pi}{1-\pi}\]
probability
\[\pi = \frac{e^\omega}{1 + e^\omega}\]
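The three formulas above chain together: probability to odds, odds to log-odds, and log-odds back to probability. A short Python sketch of that round trip follows; the function names are hypothetical, and `math.log` is the natural log, matching the convention used here.

```python
import math

def prob_to_odds(p):
    """odds = pi / (1 - pi)"""
    return p / (1 - p)

def odds_to_log_odds(odds):
    """omega = log(odds), using the natural log"""
    return math.log(odds)

def log_odds_to_prob(omega):
    """pi = e^omega / (1 + e^omega)"""
    return math.exp(omega) / (1 + math.exp(omega))

p = 0.7                              # e.g. the 70% chance of rain
odds = prob_to_odds(p)               # probability -> odds
omega = odds_to_log_odds(odds)       # odds -> log-odds
p_back = log_odds_to_prob(omega)     # log-odds -> probability

print(odds, omega, p_back)
```

Up to floating-point rounding, `p_back` recovers the starting probability, and log-odds of 0 correspond to a probability of 0.5 (even odds).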
\[\text{probability} = \pi = \frac{\exp\{\beta_0 + \beta_1~X\}}{1 + \exp\{\beta_0 + \beta_1~X\}}\]
Logit form:
\[\log\big(\frac{\pi}{1-\pi}\big) = \beta_0 + \beta_1~X\]
Probability form:
\[ \pi = \frac{\exp\{\beta_0 + \beta_1~X\}}{1 + \exp\{\beta_0 + \beta_1~X\}} \]
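The probability form follows from the logit form by solving for \(\pi\). A short derivation, using only the symbols already defined:

\[
\begin{aligned}
\log\Big(\frac{\pi}{1-\pi}\Big) &= \beta_0 + \beta_1~X \\
\frac{\pi}{1-\pi} &= \exp\{\beta_0 + \beta_1~X\} \\
\pi &= (1-\pi)\,\exp\{\beta_0 + \beta_1~X\} \\
\pi\,\big(1 + \exp\{\beta_0 + \beta_1~X\}\big) &= \exp\{\beta_0 + \beta_1~X\} \\
\pi &= \frac{\exp\{\beta_0 + \beta_1~X\}}{1 + \exp\{\beta_0 + \beta_1~X\}}
\end{aligned}
\]

The second line exponentiates both sides, which is why the left-hand side becomes the odds.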
Introduced logistic regression for binary response variable
Described relationship between odds and probabilities