Multiple Regression

Week 4


Concepts

Multiple Regression

  • the multiple regression equation
  • quantifying the fit using R^{2} and s_{est}
  • meaning of each model coefficient (weight)
  • how to find the best single predictor variable
  • stepwise procedure for how to find the best set of predictors
  • how to assess whether a variable adds significantly to the predictive power
  • p-values; adjusted R^{2}; AIC; the step() procedure
  • Assumptions of regression and how to assess them for your data/model
  • normality, heteroscedasticity, nonlinearity
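As a concrete sketch of several of the concepts above, here is how a multiple regression with two continuous predictors can be fit and inspected in R. This uses R's built-in mtcars data purely for illustration, not any dataset from the course:

```r
# Fit a multiple regression: predict fuel efficiency (mpg)
# from two continuous predictors, weight (wt) and horsepower (hp).
model <- lm(mpg ~ wt + hp, data = mtcars)

summary(model)               # coefficients (weights), p-values, R^2
summary(model)$r.squared     # R^2: proportion of variance explained
sigma(model)                 # s_est: residual standard error
coef(model)                  # intercept plus one weight per predictor

# Quick graphical checks of the regression assumptions:
plot(model, which = 1)  # residuals vs fitted: nonlinearity, heteroscedasticity
plot(model, which = 2)  # Q-Q plot: normality of residuals
```

Each coefficient in `coef(model)` is the expected change in the outcome for a one-unit change in that predictor, holding the other predictors constant.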

Readings

We will be using the textbook Learning Statistics with R by Danielle Navarro (pdf version), as well as the textbook OpenIntro Statistics. Both are freely available online.

Required Readings

  • 15 Linear regression (includes multiple regression) in Learning Statistics with R (pdf version)
  • Chapter 9 “Multiple and logistic regression” of OpenIntro Statistics (you can ignore section 9.5 “Introduction to logistic regression”; we will cover logistic regression next week)
  • note: Chapter 9 of OpenIntro Statistics uses an example that includes categorical variables as independent variables. In 2812 we will not cover this; we will deal only with continuous independent variables in multiple regression. You won’t be tested on categorical variables in this course.

Additional supporting materials

Lecture Supplemental Video

Here is a supplemental video covering the lecture slides we were unable to get to in class in Week 4. It walks through the stepwise regression procedure in R, which is used to refine a model so that it includes only the most important predictor variables.
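The mechanics of that stepwise procedure can be sketched with R's `step()` function. Again this uses the built-in mtcars data as a stand-in (the lecture uses its own example); the predictors chosen here are arbitrary illustrations:

```r
# Start from a "full" model containing several candidate predictors.
full <- lm(mpg ~ wt + hp + disp + drat, data = mtcars)

# step() repeatedly drops (and, with direction = "both", re-adds)
# predictors, keeping the change that most reduces AIC, and stops
# when no single change improves AIC further.
refined <- step(full, direction = "both", trace = 0)

summary(refined)  # the retained predictors and the refined model's fit
```

Setting `trace = 1` (the default) prints the AIC at each step, which is useful for seeing why each predictor was kept or dropped.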