Flashcards in Unit 2: Multiple Linear Regression Deck (23):
What is the purpose of MLR?
To predict a response variable (Y) using a set of predictor variables (X1, X2,...Xi)
What is the method of MLR?
Ordinary least squares: the coefficients are estimated by minimizing the sum of squared residuals.
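MLR is conventionally fit by ordinary least squares. A minimal pure-Python sketch of that idea, solving the normal equations (X'X)b = X'y by Gaussian elimination on a tiny hypothetical data set generated from y = 1 + 2·x1 + 3·x2 (all values here are illustrative):

```python
# Sketch: fit an MLR model by ordinary least squares via the normal
# equations (X'X)b = X'y, solved with plain Gaussian elimination.
# Data are hypothetical, built so that y = 1 + 2*x1 + 3*x2 exactly.

def transpose(m):
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(a)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(aug[r][i]))  # partial pivot
        aug[i], aug[p] = aug[p], aug[i]
        for r in range(i + 1, n):
            f = aug[r][i] / aug[i][i]
            aug[r] = [v - f * w for v, w in zip(aug[r], aug[i])]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (aug[i][n] - sum(aug[i][j] * x[j] for j in range(i + 1, n))) / aug[i][i]
    return x

# design matrix: intercept column plus two predictors
X = [[1, x1, x2] for x1, x2 in [(0, 1), (1, 0), (1, 1), (2, 3)]]
y = [[1 + 2 * x1 + 3 * x2] for _, x1, x2 in X]

Xt = transpose(X)
beta = solve(matmul(Xt, X), [row[0] for row in matmul(Xt, y)])
print([round(b, 6) for b in beta])  # -> [1.0, 2.0, 3.0]
```

Because the hypothetical responses lie exactly in the column space of X, the estimates recover the generating coefficients.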
What are the assumptions for MLR?
Linearity: Y can be modeled as a linear function of the independent variables.
Independence: Observations are independent and of equal importance; predictors are linearly independent of each other (no multicollinearity)
Normality of errors: Errors are independent and normally distributed with zero mean and constant variance
How can we check if a linear relationship is appropriate?
1. Plot of the residuals against the fitted values (ŷ)
2. Plot of the residuals against each predictor variable (xij)
How can we check if the error assumptions are appropriate?
1. Plot of the residuals against the fitted values (ŷ)
2. Plot of the residuals against each predictor variable (xij)
3. Histogram and/or normal probability plot of the residuals
4. Plot of the residuals against the index or order of data collection (to check independence)
What is the overall F-Test?
What does it mean when we reject the null of an overall F-test?
Tests whether the entire collection of independent variables is associated with the outcome.
Rejecting H0 indicates that the model with all predictors is better than an intercept-only model; further testing may be needed.
(H0: All Bj = 0
H1 : at least one Bj not equal to 0)
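As a sketch, the overall F statistic can be computed directly from the ANOVA quantities: F = (SSM/p) / (SSE/(n − p − 1)). The sums of squares below are hypothetical, made-up numbers for illustration:

```python
# Hypothetical ANOVA quantities for a model with p = 3 predictors and
# n = 30 observations; the overall F compares the full model against
# the intercept-only model.
SSM, SSE = 120.0, 180.0   # model and error sums of squares (made up)
n, p = 30, 3

MSM = SSM / p             # model mean square, df = p
MSE = SSE / (n - p - 1)   # error mean square, df = n - p - 1
F = MSM / MSE
print(round(F, 2))        # -> 5.78
```

This F is then compared against an F distribution with (p, n − p − 1) degrees of freedom.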
What is the partial T-Test?
What does it mean to reject the null of a partial t-test?
Tests whether a specific independent variable is associated with the outcome, given that the association with the other predictors has already been accounted for.
Rejecting H0: Bj = 0 implies that there is significant evidence of a linear association between Xj and Y, given all other predictors are already included in the model.
What is the partial F-Test?
What are the hypotheses?
Tests whether a specific collection of independent variables is associated with the outcome, given that the association with the other predictors has already been accounted for.
The reduced model has to be a nested version of the full model.
H0: The reduced model is adequate (the extra coefficients in the full model all equal 0)
H1: The full model is the better model
Rejecting H0 indicates that the full model is better than the reduced model; further testing may be needed.
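A sketch of the partial F statistic for comparing nested models, using hypothetical sums of squares: F = ((SSE_reduced − SSE_full)/q) / (SSE_full/(n − p_full − 1)), where q is the number of predictors dropped to form the reduced model.

```python
# Partial F statistic for nested models (all values hypothetical).
SSE_reduced, SSE_full = 210.0, 180.0
n, p_full, q = 30, 5, 2   # q predictors are dropped in the reduced model

F = ((SSE_reduced - SSE_full) / q) / (SSE_full / (n - p_full - 1))
print(round(F, 2))  # -> 2.0
```

The statistic is compared against an F distribution with (q, n − p_full − 1) degrees of freedom.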
How can you check for multicollinearity?
Checking for multicollinearity problems:
Plot predictor variables against each other
Look for large sample correlation coefficients
Look for large variance inflation factors (VIFs)
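The VIF for predictor Xj is 1/(1 − Rj²), where Rj² comes from regressing Xj on the remaining predictors. A minimal sketch with hypothetical R² values (a common, though arbitrary, flag is VIF above 10):

```python
# VIF_j = 1 / (1 - R_j^2); R^2 values below are hypothetical.
def vif(r_squared):
    return 1.0 / (1.0 - r_squared)

print(round(vif(0.10), 2))  # -> 1.11 (little multicollinearity)
print(round(vif(0.90), 2))  # -> 10.0 (commonly flagged as problematic)
```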
How can we solve for the unconditional variance of Y using the ANOVA table?
We can divide the SST by n − 1: the sample variance of Y is SST/(n − 1).
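A quick check of that identity on a hypothetical response vector:

```python
# The unconditional sample variance of Y equals SST / (n - 1).
# The y values below are hypothetical.
y = [2.0, 4.0, 6.0, 8.0]
n = len(y)
ybar = sum(y) / n
SST = sum((yi - ybar) ** 2 for yi in y)  # total sum of squares
var_y = SST / (n - 1)
print(SST, var_y)  # -> 20.0 6.666666666666667
```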
Will SSM overlap for independent predictors?
No! Independent predictors will not have overlapping SSMs.
Can the Adjusted R2 be negative?
YES! For really poor models with too many predictors, since the adjustment penalizes for the number of predictors.
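Adjusted R² = 1 − (1 − R²)(n − 1)/(n − p − 1), so a small R² combined with many predictors can push it below zero. A sketch with hypothetical values:

```python
# Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1).
# With a weak fit (R^2 = 0.05) and many predictors (p = 10 of n = 20
# observations), the penalty drives the adjusted value negative.
def adjusted_r2(r2, n, p):
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

print(round(adjusted_r2(0.05, 20, 10), 3))  # -> -1.006
```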
Type I SSM Characteristics
1. 'Sequential sums of squares'
2. Predictor-order matters
3. Sums to the overall SSM
4. Useful for conducting partial F-tests
Type III SSM Characteristics
1. 'Partial sums of squares'
2. Predictor-order does not matter
3. Does not sum to the SSM (unless predictors are independent)
4. Useful for computing partial correlations and partial R2
When is a variable a confounder?
Variable Z is a confounder ('lurking variable') if its inclusion changes the relationship between X and Y (e.g., department confounds the relationship between gender and admission rates)
When is there interaction/effect modifier?
'The relationship between X and Y depends on the values of Z.'
Interactions ('effect modifiers') can be used to account for a relationship between Y and X that varies across the levels/values of Z
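A sketch of why an interaction term makes the slope of X depend on Z: in Y = b0 + b1·X + b2·Z + b3·X·Z, the slope of X is b1 + b3·Z. The coefficients below are hypothetical:

```python
# In a model with an interaction, Y = b0 + b1*X + b2*Z + b3*X*Z,
# the slope of X is b1 + b3*Z, so it changes with Z.
b0, b1, b2, b3 = 1.0, 2.0, 0.5, -1.5  # hypothetical coefficients

def slope_of_x(z):
    return b1 + b3 * z

print(slope_of_x(0))  # -> 2.0
print(slope_of_x(2))  # -> -1.0
```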
Why center predictors?
It changes the interpretation of B0: the average value of the response variable at the average value of the predictors (i.e., ȳ).
It helps to alleviate 'variance inflation' issues associated with fitting models with higher-order polynomial terms, a special case of multicollinearity.
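A sketch of that polynomial-term point: centering a predictor before squaring it reduces the correlation between x and x². The predictor values below are hypothetical (and symmetric, so the centered correlation comes out exactly zero):

```python
# Centering before squaring reduces the correlation between x and x^2,
# easing the polynomial-term multicollinearity. Values are hypothetical.
from statistics import mean

def corr(a, b):
    ma, mb = mean(a), mean(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = (sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b)) ** 0.5
    return num / den

x = [1.0, 2.0, 3.0, 4.0, 5.0]       # raw predictor
xc = [xi - mean(x) for xi in x]     # centered predictor

print(round(corr(x, [xi ** 2 for xi in x]), 3))    # -> 0.981 (raw: high)
print(round(corr(xc, [xi ** 2 for xi in xc]), 3))  # -> 0.0   (centered)
```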
Why would you standardize your predictors?
The magnitude of coefficient estimates is comparable across predictors
It puts all predictors 'on an equal playing field' when building a model
Similar to centering, it helps alleviate a special type of multicollinearity issue introduced when fitting models with higher-order polynomial terms
You should only standardize continuous predictors with roughly Normal distributions - do not standardize categorical predictors!
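A sketch of standardizing one continuous predictor, z = (x − x̄)/s, using the sample standard deviation (the values are hypothetical):

```python
# Standardize a continuous predictor: z = (x - mean) / sd,
# with the sample standard deviation. Values are hypothetical.
from statistics import mean, stdev

x = [2.0, 4.0, 6.0, 8.0]
z = [(xi - mean(x)) / stdev(x) for xi in x]
print([round(zi, 3) for zi in z])  # -> [-1.162, -0.387, 0.387, 1.162]
```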
Why is it a problem if the predictors are correlated?
There are typically large changes between the regression coefficients
in the unadjusted and adjusted models.
It is difficult to interpret the regression coefficients, because the
'holding all other predictors constant' statement is not reasonable.
The standard errors will be inflated, which causes problems with
inference (i.e., p-values are too big).
What is the Hierarchical Principle?
Higher-order terms should only be included if the corresponding 'main effects' are also included
Categorical variables should enter the model 'all or nothing'
What are the implications of the hierarchical principle?
Avoid splitting up dummy variables representing categorical predictors
Only consider additional polynomial terms if the lower-order terms are already included
Only consider interactions between variables that are included in your model
What is usually the result of using internal validation procedures?
Using the same data to both train and validate your model will result in measures of model fit that are too optimistic