Week Ten Flashcards
(11 cards)
For multiple linear regression, what must the explanatory variables be?
Continuous or dichotomous/binary (dummy variables)
For multiple linear regression, what must the dependent variable be?
Continuous (interval or ratio)
Define collinearity
When two predictor variables are highly correlated, so that one is a near-perfect linear function of the other.
Define multicollinearity
When two or more predictor variables are highly correlated, so that one is a near-perfect linear combination of the others.
What happens as multicollinearity increases?
The coefficient estimates become unstable: their standard errors are inflated and their t-statistics shrink, which makes real effects harder to detect.
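One standard way to see why, offered as a supplementary sketch rather than as part of the card set: under the usual OLS assumptions, the sampling variance of a slope estimate can be written as below. The symbols are introduced here for illustration: $\sigma^2$ is the error variance, $SS_j$ is the total variation of predictor $x_j$ about its mean, and $R_j^2$ is the R-squared from regressing $x_j$ on the other predictors.

```latex
\operatorname{Var}(\hat{\beta}_j) = \frac{\sigma^2}{SS_j \,(1 - R_j^2)},
\qquad SS_j = \sum_{i=1}^{n} (x_{ij} - \bar{x}_j)^2
```

As $R_j^2$ approaches 1, the denominator shrinks, so the standard error grows and the t-statistic falls.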
What is tolerance?
The proportion of variance in an x variable that cannot be explained by the other x variables (1 − R² from regressing that x variable on the others).
What does it mean if tolerance is very low?
The predictor variable is largely redundant, because the other predictors explain most of its variance. If tolerance is lower than 0.1, we must investigate further.
What is the calculation for variance inflation factor (VIF)?
VIF = 1 / tolerance.
If this is higher than 10 we must investigate further.
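A minimal sketch of the tolerance and VIF calculation, assuming the simplest case of only two predictors, where the R² from regressing one on the other equals their squared Pearson correlation. The function names and the data are hypothetical and chosen only for illustration. With more predictors, R² would have to come from a full regression of each predictor on all the others.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def tolerance_and_vif(x1, x2):
    """Tolerance and VIF for a model with exactly two predictors.

    Assumption: with two predictors, R^2 of x1 regressed on x2 is r^2.
    """
    r_squared = pearson_r(x1, x2) ** 2
    tolerance = 1 - r_squared      # share of variance NOT explained by the other x
    vif = 1 / tolerance            # variance inflation factor
    return tolerance, vif


# Hypothetical data: x2 is almost a copy of x1, x3 is unrelated to x1.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0]
x3 = [1, -1, -1, 1, 1, -1]

tol_high, vif_high = tolerance_and_vif(x1, x2)  # strongly collinear pair
tol_low, vif_low = tolerance_and_vif(x1, x3)    # weakly related pair

print(f"x1 vs x2: tolerance={tol_high:.4f}, VIF={vif_high:.1f}")
print(f"x1 vs x3: tolerance={tol_low:.4f}, VIF={vif_low:.2f}")
```

The first pair falls below the 0.1 tolerance threshold and above the VIF threshold of 10, so it would be flagged for further investigation. The second pair would not.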
Is heteroskedasticity a significant problem?
Not usually. Although the estimated standard errors, t-values and p-values will be inaccurate, the inaccuracy is generally not excessive. So long as the x-variables are symmetric about their means, the OLS coefficient estimates will still be unbiased.
What is dummy coding?
Dummy coding allows us to use categorical or nominal data in linear regression models.
How do we dummy code?
We convert the variable into 0s and 1s, creating one new variable for each category except one (the reference category), so that a single variable becomes several.
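The procedure above can be sketched in plain Python as follows. The function name `dummy_code`, the `is_` column prefix and the example data are hypothetical, and taking the first category alphabetically as the default reference is an assumption made only for this illustration.

```python
def dummy_code(values, reference=None):
    """Turn one categorical variable into 0/1 variables, one per
    category except the reference category."""
    categories = sorted(set(values))
    if reference is None:
        reference = categories[0]   # assumption: default to the first category
    if reference not in categories:
        raise ValueError(f"reference {reference!r} is not a category")
    kept = [c for c in categories if c != reference]
    # Each kept category gets a column: 1 where the row matches, else 0.
    return {f"is_{c}": [1 if v == c else 0 for v in values] for c in kept}


# Hypothetical example: three categories become two dummy variables.
colours = ["red", "green", "blue", "green"]
dummies = dummy_code(colours, reference="blue")

for name, column in dummies.items():
    print(name, column)
```

A row with 0 in every dummy column belongs to the reference category, which is why that category needs no column of its own.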