Further Techniques in Multiple Linear Regression Flashcards
(15 cards)
What does linear regression assume?
That the independent (predictor) variables and the dependent (outcome) variable are related linearly.
How do we deal with a non-linear relationship in a regression model?
- By transforming the variables in the model
- By fitting a polynomial relationship instead of a straight line
How does polynomial fitting work?
- The principle of fitting a straight line applies, but we are just turning one predictor variable into two or more
- Instead of regressing y on x alone, we regress y on x and x^2 (and possibly x^3, x^4, etc.)
List an advantage of polynomial fitting.
It allows us to deal with an obviously non-linear relationship without having to specify in advance what the appropriate transformation would be.
How do we fit a quadratic model?
- We add x^2 as a second predictor alongside x
- y = β0 + β1·x + β2·x^2 + e
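A minimal sketch of fitting the quadratic model, using only the Python standard library. The helper `ols` and the data are hypothetical and made up for illustration (noise-free values generated from y = 1 + 2x - 0.5x^2); in practice a statistics package would normally do the fitting.

```python
def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    rows: one list per observation, already including a leading 1 for the intercept.
    Returns the list of estimated coefficients.
    """
    k = len(rows[0])
    # Build the augmented matrix [X'X | X'y].
    aug = []
    for i in range(k):
        lhs = [sum(row[i] * row[j] for row in rows) for j in range(k)]
        rhs = sum(row[i] * yi for row, yi in zip(rows, y))
        aug.append(lhs + [rhs])
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


# Made-up, noise-free data generated from y = 1 + 2x - 0.5x^2 (an assumption).
x = [0, 1, 2, 3, 4, 5, 6]
y = [1 + 2 * v - 0.5 * v**2 for v in x]

# Turn the single predictor x into two predictors (x and x^2), plus an intercept column.
rows = [[1.0, v, v**2] for v in x]
b0, b1, b2 = ols(rows, y)
print(round(b0, 6), round(b1, 6), round(b2, 6))
```

The fit is still an ordinary linear regression: the model is linear in the coefficients, and x^2 simply enters as an extra column.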
How do we interpret quadratic terms?
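- Where the coefficient on x is positive (β1 > 0) and the coefficient on x^2 is negative (β2 < 0), y increases with x at first, but will eventually turn around and decrease (an inverted U shape)
- Where the coefficient on x is negative (β1 < 0) and the coefficient on x^2 is positive (β2 > 0), y decreases with x at first, but will eventually turn around and increase (a U shape)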
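The point where the curve turns around can be read off the coefficients: setting the slope β1 + 2·β2·x to zero gives x = -β1 / (2·β2). A small sketch, with hypothetical function names and made-up coefficient values:

```python
def turning_point(b1, b2):
    """x value at which y = b0 + b1*x + b2*x^2 changes direction."""
    if b2 == 0:
        raise ValueError("no turning point: the model is a straight line")
    return -b1 / (2 * b2)


def shape(b2):
    """Describe the curve from the sign of the quadratic coefficient."""
    if b2 < 0:
        return "inverted U: rises, then falls"
    if b2 > 0:
        return "U: falls, then rises"
    return "straight line"


# Made-up coefficients: positive linear term, negative quadratic term.
print(turning_point(2.0, -0.5))  # 2.0
print(shape(-0.5))               # inverted U: rises, then falls
```

Whether the turning point matters in practice depends on whether it falls inside the observed range of x.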
What are two advantages of transformed data?
- Often less skewed
- Outliers are less extreme
List three examples of data transformations.
- Logarithms
- Inverse
- Square root
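To illustrate how a transformation can reduce skew, the sketch below compares the skewness of a made-up, strongly right-skewed sample before and after the square-root and log transformations. The `skewness` helper is a hypothetical name, and the data are invented so that the log transform makes the values exactly evenly spaced; real data will rarely behave this neatly.

```python
import math
from statistics import mean, pstdev


def skewness(values):
    """Population skewness: the mean cubed z-score (0 for a symmetric sample)."""
    m, s = mean(values), pstdev(values)
    return mean([((v - m) / s) ** 3 for v in values])


# Made-up right-skewed sample: each value is double the previous one.
raw = [2**k for k in range(9)]          # 1, 2, 4, ..., 256
rooted = [math.sqrt(v) for v in raw]    # square-root transformation
logged = [math.log(v) for v in raw]     # log transformation

for name, data in [("raw", raw), ("sqrt", rooted), ("log", logged)]:
    print(name, round(skewness(data), 3))
```

In this sample the square root reduces the skew and the log removes it, and the largest value also sits much closer to the rest of the data after either transformation.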
What is an interaction effect?
When the effect of an explanatory variable depends on the level of another explanatory variable.
Give an example of an interaction effect.
- Assume male happiness increases with years of marriage, whereas female happiness decreases with years of marriage
- The relationship between happiness and years of marriage may be linear, but it would not be independent of sex.
Write the regression equation of a model with interaction terms.
y = β0 + β1·x1 + β2·x2 + β3·(x1·x2) + e
where β3·(x1·x2) is the multiplicative (interaction) term formed from the two main effects
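A sketch of an interaction model for the marriage example, using only the standard library. Everything specific here is an assumption made for illustration: the `ols` helper, the coding of sex (0 = male, 1 = female), and the noise-free data generated from happiness = 5 + 0.2·years + 1.0·sex - 0.3·(years·sex).

```python
def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    rows: one list per observation, already including a leading 1 for the intercept.
    """
    k = len(rows[0])
    # Build the augmented matrix [X'X | X'y].
    aug = []
    for i in range(k):
        lhs = [sum(row[i] * row[j] for row in rows) for j in range(k)]
        rhs = sum(row[i] * yi for row, yi in zip(rows, y))
        aug.append(lhs + [rhs])
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


# Made-up observations: (years married, sex), with sex coded 0 = male, 1 = female.
people = [(yrs, sex) for sex in (0, 1) for yrs in (1, 5, 10, 20)]
happiness = [5 + 0.2 * yrs + 1.0 * sex - 0.3 * yrs * sex for yrs, sex in people]

# Columns: intercept, x1 = years, x2 = sex, and the product x1*x2.
rows = [[1.0, yrs, sex, yrs * sex] for yrs, sex in people]
b0, b1, b2, b3 = ols(rows, happiness)

slope_male = b1         # effect of years when sex = 0
slope_female = b1 + b3  # effect of years when sex = 1
print(round(slope_male, 6), round(slope_female, 6))
```

The interaction coefficient is the difference between the two slopes, which is why the effect of years of marriage cannot be described without reference to sex.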
What is the most common centring method?
Mean centring
How can we ensure β0 is interpretable?
By centring the predictor (e.g. the age variable), so that x = 0 corresponds to a meaningful value.
What changes when you carry out mean centring?
The intercept, which now corresponds to the predicted outcome at the average age (x(centred) = 0) rather than at age = 0.
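A small sketch of what mean centring does in a simple regression on age. The `fit_line` helper and the data are hypothetical (noise-free values generated from y = 10 + 0.5·age), chosen so the change in the intercept is easy to see.

```python
from statistics import mean


def fit_line(x, y):
    """Simple least-squares line; returns (intercept, slope)."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope


# Made-up data: outcome generated from y = 10 + 0.5 * age.
age = [20, 30, 40, 50, 60]
y = [10 + 0.5 * a for a in age]

# Mean centring: subtract the average age from every observation.
age_centred = [a - mean(age) for a in age]

raw_intercept, raw_slope = fit_line(age, y)
centred_intercept, centred_slope = fit_line(age_centred, y)

print(raw_intercept, raw_slope)          # 10.0 0.5 (predicted y at age 0)
print(centred_intercept, centred_slope)  # 30.0 0.5 (predicted y at the mean age, 40)
```

Only the intercept moves; the slope is the same in both fits, because centring shifts the predictor without rescaling it.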