Model 2: Linear Regression Models Flashcards
(14 cards)
Model Building Principles
- Consider all predictor variables that may be important in describing the outcome.
- Check for multicollinearity.
- Consider possible interactions between variables.
- Think about the expected effect (direction) of each predictor.
What model selection approaches are covered in STAT210?
Gelman and Hill
Backward/Forward/Stepwise Selection
Akaike’s Information Criterion (AIC)
Describe the Gelman & Hill Approach
Look at the statistical significance and sign of estimated effect.
- significant & expected = keep
- non-significant & expected = fine to keep, unless it is an interaction
- significant & unexpected = possible lurking variables
- non-significant & unexpected = remove.
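
The four cases above can be written as a small lookup. This is only a sketch of the rule of thumb as worded on this card; the function name and return strings are made up for illustration, and the interaction caveat is not handled.

```python
def gelman_hill_action(significant: bool, expected_sign: bool) -> str:
    """Map the two checks (significance, expected sign) to the suggested action."""
    if significant and expected_sign:
        return "keep"
    if not significant and expected_sign:
        return "fine to keep"
    if significant and not expected_sign:
        return "investigate: possible lurking variables"
    return "remove"


# Example: a predictor that is not significant but has the expected sign.
print(gelman_hill_action(significant=False, expected_sign=True))
```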
Describe Backwards Selection
- Start with all possible predictors
- Remove predictor with largest p-value
- Refit model
- Repeat until all predictors included are significant.
Remember: if an interaction is kept, its main effects must also be kept.
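
A minimal sketch of the backward loop, assuming a hypothetical callback `fit_pvalues` that refits the model and returns a p-value for each predictor currently included. The 0.05 cut-off is an assumed default. The rule about keeping main effects of retained interactions is not enforced here.

```python
from typing import Callable, Dict, List

# Hypothetical callback: refit the model on the given predictors and
# return a p-value for each of them.
PValueFn = Callable[[List[str]], Dict[str, float]]


def backward_select(predictors: List[str], fit_pvalues: PValueFn,
                    alpha: float = 0.05) -> List[str]:
    current = list(predictors)            # start with all possible predictors
    while current:
        pvals = fit_pvalues(current)      # refit the model on the current set
        worst = max(current, key=lambda name: pvals[name])
        if pvals[worst] <= alpha:         # every remaining predictor is significant
            break
        current.remove(worst)             # remove the predictor with the largest p-value
    return current


# Made-up p-values for illustration only; in a real refit they would
# change each time a predictor is removed.
toy = {"age": 0.01, "height": 0.40, "income": 0.20}
kept = backward_select(list(toy), lambda cur: {name: toy[name] for name in cur})
print(kept)
```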
Describe Forward Selection
- Start with the basic model (no predictors in model)
- Consider each possible predictor variable, one at a time. Include the predictor with the smallest p-value.
- Consider all remaining predictors (excluding those already included). Include the predictor with the smallest p-value.
- Repeat until no significant predictors are left.
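
The forward steps can be sketched the same way. Here `p_if_added` is a hypothetical callback that returns the p-value a candidate would have if added to the predictors already in the model; the 0.05 cut-off is an assumed default.

```python
from typing import Callable, List

# Hypothetical callback: p-value of `candidate` when it is added to a
# model that already contains `included`.
CandidatePFn = Callable[[List[str], str], float]


def forward_select(candidates: List[str], p_if_added: CandidatePFn,
                   alpha: float = 0.05) -> List[str]:
    included: List[str] = []              # basic model: no predictors
    remaining = list(candidates)
    while remaining:
        best = min(remaining, key=lambda name: p_if_added(included, name))
        if p_if_added(included, best) >= alpha:   # no significant predictors left
            break
        included.append(best)             # include the smallest p-value
        remaining.remove(best)            # next pass considers the rest
    return included


# Made-up p-values for illustration only; a real callback would refit
# the model, so the values would depend on what is already included.
toy = {"age": 0.01, "height": 0.40, "income": 0.03}
chosen = forward_select(list(toy), lambda included, name: toy[name])
print(chosen)
```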
Describe Stepwise Selection
A combination of backward and forward selection. Ensure the model is refit with each change.
Describe AIC Selection
1) Narrow the selection down to a small number of candidate models.
2) Specify the basic and full models, and include models between the two.
3) Choose the model with the smallest AIC score.
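
Step 3 can be sketched as follows, using the standard definition of AIC (twice the number of parameters minus twice the maximised log-likelihood). The model labels and numbers are made up for illustration; in practice the log-likelihoods would come from the fitted candidate models.

```python
from typing import Dict, Tuple


def aic(log_likelihood: float, n_params: int) -> float:
    """AIC = 2k - 2 ln(L); smaller is better."""
    return 2 * n_params - 2 * log_likelihood


def best_by_aic(models: Dict[str, Tuple[float, int]]) -> str:
    """`models` maps a label to (maximised log-likelihood, parameter count)."""
    return min(models, key=lambda name: aic(*models[name]))


# Made-up candidates: basic model, full model, and one in between.
candidates = {
    "basic": (-120.0, 2),
    "between": (-100.0, 4),
    "full": (-99.5, 7),
}
print(best_by_aic(candidates))
```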
What does a high/low/similar score in AIC mean?
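A high score means a worse balance between fit and complexity: the model fits poorly, or has many parameters that add little to the fit.
A low score means a better balance between goodness of fit and number of parameters. Scores are only meaningful when compared between models fitted to the same data.
Similar AIC scores mean the models fit similarly well. Which model to choose may then depend on whether you want to prioritise simplicity or including particular covariates.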
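
For reference, the usual definition of AIC shows where both parts of the score come from. Here k is the number of estimated parameters and L-hat is the maximised likelihood of the model (standard notation, not taken from these cards).

```latex
\mathrm{AIC} = 2k - 2\ln(\hat{L})
```

A better fit raises the likelihood and lowers the score, while each extra parameter adds 2 to it.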
Cons and Benefits of Gelman & Hill Approach
This approach is a helpful guideline for model building and invites discussion about variables. However, this method can be easily abused.
Cons and Benefits of Backward/Forward/Stepwise Approach
These approaches are easy and fast. However, the different methods can select different models as best.
Cons and Benefits of AIC Approach
This approach makes you think about the candidate models upfront, but it may lead to several models being a similarly good fit.
What are additive models?
Models in which the effects of the explanatory/predictor variables are summed, so each variable has its own independent effect on the outcome. Visually, the fitted lines for different groups are parallel, so there is only one slope on the plot.
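
As an illustration with two predictors (the symbols are generic notation, not taken from these cards), an additive model and the same model with an interaction term can be written as:

```latex
\begin{align*}
\text{Additive:} \quad & y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon \\
\text{With interaction:} \quad & y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \varepsilon
\end{align*}
```

If x_2 is a 0/1 group indicator, the additive model gives two parallel lines with the same slope, beta_1, and intercepts beta_0 and beta_0 + beta_2. The interaction term beta_3 is what would allow the slopes to differ.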