# The bias in linear models (L7) Flashcards

1
Q

What are standardized beta values?

A

They tell you the change in the outcome, in standard deviations, associated with a one-SD change in the predictor. Because all values are expressed in SD units, predictors measured on different scales are easy to compare.
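
As a rough illustration of the conversion, a standardized beta for a single predictor can be obtained by rescaling the raw slope: beta = b × SD(x) / SD(y). The sketch below assumes simple regression with one predictor; the data and the function names `slope` and `standardized_beta` are made up for the example.

```python
from statistics import mean, stdev

def slope(x, y):
    """Ordinary least-squares slope b for a single predictor."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (c - my) for a, c in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def standardized_beta(x, y):
    """Rescale the raw slope into SD units: beta = b * SD(x) / SD(y)."""
    return slope(x, y) * stdev(x) / stdev(y)

# Hypothetical data, invented for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]

b = slope(x, y)                # change in y per one-unit change in x
beta = standardized_beta(x, y) # change in y (in SDs) per one-SD change in x
print(f"b = {b:.3f}, beta = {beta:.3f}")
```

With one predictor, the standardized beta is the same number you would get by z-scoring both variables first and then fitting the line.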

2
Q

What is the hierarchical method of selecting predictors?

A

Experimenter-driven: the researcher decides the order in which predictors are added to the model. Useful for theory testing. Generally, known predictors are entered into the model first.

3
Q

What is forced entry method of selecting predictors?

A

Enter all the predictors at once and see what happens.

4
Q

What is stepwise method of selecting predictors?

A

Predictors are selected statistically, using their semi-partial correlation with the outcome; used only for exploratory analysis. The first predictor is the one with the highest correlation with the outcome; each later predictor is the one with the largest semi-partial correlation with the outcome, given the predictors already in the model.

5
Q

What are residual statistics?

A

The differences between the values predicted by the model and the observed data, used to assess the accuracy of the model.

6
Q

What are influential cases?

A

Specific cases that exert a dramatic influence on the model (NOT the same thing as outliers).

7
Q

What percentage of standardized residuals should lie between +/- 1.96 SDs of the mean?

A

95%.

8
Q

What percentage of standardized residuals should lie between +/- 2.5 SDs of the mean?

A

99%

9
Q

What is an outlier?

A

A case whose standardized residual has an absolute value greater than 3, i.e. it lies more than 3 SDs away from the mean.
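
A minimal sketch of that rule, assuming standardized residuals are computed as each residual divided by the standard error of the estimate, sqrt(SSE / (n - p)). The observed and predicted values are invented, and `standardized_residuals` and `flag_outliers` are hypothetical helper names.

```python
from math import sqrt

def standardized_residuals(observed, predicted, n_params=2):
    """Residuals divided by the standard error of the estimate."""
    residuals = [o - p for o, p in zip(observed, predicted)]
    sse = sum(e ** 2 for e in residuals)
    s = sqrt(sse / (len(residuals) - n_params))
    return [e / s for e in residuals]

def flag_outliers(z_resid, cutoff=3.0):
    """Indices of cases whose |standardized residual| exceeds the cutoff."""
    return [i for i, z in enumerate(z_resid) if abs(z) > cutoff]

# Hypothetical values: 19 cases close to the model, 1 case far from it.
predicted = [10.0] * 20
observed = [10.5, 9.5] * 9 + [10.5, 16.0]

z = standardized_residuals(observed, predicted)
flagged = flag_outliers(z)
print(flagged)
```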

10
Q

A

They alter the model (i.e., the gradient changes if they are included in the model).

11
Q

What is Cook’s distance?

A

A value produced for every case to quantify its influence on the model (done by calculating the model with and without each case and assessing the differences in the b values). Should be less than 1.
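
The "with and without each case" idea can be sketched directly. This is an illustration only, assuming simple regression (p = 2 parameters) and the usual definition D_i = Σ(ŷ_j − ŷ_j(i))² / (p × MSE); the data and the names `fit_line` and `cooks_distances` are hypothetical.

```python
from statistics import mean

def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line."""
    mx, my = mean(x), mean(y)
    b1 = sum((a - mx) * (c - my) for a, c in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return my - b1 * mx, b1

def cooks_distances(x, y):
    """Cook's distance per case, by refitting the model without that case."""
    n, p = len(x), 2
    b0, b1 = fit_line(x, y)
    fitted = [b0 + b1 * a for a in x]
    mse = sum((c - f) ** 2 for c, f in zip(y, fitted)) / (n - p)
    distances = []
    for i in range(n):
        # Refit with case i left out, then predict all cases again.
        c0, c1 = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        shift = sum((f - (c0 + c1 * a)) ** 2 for f, a in zip(fitted, x))
        distances.append(shift / (p * mse))
    return distances

# Hypothetical data: the last case sits far from the others on x and off the trend.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 20.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0, 7.1, 5.0]

d = cooks_distances(x, y)
worrying = [i for i, v in enumerate(d) if v > 1]  # cases above the cut-off of 1
print(worrying)
```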

12
Q

What are the assumptions of linear models?

A

1) Must be a continuous outcome.
2) Predictor variables should be continuous (or categorical with two categories).
3) Non-zero variance (predictor values must vary).
4) Independence (errors should be uncorrelated).
5) No multicollinearity (i.e., no high correlations between predictors).

13
Q

What are ZRESID and ZPRED?

A

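ZRESID is the standardized residuals and ZPRED is the standardized predicted values. Plotting one against the other assesses homogeneity of variance and linearity. Don’t want to see any patterns in the plot. Funnel shaped = heteroscedasticity, boomerang = non-linearity.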

14
Q

How do we diagnose multicollinearity?

A

Tolerance and VIF.

15
Q

What is tolerance?

A

1/VIF; should be > 0.2

16
Q

What is VIF?

A

1/tolerance; should be less than 10
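
To make the reciprocal relationship concrete, here is a small sketch for the special case of two predictors, where tolerance is 1 − r² (r being the correlation between the two predictors) and VIF is its reciprocal. With more predictors, r² would be replaced by the R² from regressing one predictor on all the others. The data and function names are hypothetical.

```python
from math import sqrt
from statistics import mean

def pearson_r(a, b):
    """Pearson correlation between two equal-length lists."""
    ma, mb = mean(a), mean(b)
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / sqrt(saa * sbb)

def tolerance_and_vif(x1, x2):
    """Two-predictor case: tolerance = 1 - r^2, VIF = 1 / tolerance."""
    tol = 1 - pearson_r(x1, x2) ** 2
    return tol, 1 / tol

# Hypothetical predictors that are almost perfectly correlated.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]

tol, vif = tolerance_and_vif(x1, x2)
problem = tol < 0.2 or vif > 10  # cut-offs taken from the cards above
print(f"tolerance = {tol:.4f}, VIF = {vif:.1f}, problem = {problem}")
```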

17
Q

What percentage of standardized residuals should fall within +/- 3.29?

A

99.9%
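
These cut-offs come from the normal distribution, and they can be checked with the standard library alone. The sketch assumes the standardized residuals are normally distributed; `share_within` is a made-up helper name.

```python
from math import erf, sqrt

def share_within(z):
    """Proportion of a standard normal distribution lying within +/- z."""
    return erf(z / sqrt(2))

for z in (1.96, 2.5, 3.29):
    print(f"+/-{z}: {share_within(z) * 100:.2f}%")
```

The value for ±2.5 comes out at roughly 98.8%, which the cards round to 99%.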

18
Q

How do we assess independence?

A

Durbin-Watson test: the statistic varies between 0 and 4; a value of 2 indicates that the residuals are uncorrelated (full independence).
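
For reference, the statistic is the sum of squared differences between successive residuals divided by the sum of squared residuals. A minimal sketch, with invented residual sequences, shows how it moves away from 2 in either direction.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences over sum of squared residuals."""
    num = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    den = sum(e ** 2 for e in residuals)
    return num / den

# Hypothetical residuals, ordered as the cases were collected.
runs = [1.0, 1.0, 1.0, 1.0, -1.0, -1.0, -1.0, -1.0]  # neighbours look alike
flips = [1.0, -1.0] * 10                              # neighbours alternate

dw_runs = durbin_watson(runs)    # below 2: positive autocorrelation
dw_flips = durbin_watson(flips)  # above 2: negative autocorrelation
print(dw_runs, dw_flips)
```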

19
Q

What does the information below the ‘model summary’ on a multiple regression show you?

A

What predictors are being tested, but only the ones in the model will be significant if you are using step-wise or forced entry.

20
Q

What is special about the casewise diagnostics output?

A

Finds outliers for you, only 5% of cases should appear here. If you remove any from your data set you must explain why!