The bias in linear models (L7) Flashcards

Question 1

Q

What are standardized beta values?

Answer

A

They tell you the change in the outcome, in standard deviations, associated with a one-SD change in a predictor. Because every value is expressed in SDs, multiple predictors are easy to compare.
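
As a rough illustration, a standardized beta can be obtained by rescaling the unstandardized slope b by the ratio of the two standard deviations. The sketch below assumes a single predictor; the data and the names `simple_slope` and `standardized_beta` are hypothetical.

```python
from statistics import mean, stdev

def simple_slope(x, y):
    """Least-squares slope b for the model y = a + b*x."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

def standardized_beta(x, y):
    """Express b in SD units: beta = b * SD(x) / SD(y)."""
    return simple_slope(x, y) * stdev(x) / stdev(y)

# Hypothetical data: hours of revision (x) and exam score (y).
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 70]

b = simple_slope(hours, score)          # change in score per extra hour
beta = standardized_beta(hours, score)  # change in SDs of score per SD of hours
```

With one predictor, the standardized beta equals the Pearson correlation between predictor and outcome.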

Question 2

Q

What is the hierarchical method of selecting predictors?

Answer

A

Experimenter driven: the researcher decides the order in which predictors are added to the model. Useful for theory testing. Generally, known predictors are entered into the model first.

Question 3

Q

What is the forced entry method of selecting predictors?

Answer

A

Enter all the predictors at once and see what happens.

Question 4

Q

What is the stepwise method of selecting predictors?

Answer

A

Predictors are selected statistically, using their semi-partial correlation with the outcome. Use only for exploratory analysis. The first predictor entered is the one with the highest correlation with the outcome; the second is the one with the largest semi-partial correlation with the outcome given the first, and so on.
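
A minimal sketch of the first two steps of that selection, assuming only the correlation-based ranking described on this card (real stepwise procedures also apply significance criteria for entry and removal, which are omitted here). The data and the names `pearson_r` and `first_two_steps` are hypothetical.

```python
def pearson_r(u, v):
    """Pearson correlation between two equal-length lists."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / (suu * svv) ** 0.5

def first_two_steps(predictors, y):
    """Pick the first predictor by correlation, the second by semi-partial correlation."""
    first = max(predictors, key=lambda name: abs(pearson_r(predictors[name], y)))
    r_y1 = pearson_r(predictors[first], y)

    def semi_partial(name):
        # Correlation of y with the part of this predictor not shared with `first`.
        r_y2 = pearson_r(predictors[name], y)
        r_12 = pearson_r(predictors[first], predictors[name])
        return (r_y2 - r_y1 * r_12) / (1 - r_12 ** 2) ** 0.5

    remaining = [name for name in predictors if name != first]
    second = max(remaining, key=lambda name: abs(semi_partial(name)))
    return first, second

# Hypothetical data: "b" overlaps heavily with "a"; "c" is unrelated to "a".
predictors = {
    "a": [-1, -1, -1, -1, 1, 1, 1, 1],
    "b": [-1.5, -0.5, -1.5, -0.5, 0.5, 1.5, 0.5, 1.5],
    "c": [-1, -1, 1, 1, -1, -1, 1, 1],
}
outcome = [-5, -5, -1, -1, 1, 1, 5, 5]

order = first_two_steps(predictors, outcome)
```

Here "b" has the second-highest simple correlation with the outcome, but "c" is entered second because "b" adds almost nothing beyond "a".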

Question 5

Q

What are residual statistics?

Answer

A

The differences between the model's predictions and the observed data, used to assess the accuracy of the model.

Question 6

Q

What are influential cases?

Answer

A

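Specific cases that have a disproportionate effect on the model's parameters. They are NOT the same as outliers: an influential case can pull the model towards itself and so have a small residual.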

Question 7

Q

What percentage of standardized residuals should lie within +/- 1.96 SDs of the mean?

Answer

A

95%

Question 8

Q

What percentage of standardized residuals should lie within +/- 2.5 SDs of the mean?

Answer

A

About 99% (the exact cut-off for 99% is +/- 2.58).

Question 9

Q

What is an outlier?

Answer

A

A case whose standardized residual has an absolute value greater than 3, i.e. it lies more than 3 SDs from the mean.
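
The residual checks on these cards can be sketched as follows, assuming a single predictor and residuals standardized by the standard error of the estimate. The data and the names `standardized_residuals` and `share_within` are hypothetical.

```python
from statistics import mean

def standardized_residuals(x, y):
    """Residuals from a simple regression, divided by their standard error."""
    mx, my = mean(x), mean(y)
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    a = my - b * mx
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    # Standard error of the estimate: sqrt(SS_residual / (n - 2)) for one predictor.
    s = (sum(e ** 2 for e in residuals) / (len(x) - 2)) ** 0.5
    return [e / s for e in residuals]

def share_within(z, limit):
    """Proportion of standardized residuals with absolute value <= limit."""
    return sum(abs(v) <= limit for v in z) / len(z)

# Hypothetical data: 20 cases near a straight line; case 10 is shifted up by 30.
x = list(range(20))
y = [2 * xi + (1 if xi % 2 == 0 else -1) for xi in x]
y[10] += 30

z = standardized_residuals(x, y)
within_196 = share_within(z, 1.96)                     # compare with 95%
outliers = [i for i, v in enumerate(z) if abs(v) > 3]  # flags case 10
```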

Question 10

Q

Why are influential cases bad?

Answer

A

They alter the model (i.e. the gradient changes depending on whether they are included in the model).

Question 11

Q

What is Cook’s distance?

Answer

A

A value produced for every case to quantify its influence on the model (calculated by fitting the model with and without each case and assessing the difference in the b values). It should be less than 1.
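
A minimal sketch of that with-and-without comparison, assuming a single predictor: each case is dropped in turn, the line is refitted, and the squared shift in all fitted values is scaled by p times the mean squared error. The data and the names `fit_line` and `cooks_distances` are hypothetical.

```python
from statistics import mean

def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line."""
    mx, my = mean(x), mean(y)
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    return my - b * mx, b

def cooks_distances(x, y):
    """Cook's distance for every case, by refitting without that case."""
    n, p = len(x), 2  # p = number of estimated parameters (intercept and slope)
    a, b = fit_line(x, y)
    fitted = [a + b * xi for xi in x]
    mse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted)) / (n - p)
    distances = []
    for i in range(n):
        # Refit the model with case i left out.
        a_i, b_i = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        shift = sum((fi - (a_i + b_i * xi)) ** 2 for fi, xi in zip(fitted, x))
        distances.append(shift / (p * mse))
    return distances

# Hypothetical data: the last case sits far from the others on x and off the trend.
x = [1, 2, 3, 4, 5, 6, 7, 8, 20]
y = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0, 6.9, 8.1, 5.0]

distances = cooks_distances(x, y)
influential = [i for i, d in enumerate(distances) if d > 1]
```

Only the last case exceeds the threshold of 1 here; removing it changes the slope substantially.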

Question 12

Q

What are the assumptions of linear models?

Answer

A

1) The outcome must be continuous.
2) Predictor variables should be continuous (or categorical with two categories).
3) Non-zero variance (predictor values must vary).
4) Independence (errors should be uncorrelated).
5) No multicollinearity (predictors should not be highly correlated with each other).

Question 13

Q

What are ZRESID and ZPRED?

Answer

A

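Standardized residuals (ZRESID) and standardized predicted values (ZPRED). Plotting one against the other assesses homogeneity of variance and linearity. You don't want to see any pattern in the plot: a funnel shape indicates heteroscedasticity, a boomerang (curved) shape indicates non-linearity.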
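
A sketch of how the two quantities might be computed for a single predictor. In practice you would inspect the scatterplot; the comparison of residual spread in the lower and upper halves of ZPRED is only a crude numeric stand-in for spotting a funnel, not a formal test. The data and the name `zpred_zresid` are hypothetical.

```python
from statistics import mean, stdev

def zpred_zresid(x, y):
    """Return (ZPRED, ZRESID) pairs for a simple regression of y on x."""
    mx, my = mean(x), mean(y)
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    a = my - b * mx
    predicted = [a + b * xi for xi in x]
    residuals = [yi - pi for yi, pi in zip(y, predicted)]
    s = (sum(e ** 2 for e in residuals) / (len(x) - 2)) ** 0.5
    mp, sp = mean(predicted), stdev(predicted)
    zpred = [(pi - mp) / sp for pi in predicted]  # z-scores of predicted values
    zresid = [e / s for e in residuals]           # residuals in SE units
    return list(zip(zpred, zresid))

# Hypothetical data in which the error grows with x (a funnel).
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [0.9, 2.2, 2.7, 4.4, 4.5, 6.6, 6.3, 8.8, 8.1, 11.0]

pairs = sorted(zpred_zresid(x, y))  # ordered by ZPRED
half = len(pairs) // 2
spread_low = mean(abs(zr) for _, zr in pairs[:half])
spread_high = mean(abs(zr) for _, zr in pairs[half:])
```

A clearly larger `spread_high` than `spread_low` is what a funnel in the ZRESID against ZPRED plot would look like numerically.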

Question 14

Q

How do we diagnose multicollinearity?

Answer

A

Tolerance and VIF.

Question 15

Q

What is tolerance?

Answer

A

1/VIF; should be > 0.2

Question 16

Q

What is VIF?

Answer

A

1/tolerance; should be less than 10
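
A minimal sketch of the two statistics, assuming exactly two predictors so that the R-squared from regressing one predictor on the other is just their squared correlation (with more predictors it comes from a multiple regression of each predictor on all the others). The data and the names `pearson_r` and `tolerance_and_vif` are hypothetical.

```python
def pearson_r(u, v):
    """Pearson correlation between two equal-length lists."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / (suu * svv) ** 0.5

def tolerance_and_vif(x1, x2):
    """Tolerance = 1 - R^2 between the predictors; VIF = 1 / tolerance."""
    tolerance = 1 - pearson_r(x1, x2) ** 2
    return tolerance, 1 / tolerance

# Hypothetical predictors that are fairly strongly correlated.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]

tolerance, vif = tolerance_and_vif(x1, x2)
acceptable = tolerance > 0.2 and vif < 10
```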

Question 17

Q

What percentage of standardized residuals should fall within +/- 3.29?

Answer

A

99.9%

Question 18

Q

How do we assess independence?

Answer

A

Durbin-Watson test: values range from 0 to 4, and a value of 2 indicates that the residuals are uncorrelated (full independence).
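
The statistic is the sum of squared differences between successive residuals divided by the sum of squared residuals. A minimal sketch, with hypothetical residuals and the hypothetical name `durbin_watson`:

```python
def durbin_watson(residuals):
    """Sum of squared successive differences over the sum of squared residuals."""
    numerator = sum((curr - prev) ** 2
                    for prev, curr in zip(residuals, residuals[1:]))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Hypothetical residuals, in the order the cases were collected.
alternating = [1, -1, 1, -1, 1, -1]  # neighbours disagree: value above 2
trending = [-3, -2, -1, 1, 2, 3]     # neighbours agree: value below 2

dw_alternating = durbin_watson(alternating)
dw_trending = durbin_watson(trending)
```

Values well below 2 suggest positive correlation between adjacent residuals, and values well above 2 suggest negative correlation.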

Question 19

Q

What does the information below the ‘model summary’ on a multiple regression show you?

Answer

A

Which predictors are being tested. Only the ones in the model will be significant if you are using stepwise or forced entry.

Question 20

Q

What is special about the casewise diagnostics output?

Answer

A

It finds outliers for you; no more than 5% of cases should appear here. If you remove any from your data set, you must explain why!

(20 cards)