The bias in linear models (L7) Flashcards

1
Q

What are standardized beta values?

A

The number of standard deviations (SDs) by which the outcome changes for a one-SD change in the predictor. Because every predictor is expressed in SD units, the betas of multiple predictors can be compared directly.
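
A minimal sketch of the rescaling, assuming a single predictor: the unstandardized slope b is multiplied by SD(predictor) / SD(outcome). The function names and the data are made up for illustration.

```python
import statistics

def slope(x, y):
    """Unstandardized slope b of a one-predictor least-squares line."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

def standardized_beta(b, x, y):
    """Rescale b into SD units: beta = b * SD(x) / SD(y)."""
    return b * statistics.stdev(x) / statistics.stdev(y)

# Made-up data, for illustration only.
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]

b = slope(hours, score)
beta = standardized_beta(b, hours, score)
print(round(b, 3), round(beta, 3))
```

With only one predictor, the standardized beta equals the Pearson correlation between predictor and outcome.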

2
Q

What is the hierarchical method of selecting predictors?

A

Experimenter-driven: the researcher decides the order in which predictors are added to the model. Useful for theory testing. Generally, known predictors are entered into the model first.

3
Q

What is forced entry method of selecting predictors?

A

Enter all the predictors into the model at once and see what happens.

4
Q

What is stepwise method of selecting predictors?

A

Predictors are selected statistically, based on their semi-partial correlation with the outcome; use only for exploratory analysis. The first predictor entered is the one with the highest correlation with the outcome; each later predictor is the one with the largest semi-partial correlation with the outcome, given the predictors already in the model.

5
Q

What are residual statistics?

A

The differences between the values predicted by the model and the observed data, used to assess the accuracy of the model.

6
Q

What are influential cases?

A

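Specific cases that have a disproportionately large effect on the model’s parameters. They are NOT the same as outliers: an influential case can pull the model towards itself and so have only a small residual.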

7
Q

What percentage of standardized residuals should lie between +/- 1.96 SDs of the mean?

A

95%.

8
Q

What percentage of standardized residuals should lie between +/- 2.5 SDs of the mean?

A

About 99% (the exact 99% cutoff is +/- 2.58 SDs).

9
Q

What is an outlier?

A

A case whose standardized residual has an absolute value greater than 3, i.e. it lies more than 3 SDs from the mean.

10
Q

Why are influential cases bad?

A

They alter the model (i.e. the gradient changes depending on whether they are included in the model).

11
Q

What is Cook’s distance?

A

A value calculated for every case to quantify its influence on the model (done by calculating the model with and without each case and assessing the differences in the b values). Should be less than 1.
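
The "with and without each case" idea can be sketched directly. This is a minimal illustration for a one-predictor model, not how any statistics package computes it: the model is refitted with each case left out, and the squared shift in all fitted values is scaled by p times the mean squared error (p = 2 parameters here). The function names and data are made up.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for one predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return my - b1 * mx, b1

def cooks_distances(x, y):
    """Cook's distance for every case via leave-one-out refitting."""
    n, p = len(x), 2  # p = number of parameters (intercept and slope)
    b0, b1 = fit_line(x, y)
    fitted = [b0 + b1 * xi for xi in x]
    mse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted)) / (n - p)
    distances = []
    for i in range(n):
        # Refit the model without case i.
        c0, c1 = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        # Squared change in every fitted value caused by dropping case i.
        shift = sum((fi - (c0 + c1 * xi)) ** 2 for xi, fi in zip(x, fitted))
        distances.append(shift / (p * mse))
    return distances

# Made-up data: the last case sits far from the others and off the trend.
x = [1, 2, 3, 4, 5, 10]
y = [1, 2, 3, 4, 5, 0]
d = cooks_distances(x, y)
print([round(v, 2) for v in d])
```

In this example only the last case exceeds the threshold of 1.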

12
Q

What are the assumptions of linear models?

A

1) The outcome must be continuous.
2) Predictor variables should be continuous (or categorical with two categories).
3) Non-zero variance (predictor values must vary).
4) Independence (errors should be uncorrelated).
5) No multicollinearity (no high correlation between predictors).

13
Q

What are ZRESID and ZPRED?

A

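ZPRED is the standardized predicted values and ZRESID is the standardized residuals. Plotting ZRESID against ZPRED assesses homogeneity of variance and linearity. You don’t want to see any pattern in the plot: funnel-shaped = heteroscedasticity, boomerang-shaped = non-linearity.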

14
Q

How do we diagnose multicollinearity?

A

Tolerance and VIF.

15
Q

What is tolerance?

A

1/VIF; should be > 0.2

16
Q

What is VIF?

A

1/tolerance; should be less than 10
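
A minimal sketch of how the two statistics relate, assuming a model with exactly two predictors. In that case the R² from regressing one predictor on the other is just their squared correlation, so tolerance = 1 - r² and VIF = 1 / tolerance. The function names and data are made up for illustration.

```python
import math

def pearson_r(a, b):
    """Pearson correlation between two equally long lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    saa = sum((ai - ma) ** 2 for ai in a)
    sbb = sum((bi - mb) ** 2 for bi in b)
    return sab / math.sqrt(saa * sbb)

def tolerance_and_vif(x1, x2):
    """Tolerance and VIF for a model with exactly two predictors."""
    tolerance = 1 - pearson_r(x1, x2) ** 2
    return tolerance, 1 / tolerance

# Made-up predictors that correlate at r = 0.8.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]
tol, vif = tolerance_and_vif(x1, x2)
print(round(tol, 2), round(vif, 2))
```

Even with a correlation of 0.8, these predictors stay inside both rules of thumb (tolerance above 0.2, VIF below 10).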

17
Q

What percentage of standardized residuals should fall within +/- 3.29?

A

99.9%
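
The cutoffs quoted in the standardized-residual cards can be checked against the standard normal distribution. This is a small sketch using only the standard library; `share_within` is a made-up helper name.

```python
import math

def share_within(z):
    """Proportion of a standard normal distribution lying within +/- z."""
    return math.erf(z / math.sqrt(2))

for cutoff in (1.96, 2.5, 2.58, 3.29):
    print(f"+/- {cutoff}: {share_within(cutoff):.2%}")
```

The value for 2.5 comes out slightly below 99%, which is why 2.58 is the cutoff usually quoted for exactly 99%.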

18
Q

How do we assess independence?

A

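Durbin-Watson test: the statistic varies between 0 and 4. A value of 2 means the residuals are uncorrelated (full independence); values below 2 indicate positive correlation between adjacent residuals, and values above 2 indicate negative correlation.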
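
A minimal sketch of the statistic itself: the sum of squared differences between adjacent residuals divided by the sum of squared residuals. The residuals are assumed to be in their original case order, and the example values are made up.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals in their original order."""
    diffs = sum((residuals[t] - residuals[t - 1]) ** 2
                for t in range(1, len(residuals)))
    return diffs / sum(e ** 2 for e in residuals)

# Identical adjacent residuals (positive correlation) push the value to 0.
print(durbin_watson([1, 1, 1, 1]))
# Alternating residuals (negative correlation) push it towards 4.
print(durbin_watson([1, -1, 1, -1]))
```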

19
Q

What does the information below the ‘model summary’ on a multiple regression show you?

A

Which predictors are being tested; if you are using stepwise or forced entry, only the predictors in the model will be significant.

20
Q

What is special about the casewise diagnostics output?

A

It finds outliers for you; only about 5% of cases should appear here. If you remove any cases from your data set, you must explain why!
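
The same check can be sketched by hand. This assumes the standardized residuals are already available and that cases are listed when they fall beyond +/- 2, the cutoff that matches the 5% figure; both the cutoff and the function name are assumptions for illustration.

```python
def flag_cases(z_resid, cutoff=2.0):
    """Return the indices of cases beyond +/- cutoff and their share of all cases."""
    flagged = [i for i, z in enumerate(z_resid) if abs(z) > cutoff]
    return flagged, len(flagged) / len(z_resid)

# Made-up standardized residuals, for illustration only.
z = [0.3, -1.1, 2.4, 0.8, -0.2, 1.5, -3.1, 0.1, 0.9, -0.6]
flagged, share = flag_cases(z)
print(flagged, share)
```

Here 20% of the cases are flagged, well above the expected 5%, and the case with a residual of -3.1 would also count as an outlier.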