Regression: The Linear Model Flashcards

(53 cards)

1
Q

What does a linear model with several predictors look like on a graph?

A

A 3D regression plane (two predictors give a flat plane in three dimensions; more predictors give a higher-dimensional plane)

2
Q

SSR

A

Residual sum of squares: the sum of the squared differences between observed and predicted values. Measures how well a linear model fits the data (smaller values mean a better fit).

3
Q

What is cross-validation of a linear regression model?

A

Checking that the model accurately predicts the same outcome in a different group of people.

4
Q

Methods of cross validation

A

Adjusted R squared
Stein's method
Data splitting
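Of these, data splitting is the most direct to demonstrate: fit the model on half the sample and compute R squared on the held-out half. A minimal NumPy sketch on simulated data (the data and coefficients below are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulated data: one outcome, two predictors (illustrative only)
n = 200
X = rng.normal(size=(n, 2))
y = 2.0 + 1.5 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(size=n)

# Split the sample in half: fit on one group, test on the other
X_train, X_test = X[: n // 2], X[n // 2 :]
y_train, y_test = y[: n // 2], y[n // 2 :]

# Fit ordinary least squares on the training half
design = np.column_stack([np.ones(len(X_train)), X_train])
b, *_ = np.linalg.lstsq(design, y_train, rcond=None)

# Predict the held-out half and compute R squared there
design_test = np.column_stack([np.ones(len(X_test)), X_test])
y_pred = design_test @ b
ss_res = np.sum((y_test - y_pred) ** 2)
ss_tot = np.sum((y_test - y_test.mean()) ** 2)
r2_holdout = 1 - ss_res / ss_tot
print(round(r2_holdout, 3))
```

If the hold-out R squared is much lower than the training R squared, the model does not cross-validate well.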

5
Q

What does adjusted R squared do?

A

Tells you how much variance in Y would be accounted for if the model had been derived from the population.

6
Q

What does Stein's formula do?

A

Tells you how well the model cross-validates (how well it would predict in a different sample).
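Both adjusted R squared and Stein's formula can be computed directly from the sample R squared, the sample size n, and the number of predictors k. A small Python sketch, using the standard adjusted R squared formula and Stein's formula as commonly stated (e.g. in Field's Discovering Statistics); the input values are made up:

```python
def adjusted_r2(r2, n, k):
    """Adjusted R squared: estimate of R squared if the model
    had been derived from the population."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

def stein_r2(r2, n, k):
    """Stein's formula: estimate of how well the model
    cross-validates to a different sample."""
    factor = ((n - 1) / (n - k - 1)) * ((n - 2) / (n - k - 2)) * ((n + 1) / n)
    return 1 - factor * (1 - r2)

# Made-up example: R squared of .50 from 100 cases and 3 predictors
print(round(adjusted_r2(0.50, 100, 3), 3))  # 0.484
print(round(stein_r2(0.50, 100, 3), 3))     # 0.463
```

Stein's estimate is always a little lower than the adjusted R squared, since predicting a new sample is harder than describing the current one.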

7
Q

What are two oversimplified common rules of thumb for sample size when using a linear model?

A

10 cases per predictor, or 15 cases per predictor

8
Q

What is a good method of deciding the desired sample size?

A

Base it on the desired effect size and the amount of power wanted for statistical significance.

9
Q

Size of sample for large effect

A

77 participants with up to 20 predictors

10
Q

If a medium effect is expected, use a sample size of

A

55-150 (20 predictors)

11
Q

If a small effect is expected, use a sample size of

A

1043 cases with 20 predictors

12
Q

3 main stages in fitting a linear model

A

Initial data checks
Run initial regression
Check residuals

13
Q

4 steps in initial checks when fitting a linear model

A

Check linearity and look for unusual cases
Use graphs: scatterplots
If there is a lack of linearity:
Transform the data

14
Q

Fitting linear regression model: run initial regression

A

Save diagnostic statistics

15
Q

Fitting linear regression model: check residuals

A

Use ZPRED vs. ZRESID plots to check three things:
Linearity
Homoscedasticity
Independence
Check normality with a histogram
16
Q

Fit general linear model: if GLM assumptions are met and there is no bias

A

The model can be generalised

17
Q

Fit general linear model: if heteroscedasticity is found, use either

A

Weighted least squares regression
OR
Bootstrap and transform data
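As a sketch of the first option: weighted least squares down-weights observations with larger error variance, solving b = (XᵀWX)⁻¹XᵀWy for a diagonal weight matrix W. A minimal NumPy implementation on simulated heteroscedastic data with idealised weights (in practice the weights must be estimated; all numbers here are made up):

```python
import numpy as np

def wls(X, y, w):
    """Weighted least squares: solve (X'WX) b = X'W y,
    where W is a diagonal matrix of case weights."""
    Xw = X * w[:, None]          # each row of X scaled by its weight
    return np.linalg.solve(X.T @ Xw, Xw.T @ y)

rng = np.random.default_rng(0)
n = 500
x = rng.uniform(1, 5, n)
sd = 0.5 * x                     # error spread grows with x: heteroscedastic
y = 1.0 + 2.0 * x + rng.normal(scale=sd)

X = np.column_stack([np.ones(n), x])
w = 1.0 / sd**2                  # idealised weights: inverse error variance
b = wls(X, y, w)
print(b.round(2))                # close to the true values [1.0, 2.0]
```

With weights of 1 for every case, `wls` reduces to ordinary least squares.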

18
Q

Fit general linear model: if residuals are not normal

A

Bootstrap and transform
OR
Use a multilevel model

19
Q

Fit general linear model: if the data lack independence

A

Use a multilevel model

20
Q

GLM: definition of multicollinearity

A

Strong correlation between two or more predictor variables

21
Q

Is less than perfect collinearity avoidable?

A

No. It is virtually unavoidable

22
Q

What does perfect collinearity mean?

A

One predictor variable has a perfect linear correlation with another predictor (or a combination of the other predictors)

23
Q

GLM: multicollinearity and untrustworthy b values

A

As the correlation between two predictor variables increases, the standard error of the b values increases. This increases the chance that a b is unrepresentative of the population.

24
Q

Compare GLM with one predictor: what does a large value of R squared mean?

A

A better fit of the model

25
Q

Compare GLM with one predictor: test statistic for assessing the significance of R squared

A

The F statistic: how much variability the model explains relative to what it doesn't explain.

26
Q

What is the b value?

A

The gradient of the regression line; the strength of the relationship.

27
Q

What does multicollinearity do to R?

A

Limits its size.

28
Q

Why does multicollinearity limit the size of R?

A

Correlated predictor variables account for the same portions of variance, so each contributes less unique variance to R squared.

29
Q

Why is multicollinearity a problem?

A

It makes it difficult to assess the importance of each predictor variable.

30
Q

Two steps for identifying multicollinearity

A

Scan the correlation matrix
Compute variance inflation factors

31
Q

Identifying multicollinearity: scan the correlation matrix

A

Find highly correlated predictor variables (r >= .8 or .9).

32
Q

Identifying multicollinearity: what does the variance inflation factor indicate?

A

Whether a predictor variable has a strong linear correlation with another predictor variable.

33
Q

Identifying multicollinearity: interpretation of the variance inflation factor

A

Largest VIF >= 10 (tolerance = 1/VIF <= .10): a serious problem is indicated.
Average VIF substantially greater than 1: the regression may be biased.
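The VIF for predictor j is obtained by regressing that predictor on all the others: VIF_j = 1 / (1 - R²_j). A NumPy sketch on made-up data in which two predictors are deliberately near-duplicates:

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X:
    VIF_j = 1 / (1 - R2 from regressing column j on the rest)."""
    n, k = X.shape
    out = []
    for j in range(k):
        target = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        b, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ b
        r2 = 1 - resid.var() / target.var()
        out.append(1 / (1 - r2))
    return np.array(out)

rng = np.random.default_rng(1)
x1 = rng.normal(size=300)
x2 = x1 + rng.normal(scale=0.1, size=300)   # near-duplicate of x1: collinear
x3 = rng.normal(size=300)                    # independent predictor
vifs = vif(np.column_stack([x1, x2, x3]))
print(vifs.round(1))   # x1 and x2 get large VIFs; x3 stays near 1
```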
34
Q

What is an eigenvalue?

A

The length of an axis of an ellipse drawn around a scatterplot of the data.

35
Q

What is an eigenvector?

A

The direction of an axis of an ellipse drawn around the scatterplot of the data.
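The ellipse picture corresponds to the eigendecomposition of the data's covariance matrix: eigenvalues measure the spread along each axis of the ellipse, and eigenvectors give the directions of those axes. A tiny NumPy sketch with a made-up covariance matrix:

```python
import numpy as np

# Covariance matrix of two uncorrelated variables with
# variances 2 and 1 (made-up values for illustration)
cov = np.array([[2.0, 0.0],
                [0.0, 1.0]])

# Eigenvalues = spread along each axis of the data ellipse;
# eigenvectors (columns) = the directions of those axes.
# eigh returns eigenvalues in ascending order.
eigenvalues, eigenvectors = np.linalg.eigh(cov)
print(eigenvalues)   # [1. 2.]
```

Two similar eigenvalues mean a roughly circular cloud (little collinearity); one eigenvalue near zero means a cigar-shaped cloud (strong collinearity).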
36
Q

What happens to residuals when a model is a poor fit?

A

Residuals will be large.

37
Q

What are the three types of residuals?

A

Standardised, unstandardised, and studentised.

38
Q

What are unstandardised residuals?

A

The raw differences between predicted and observed scores.

39
Q

What are standardised residuals?

A

Unstandardised residuals converted to z-scores.

40
Q

What are studentised residuals?

A

Unstandardised residuals divided by an estimate of their standard deviation.

41
Q

Name six ways to assess influential cases.

A

Mahalanobis distance, Cook's distance, deleted residuals, studentised deleted residuals, leverage (hat) values, DFFit.

42
Q

What is the adjusted predicted value?

A

The predicted value of the outcome for a case from a model in which that case has been deleted.

43
Q

What is the deleted residual?

A

The difference between the adjusted predicted value and the observed value.

44
Q

What is the studentised deleted residual?

A

A deleted residual divided by its standard error.

45
Q

What is the leverage (hat) value?

A

The influence of the observed value of the outcome variable over the predicted values.

46
Q

What is Mahalanobis distance?

A

The distance of a case from the means of the predictor variables.

47
Q

What is Cook's distance?

A

A measure of the overall influence of a case on the model.

48
Q

What is DFFit?

A

The difference between the adjusted predicted value and the original predicted value.

49
Q

What is DFBeta?

A

The difference between a parameter estimated using all cases and the same parameter estimated when one case is excluded.

50
Q

What is the covariance ratio (CVR)?

A

Quantifies the degree to which a case influences the variance of the regression parameters.
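Several of these measures fall out of the hat (projection) matrix H = X(XᵀX)⁻¹Xᵀ: its diagonal gives the leverage values, and Cook's distance combines the residual with the leverage. A NumPy sketch on simulated data (all names and numbers are made up) that plants one extreme case and checks that Cook's distance flags it:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 50
x = rng.normal(size=n)
y = 3.0 + 2.0 * x + rng.normal(scale=0.5, size=n)
x[0], y[0] = 5.0, -10.0        # plant one highly influential case

X = np.column_stack([np.ones(n), x])
H = X @ np.linalg.inv(X.T @ X) @ X.T
leverage = np.diag(H)           # hat values: influence of each case's x

b = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ b
k = X.shape[1]                  # number of parameters (intercept + slope)
s2 = resid @ resid / (n - k)    # residual variance estimate

# Cook's distance: overall influence of each case on the model
cooks = (resid**2 / (k * s2)) * leverage / (1 - leverage) ** 2
print(int(np.argmax(cooks)))    # case 0 is flagged as most influential
```

A common rule of thumb is to inspect cases with Cook's distance greater than 1.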
51
Q

How do you check the assumptions of homoscedasticity and linearity of residuals?

A

Plot standardised predicted values against standardised residual values: a random array of dots means the data are linear and homoscedastic.
Partial plots (residuals of the outcome variable against each predictor variable): evenly spaced dots around the line indicate homoscedasticity.

52
Q

How do you test the normality of residuals?

A

Histogram and probability plot.

53
Q

What statistics does Bayesian regression give?

A

An estimate of b and 95% credible intervals for the model parameters, e.g. a 95% probability that the population value of b lies between...