Linear Regression Flashcards
(98 cards)
What is a simple linear regression?
It predicts ONE variable from another
Can a significant coefficient (p < .05) tell us about the magnitude of an effect?
NO. It tells us whether the estimate is significantly different from ZERO, but not about the magnitude of the effect.
What does a p value really tell us in a coefficient table?
Whether the coefficient estimate is significantly different from zero, i.e. more than would be expected by chance alone. It’s just a YES or NO.
It does not tell us about the magnitude of the effect though.
In the coefficient output, for a simple linear regression, when we are looking to see what is the value of Y when X is 0, we are looking at the intercept. What coefficient value do we look to?
Unstandardised B next to the “intercept” word in the output.
The intercept is a CONSTANT value - remember this.
And remember, the unstandardised B next to the predictor "variable", beneath the "intercept" row, is the slope! Its sign tells us the direction of the relationship, and its value is the change in Y for every 1-unit change in X. The standardised beta (β) would give the magnitude of the effect.
Remember our model has this formula: positive affect (the outcome variable) is predicted by, first, the intercept, which in this output is the CONSTANT UNSTANDARDISED B of 2.853. That is the value of Y when X is 0. Sometimes it's called b naught (b0).
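To make the intercept and slope concrete, here is a minimal sketch of fitting a simple linear regression by hand. The function name and the data are made up for illustration; they are not the positive affect data from the output above.

```python
import numpy as np

def fit_simple_regression(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Slope (unstandardised B for the predictor): how much y changes per 1-unit change in x.
    slope = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    # Intercept (the constant, b0): the predicted value of y when x is 0.
    intercept = y.mean() - slope * x.mean()
    return intercept, slope

# Hypothetical data that lie exactly on the line y = 3 + 0.5 * x.
x = [0, 1, 2, 3, 4]
y = [3.0, 3.5, 4.0, 4.5, 5.0]
b0, b1 = fit_simple_regression(x, y)
print("intercept (b0):", b0)
print("slope (b1):", b1)
```

A positive slope means Y goes up as X goes up; a negative slope means Y goes down as X goes up.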
Why is it better to use the standardised coefficient instead of unstandardised coefficient when looking at effect size?
Because the standardised coefficient shows how many SDs the outcome variable changes for every 1 SD change in the predictor variable, as opposed to how many raw units it changes (unstandardised). Using SDs helps when comparing predictors or models, because the scale is always SDs rather than arbitrary units that depend on the measures/variables.
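A rough sketch of the difference, assuming the usual conversion (standardised beta = unstandardised B × SD of X / SD of Y). The function names and data are invented for illustration. Rescaling the predictor changes the unstandardised slope but leaves the standardised one alone.

```python
import numpy as np

def unstandardised_slope(x, y):
    """Slope in raw units: change in y per 1-unit change in x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

def standardised_beta(x, y):
    """Slope in SD units: change in y (in SDs) per 1 SD change in x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return unstandardised_slope(x, y) * np.std(x, ddof=1) / np.std(y, ddof=1)

# Hypothetical scores.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 1.0, 4.0, 3.0, 5.0])

b_raw = unstandardised_slope(x, y)
b_rescaled = unstandardised_slope(x * 100, y)   # same predictor, different units
beta_raw = standardised_beta(x, y)
beta_rescaled = standardised_beta(x * 100, y)

print("unstandardised:", b_raw, "vs", b_rescaled)
print("standardised:  ", beta_raw, "vs", beta_rescaled)
```

With a single predictor, the standardised beta is the same number as the Pearson correlation between X and Y.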
So, we can get effect sizes from standardised coefficients. How can we get effect sizes by examining variance?
Through looking at the r squared.
R squared indicates the proportion or percentage of total variance accounted for by the model.
What is R squared?
The squared CORRELATION between the ACTUAL DV scores and the PREDICTED DV scores.
Essentially it is the proportion of variance explained by the model.
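The two definitions on this card give the same number. A small sketch with invented data, computing R squared once as the squared correlation between actual and predicted scores, and once as the proportion of variance explained:

```python
import numpy as np

# Hypothetical predictor and outcome scores.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 1.0, 4.0, 3.0, 5.0])

# Fit the line; np.polyfit with degree 1 returns [slope, intercept].
slope, intercept = np.polyfit(x, y, 1)
predicted = intercept + slope * x

# Definition 1: squared correlation between ACTUAL and PREDICTED scores.
r = np.corrcoef(y, predicted)[0, 1]
r2_from_correlation = r ** 2

# Definition 2: proportion of total variance explained by the model.
ss_residual = np.sum((y - predicted) ** 2)
ss_total = np.sum((y - y.mean()) ** 2)
r2_from_variance = 1 - ss_residual / ss_total

print(r2_from_correlation, r2_from_variance)
```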
What is another word for R?
Correlation
What is another word for R2?
Squared correlation
The variance of an outcome variable is 5. The regression tells us the variance of the residuals is 4. We then subtract the residual variance from the total variance, which leads us to…?
The variance explained by the model (5 - 4 = 1).
Dividing the explained variance by the total variance gives R2, the squared correlation (1 / 5 = .20).
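The arithmetic for this card, written out step by step. The variable names are just labels for the numbers given in the question.

```python
total_variance = 5.0      # variance of the outcome variable
residual_variance = 4.0   # variance left over after the regression

# Variance explained by the model: total minus residual.
explained_variance = total_variance - residual_variance

# R squared: explained variance as a proportion of total variance.
r_squared = explained_variance / total_variance

print("explained variance:", explained_variance)
print("R squared:", r_squared)
```

So the model explains 1 of the 5 units of variance, which is 20%.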
What does 0 in R2 indicate?
What does 1 in R2 indicate?
0 indicates NONE of the variance is explained by the model
1 indicates ALL of the variance is explained by the model
Do people consider a .25 r2 as small?
No.
.04 is considered small (4%)
.09 is medium (9%)
.25 is large (25%)
Are effect sizes with r squared definitive or are they t shirt?
T shirt sizes (small, medium, large). There are no set rules.
So what is adjusted r square?
As opposed to the r2 looking at proportion of variance explained by the model derived from data from a specific sample, ADJUSTED r square gives an estimate of the r2 in the population!
Meaning, how much variability would be explained if the MODEL was derived from the population rather than the sample.
It is more conservative.
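The cards do not say which formula is used, so treat this as an assumption: a commonly used adjustment is 1 - (1 - R2)(n - 1)/(n - k - 1), where n is the sample size and k is the number of predictors. A minimal sketch with made-up numbers:

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted R squared for a model with k predictors fitted to n cases.

    Assumes the common formula 1 - (1 - R2) * (n - 1) / (n - k - 1).
    """
    if n - k - 1 <= 0:
        raise ValueError("need more cases than predictors plus one")
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

# Hypothetical example: the same sample R squared of .25 with 3 predictors.
adj_large_sample = adjusted_r_squared(0.25, n=500, k=3)
adj_small_sample = adjusted_r_squared(0.25, n=50, k=3)

print("n = 500:", adj_large_sample)
print("n = 50: ", adj_small_sample)
```

The adjustment is bigger (more conservative) when the sample is small or the number of predictors is large.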
Why would an adjusted r square be important - why can’t you just use the r squared as an estimate for the population?
Because the regression model might overfit your particular data set. Therefore it may not work as well with other samples as it does with YOUR data.
Why can r2 be expected to vary?
Because sample correlations vary around the population correlation
True or false: the sample becomes less representative as the sample size decreases
TRUE
What is sampling error?
The discrepancy between sample and population
Why does sampling error increase as sample size decreases AND as the number of predictors increases?
Regarding predictors, because there is error associated with each predictor.
Regarding sample size, because a smaller sample is less representative of the population.
The R2 is likely to overestimate the size of the effect because:
A. Sampling error decreases as sample size decreases
B. Sampling error increases as sample size decreases
C. Sampling error increases as the number of predictors increases
D. B and C
D
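A small simulation sketch of why D is right. Here the predictors are pure random noise, so the true R squared in the population is 0, yet the sample R squared comes out above 0, more so with a small sample or many predictors. The function name, sample sizes and number of repetitions are arbitrary choices for illustration.

```python
import numpy as np

def mean_null_r_squared(n, k, reps=200, seed=0):
    """Average sample R squared when k predictors are unrelated to the outcome."""
    rng = np.random.default_rng(seed)
    values = []
    for _ in range(reps):
        X = rng.normal(size=(n, k))
        y = rng.normal(size=n)  # generated independently of X
        design = np.column_stack([np.ones(n), X])  # add the intercept column
        coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
        residuals = y - design @ coefs
        ss_residual = np.sum(residuals ** 2)
        ss_total = np.sum((y - y.mean()) ** 2)
        values.append(1 - ss_residual / ss_total)
    return float(np.mean(values))

small_sample = mean_null_r_squared(n=20, k=5)
large_sample = mean_null_r_squared(n=200, k=5)
few_predictors = mean_null_r_squared(n=20, k=1)

print("n = 20,  5 predictors:", small_sample)
print("n = 200, 5 predictors:", large_sample)
print("n = 20,  1 predictor: ", few_predictors)
```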
Regression chooses the ____ therefore it is prone to overfitting the data
Best fit
What is failure to replicate the r2 called?
Shrinkage
How is shrinkage best evaluated?
Cross Validation Study
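A minimal sketch of the idea behind a cross validation study, assuming a simple split into two halves: fit the model on one half, then see how well the same coefficients predict the other half. The data are simulated and the helper names are made up for illustration. Shrinkage is the drop in R squared from the first half to the second.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares coefficients (intercept first) for predictors X and outcome y."""
    design = np.column_stack([np.ones(len(y)), X])
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coefs

def predict(coefs, X):
    """Predicted outcome scores from previously fitted coefficients."""
    return np.column_stack([np.ones(X.shape[0]), X]) @ coefs

def squared_correlation(actual, predicted):
    """R squared as the squared correlation between actual and predicted scores."""
    return float(np.corrcoef(actual, predicted)[0, 1] ** 2)

# Simulated data: 4 predictors, only the first one truly related to the outcome.
rng = np.random.default_rng(1)
n, k = 60, 4
X = rng.normal(size=(n, k))
y = 0.5 * X[:, 0] + rng.normal(size=n)

# Derivation sample (first half) and cross validation sample (second half).
X_train, X_test = X[:30], X[30:]
y_train, y_test = y[:30], y[30:]

coefs = fit_ols(X_train, y_train)
r2_train = squared_correlation(y_train, predict(coefs, X_train))
r2_test = squared_correlation(y_test, predict(coefs, X_test))
shrinkage = r2_train - r2_test

print("R squared in the derivation sample:", r2_train)
print("R squared in the new sample:       ", r2_test)
print("shrinkage:", shrinkage)
```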