Paul's Correlation and Regression Shite Flashcards

1
Q

Correlation background

A

r = 0.1 is a small effect, 0.3 medium, 0.5 large
p lower than alpha means the correlation is significantly different from 0, not that there is a significant difference between the variables
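
A quick sanity check of both points, sketched in Python with invented data (the deck is SPSS-based; this is just for illustration). scipy's pearsonr returns both r and its p value:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 0.3 * x + rng.normal(size=50)   # built-in medium-ish correlation

r, p = stats.pearsonr(x, y)
# p < alpha says r differs significantly from 0, nothing more
print(f"r = {r:.2f}, p = {p:.3f}")
```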

2
Q

Partial correlations / third variables

A

A third variable is confounding when you suspect the correlation between two variables is really due to that third variable
Partial correlations control for the third variable and look at the relationship between the first two to see if it is still there
Writing it out: r₁₂.₃ = the correlation between variables 1 and 2 controlling for variable 3. E.g. r₁₂.₃ = 0.02, p > 0.05: the correlation is only 0.02 once variable 3 is controlled, so it is non-significant
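
A minimal sketch of the idea in Python, with invented data where a third variable z drives both x and y. The partial correlation is the correlation of the residuals after regressing z out of each variable (the p value here is approximate, since it ignores the degree of freedom lost to the control):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
z = rng.normal(size=100)          # the confounding third variable
x = z + rng.normal(size=100)
y = z + rng.normal(size=100)      # x and y only correlate via z

def residualize(a, b):
    """Return a with b's linear influence regressed out."""
    fit = stats.linregress(b, a)
    return a - (fit.intercept + fit.slope * b)

r_xy, _ = stats.pearsonr(x, y)                   # zero-order correlation
r_xy_z, p = stats.pearsonr(residualize(x, z),
                           residualize(y, z))    # r_xy.z, controlling z
print(f"r = {r_xy:.2f}, partial r = {r_xy_z:.2f}, p = {p:.3f}")
```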

3
Q

Reliability

A

Test-retest: give the test once, then again later, and correlate the two sets of scores
Split-half: split the questionnaire into two halves, calculate scores for each half and correlate them
Cronbach's alpha: measures internal reliability using the individual items, from 0 (no reliability) to 1 (complete reliability). Based on the correlations between all items, i.e. are all items testing the same thing?
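
Cronbach's alpha is simple enough to compute by hand; a sketch using the standard formula α = k/(k−1) × (1 − Σ item variances / variance of the total score):

```python
import numpy as np

def cronbach_alpha(items):
    """items: 2D array, rows = respondents, columns = questionnaire items."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()   # sum of item variances
    total_var = items.sum(axis=1).var(ddof=1)     # variance of total scores
    return (k / (k - 1)) * (1 - item_vars / total_var)
```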

4
Q

Cronbach's alpha

A

A scale is reliable if α is bigger than 0.7. More questions means a bigger α. Alpha assumes the test measures only one thing, so keep subscales separate and keep all scores in the same direction. α changes across populations, as it only represents the scores it was calculated from. Weak reliability means lower correlations (attenuation) due to more random variation in scores: the observed score is the true score plus error (random variation)

5
Q

Alpha in SPSS

A

SPSS gives the α value in the Reliability Statistics table
The Item-Total Statistics table is used to spot trouble items that, if removed, would increase alpha
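
The "alpha if item deleted" idea can be mimicked outside SPSS by recomputing α with each column dropped in turn; a sketch (the cronbach_alpha helper is the same as in the earlier card, repeated so this runs on its own):

```python
import numpy as np

def cronbach_alpha(items):
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    return (k / (k - 1)) * (1 - items.var(axis=0, ddof=1).sum()
                            / items.sum(axis=1).var(ddof=1))

def alpha_if_item_deleted(items):
    """Alpha recomputed with each item removed; a clearly higher value
    flags that item as trouble, as in SPSS's Item-Total Statistics."""
    items = np.asarray(items, dtype=float)
    return [cronbach_alpha(np.delete(items, j, axis=1))
            for j in range(items.shape[1])]
```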

6
Q

Regression background

A

Predicts an outcome (the criterion) from a predictor; the relationship is asymmetrical. You need a model, e.g. the line of best fit: y = mx + c, where m is the gradient, x is the score and c is the intercept
Example of a positive (diagonal-up) linear relationship: c is 0 when the line passes through the origin of the y axis; m is 0.5 when, for every unit of x, y goes up by 0.5, so at x = 2, y is 1
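
A toy check of y = mx + c with numpy (numbers invented to match the example above): polyfit with degree 1 recovers the gradient and intercept.

```python
import numpy as np

x = np.array([0., 1., 2., 3., 4.])
y = 0.5 * x                       # a perfect line through the origin

m, c = np.polyfit(x, y, deg=1)    # gradient and intercept of the best fit
print(f"y = {m:.2f}x + {c:.2f}")  # y = 0.50x + 0.00
print(round(m * 2 + c, 2))        # at x = 2, y is 1.0
```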

7
Q

Ordinary least squares

A

The method used to draw the line through the scores: it minimises the distance between the predicted scores (the line) and the actual scores (the dots)
The slope is called b₁ and the intercept b₀; the equation is Y = b₀ + b₁(X)

If b₁ is 0.469, for every unit of x, y goes up by 0.469, on top of the intercept. E.g. for a score of 7 on x, y = 3.75 + 0.469 × 7
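
The same calculation sketched in Python with made-up scores; scipy's linregress does the least-squares fit, and the worked example is just plugging x = 7 into Y = b₀ + b₁X:

```python
from scipy import stats

x = [1, 2, 3, 4, 5, 6, 7]
y = [4.2, 4.7, 5.2, 5.6, 6.1, 6.6, 7.0]   # invented scores

fit = stats.linregress(x, y)              # ordinary least squares
b0, b1 = fit.intercept, fit.slope
y_hat = b0 + b1 * 7                       # predicted y for x = 7
print(f"Y = {b0:.3f} + {b1:.3f}X -> {y_hat:.2f}")
```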

8
Q

Slope and intercept in SPSS

A

Coefficients table in SPSS: the intercept is the Constant row (B₀), the slope is the unstandardised B for the predictor (B₁)
Y = B₀ + B₁X

9
Q

Residuals

A

The gap between the actual data point and the slope (the predicted score). Find it by taking the predicted score away from the actual score for each case, squaring each, then adding up the column to get the residual sum of squares (SSR): the unexplained variance in the data
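
The residual arithmetic as a sketch, on invented data: subtract each predicted score from the actual score, square, and sum.

```python
import numpy as np
from scipy import stats

x = np.array([1., 2., 3., 4., 5., 6., 7., 8.])
y = np.array([2.1, 2.9, 3.2, 4.8, 5.1, 5.3, 6.8, 7.0])   # invented data

fit = stats.linregress(x, y)
y_hat = fit.intercept + fit.slope * x     # predicted score for each case

residuals = y - y_hat                     # actual minus predicted
ss_residual = np.sum(residuals ** 2)      # SSR: unexplained variation
```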

10
Q

Total sum of squares and explained variance

A

Take each score away from the mean, square and add up: this is the total variation within the data set, the total sum of squares (SST). Explained variance is SStotal − SSresidual. Bigger means better

11
Q

R squared

A

Explained variance is affected by sample size and can't be compared between studies. R² shows the proportion of variance predicted by the model: divide SSregression by SStotal to get a number between 0 and 1, where bigger is better. 0.8 means 80% of the variance is explained by the model
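
Putting the last three cards together in one sketch (same invented data as before): SStotal from the mean, SSmodel as the difference, and R² as their ratio.

```python
import numpy as np
from scipy import stats

x = np.array([1., 2., 3., 4., 5., 6., 7., 8.])
y = np.array([2.1, 2.9, 3.2, 4.8, 5.1, 5.3, 6.8, 7.0])

fit = stats.linregress(x, y)
y_hat = fit.intercept + fit.slope * x

ss_total = np.sum((y - y.mean()) ** 2)    # total variation in the data
ss_resid = np.sum((y - y_hat) ** 2)       # left unexplained by the line
ss_model = ss_total - ss_resid            # explained by the model

r_squared = ss_model / ss_total           # proportion of variance explained
print(round(r_squared, 2))                # matches fit.rvalue ** 2
```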

12
Q

F ratio

A

Tests whether the model predicts a significant amount of variation: the ratio of predicted variance to unpredicted variance (the residuals). If high, the effect is strong. Divide each SS by its df to get a mean square; F = MSregression / MSresidual, and it comes with a p value for significance. (Separately, the t score shows whether each b is significantly different from 0)
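
The F ratio built from the same quantities, as a sketch: divide each SS by its degrees of freedom to get mean squares, then take the ratio (n and k are the case and predictor counts).

```python
import numpy as np
from scipy import stats

x = np.array([1., 2., 3., 4., 5., 6., 7., 8.])
y = np.array([2.1, 2.9, 3.2, 4.8, 5.1, 5.3, 6.8, 7.0])

fit = stats.linregress(x, y)
y_hat = fit.intercept + fit.slope * x
ss_total = np.sum((y - y.mean()) ** 2)
ss_resid = np.sum((y - y_hat) ** 2)
ss_model = ss_total - ss_resid

n, k = len(y), 1                                      # cases, predictors
f_ratio = (ss_model / k) / (ss_resid / (n - k - 1))   # MS_reg / MS_res
p_value = stats.f.sf(f_ratio, k, n - k - 1)           # upper-tail F probability
```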

13
Q

Assumptions

A

The outcome must be continuous, but predictors can be either; predictors must have non-zero variance; all values of the outcome should be independent (from different people). The relationship should be linear, e.g. a diagonal line. Homoscedasticity: the variance of the error term must be constant across all values of the predictors. The residuals must be normally distributed; check normality of errors using a P-P plot
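
Two of these checks sketched in Python with simulated data (statsmodels and matplotlib assumed available): a P-P plot for normality of errors, and a residuals-vs-fitted scatter where a fan shape would suggest heteroscedasticity.

```python
import numpy as np
import statsmodels.api as sm
import matplotlib.pyplot as plt

rng = np.random.default_rng(2)
x = rng.normal(size=100)
y = 1 + 0.5 * x + rng.normal(size=100)    # simulated, well-behaved data

fit = sm.OLS(y, sm.add_constant(x)).fit()

sm.ProbPlot(fit.resid).ppplot(line="45")  # P-P plot: on the line = normal
plt.figure()
plt.scatter(fit.fittedvalues, fit.resid)  # look for constant spread
plt.axhline(0)
plt.show()
```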

14
Q

Checking for bias/outliers

A

Check for large standardised residuals in SPSS; these show outliers, and only 5% of cases should be more than 2 SD out
Cook's distance: measures the influence of each case on the model; if above 1, the case is having an undue influence on the model. SPSS gives the maximum Cook's distance
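
Both checks in one sketch via statsmodels' influence measures (data invented, with one deliberately odd case):

```python
import numpy as np
import statsmodels.api as sm

x = np.array([1., 2., 3., 4., 5., 6., 7., 8.])
y = np.array([2.0, 2.9, 3.1, 4.2, 5.0, 5.2, 6.1, 12.0])   # last case suspect

fit = sm.OLS(y, sm.add_constant(x)).fit()
influence = fit.get_influence()

cooks_d = influence.cooks_distance[0]           # one distance per case
print(cooks_d.max())                            # > 1 flags undue influence
std_resid = influence.resid_studentized_internal
print(np.mean(np.abs(std_resid) > 2))           # should be around 5% or less
```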

15
Q

Multiple regression background

A

Has multiple predictors and looks at how they affect one outcome. Predictors are entered by either forced entry or hierarchical regression. The aims are finding the explained variance and how each predictor contributes, e.g. does personality predict the outcome (R²) and how much does each of the OCEAN traits predict individually (the b values)?

16
Q

Adjusted R square

A

R² gives an estimate for the sample; adjusted R² estimates it for the whole population, allowing for overestimation. The bigger the sample, the less need for adjustment. Report both
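
The usual adjustment formula, as a sketch (n = sample size, k = number of predictors): adjusted R² = 1 − (1 − R²)(n − 1)/(n − k − 1).

```python
def adjusted_r_squared(r2, n, k):
    """Shrinks R^2 to estimate the population value; the shrinkage
    fades as the sample size n grows relative to the k predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

print(adjusted_r_squared(0.30, n=50, k=5))    # ~0.22: noticeable shrinkage
print(adjusted_r_squared(0.30, n=5000, k=5))  # ~0.30: barely any
```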

17
Q

Output for forced entry

A

The Model Summary table has the R values, meaning x% of variance is explained. The ANOVA table has F and p, and the model df is the number of predictors. A significant R² means the model accounts for a significant amount of variance and the explained:unexplained ratio is high, i.e. the predictors as a whole significantly predict the outcome. The Coefficients table gives individual b values for each predictor (their individual contributions)

18
Q

Equation

A

B₀ is still the intercept; with more than one predictor (x₁, x₂, x₃, ...) the equation is Y = b₀ + b₁(x₁) + b₂(x₂) + b₃(x₃) + ... + bₙ(xₙ)
bₙ is the regression coefficient for the nth predictor
Y is the outcome
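
The equation fitted in Python as a sketch, with three simulated predictors; statsmodels returns b₀ through b₃ in one params vector:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 3))             # predictors x1, x2, x3
y = 2 + 1.0*X[:, 0] + 0.5*X[:, 1] + rng.normal(size=100)

fit = sm.OLS(y, sm.add_constant(X)).fit()
b0, b1, b2, b3 = fit.params               # Y = b0 + b1x1 + b2x2 + b3x3
```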

19
Q

Beta weights

A

Ordinary b values are affected by the scale of the scores, so they can't be compared across different measures; beta weights (standardised bs) can. E.g. β₁ = 0.594 means that as the predictor increases by 1 SD, the outcome increases by 0.594 of an SD. Each b also comes with a t value testing whether the variance explained by that b is significant, i.e. b is the amount of change and the significance tells you whether the predictor impacts the model or not. Relationships can be negative. Regression is more important than correlation
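
Beta weights can be obtained by z-scoring everything first, as in this sketch; the two invented predictors sit on wildly different scales, so their raw bs aren't comparable but their betas are:

```python
import numpy as np
import statsmodels.api as sm

def zscore(a):
    return (a - a.mean(axis=0)) / a.std(axis=0, ddof=1)

rng = np.random.default_rng(4)
X = rng.normal(size=(100, 2)) * [1, 50]   # very different scales
y = 3 + 0.8*X[:, 0] + 0.02*X[:, 1] + rng.normal(size=100)

raw_b = sm.OLS(y, sm.add_constant(X)).fit().params[1:]
betas = sm.OLS(zscore(y), zscore(X)).fit().params   # SD-for-SD, comparable
```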

20
Q

Dummy variables

A

Used to code categorical data with 1 and 0, e.g. gender. If beta is positive, the group coded 1 scores higher than the group coded 0; if b is negative, the 0 group scores higher than the 1 group. If males are coded 1, a positive b means males score higher. Look at p to tell significance
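
A dummy-coding sketch with invented scores: with males coded 1, the slope is simply the male−female mean difference, and its p value tests that difference.

```python
import numpy as np
import statsmodels.api as sm

gender = np.array([0., 0., 0., 0., 1., 1., 1., 1.])   # 0 = female, 1 = male
score = np.array([10., 12., 11., 13., 15., 16., 14., 17.])

fit = sm.OLS(score, sm.add_constant(gender)).fit()
b0, b1 = fit.params      # b0 = mean of the 0 group; b1 = group difference
p = fit.pvalues[1]       # b1 > 0 means the group coded 1 scores higher
```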

21
Q

Assumptions

A

All the same as simple regression, plus a multicollinearity check: predictors can't be highly correlated with each other (as this affects R²). Check in the Coefficients table: VIF measures how strongly each predictor relates to the others; you want this to be low, not close to 10. Tolerance is 1/VIF and needs to be above 0.2, but is less important
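
VIF checked in Python as a sketch; the second invented predictor is nearly a copy of the first, so its VIF blows up:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(5)
x1 = rng.normal(size=100)
x2 = x1 + rng.normal(scale=0.1, size=100)   # nearly a copy of x1
X = sm.add_constant(np.column_stack([x1, x2]))

vifs = [variance_inflation_factor(X, i) for i in range(1, X.shape[1])]
print(vifs)                    # values near or above 10 flag multicollinearity
print([1 / v for v in vifs])   # tolerance = 1/VIF, want above 0.2
```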

22
Q

Violations

A

Robust regression is a form of regression that allows you to relax the assumption of normally distributed residuals
Bootstrapping uses the sample to build up other samples (resampling) to get around the normal-distribution assumption (don't need to know this?)
Results will be in a bootstrapping table
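
A bare-bones bootstrap of a regression slope, as a sketch: resample cases with replacement, refit each time, and read a confidence interval off the percentiles.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
x = rng.normal(size=50)
y = 0.5 * x + rng.standard_t(df=3, size=50)     # heavy-tailed residuals

slopes = []
for _ in range(2000):
    idx = rng.integers(0, len(x), size=len(x))  # resample with replacement
    slopes.append(stats.linregress(x[idx], y[idx]).slope)

ci = np.percentile(slopes, [2.5, 97.5])         # 95% bootstrap CI for b1
```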

23
Q

Hierarchical regression

A

Used for looking at predictors while controlling for another variable. Gives two models: the control variable on its own, then with the other predictors added; you need to tell if one is better than the other. F-change column: if the change is big and significant, significantly more variance is explained by the second model. The ANOVA table doesn't compare the models but just gives an F for each. Adding another variable changes the other predictors' values in the Coefficients table
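
The F-change test is a comparison of nested models; a sketch with simulated data, using statsmodels' compare_f_test for the step-1 vs step-2 comparison:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
control = rng.normal(size=100)
predictor = rng.normal(size=100)
y = 1 + 0.4*control + 0.6*predictor + rng.normal(size=100)

m1 = sm.OLS(y, sm.add_constant(control)).fit()   # step 1: control only
X2 = sm.add_constant(np.column_stack([control, predictor]))
m2 = sm.OLS(y, X2).fit()                         # step 2: plus predictor

f_change, p_change, df_diff = m2.compare_f_test(m1)   # like SPSS's F change
print(m2.rsquared - m1.rsquared)                      # R-squared change
```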

24
Q

When to use the standardised vs unstandardised B columns

A

Use the unstandardised B column when calculating the change in the outcome in its original units from a change in a predictor. Plug into the original formula: the Constant row is b₀ and the next row down is b₁. Use the standardised (beta) column when working in standardised units and you want to see which predictor has the strongest influence

25
Q

Model sum of squares and residual sum of squares equations

A

Model: SSM = sum over groups of n × (group mean − grand mean)². Residual: SSR = sum over people of (score − group mean)²
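
A sketch confirming the two formulas on two invented groups:

```python
import numpy as np

groups = [np.array([4., 5., 6.]), np.array([7., 8., 9.])]   # invented data
grand_mean = np.concatenate(groups).mean()

ss_model = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ss_resid = sum(((g - g.mean()) ** 2).sum() for g in groups)
print(ss_model, ss_resid)    # 13.5 and 4.0
```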