Correlation and Regression Learning Objectives Flashcards

(20 cards)

1
Q

Define covariance (in words) and explain its main limitations

A

Covariance reflects the degree to which two variables (X and Y) vary together. Mathematically, it is the average cross-product of the deviation scores (how much each score on a variable deviates from that variable’s mean). Its main limitation is that it is an unstandardised measure of the relationship between two variables. It is scale dependent: its value can vary drastically with the range of the measurement scale, so interpreting it requires information about the scales involved. Furthermore, covariances based on different scales cannot be compared directly.
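
A minimal sketch of the scale-dependence problem, using made-up scores. The variable names and data are hypothetical, and dividing by N - 1 (the usual sample formula) is an assumption, since the card only says "average".

```python
def covariance(x, y):
    """Sample covariance: sum of cross-products of deviation scores / (N - 1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross_products = [(xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)]
    return sum(cross_products) / (n - 1)

hours = [1, 2, 3, 4, 5]              # hypothetical X scores (study time in hours)
score = [2, 4, 5, 4, 5]              # hypothetical Y scores
minutes = [h * 60 for h in hours]    # the same X data expressed in minutes

cov_hours = covariance(hours, score)
cov_minutes = covariance(minutes, score)
print(cov_hours, cov_minutes)
```

The relationship is identical in both cases, yet the covariance is 60 times larger when X is recorded in minutes, which is why a raw covariance cannot be interpreted without knowing the scales.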

2
Q

Define correlation in relation to covariance and explain how correlation addresses the limitations of covariance

A

Correlation typically refers to Pearson’s r, a standardised measure of the relationship between two variables that indexes both the magnitude/strength and the direction of that relationship. Essentially, it is standardised covariance: the covariance divided by the product of the two variables’ standard deviations. It addresses the limitations of covariance because it is not scale dependent: it always falls between -1 and +1, can be compared directly across scales and studies, and is interpretable regardless of the scale the variables were originally measured on.
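
As a rough illustration with hypothetical data, the sketch below standardises the covariance by the two standard deviations and shows that changing the units of X leaves r unchanged.

```python
import math

def pearson_r(x, y):
    """Pearson's r: covariance divided by the product of the two SDs."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    return cov / (sd_x * sd_y)

hours = [1, 2, 3, 4, 5]              # hypothetical X scores
score = [2, 4, 5, 4, 5]              # hypothetical Y scores
minutes = [h * 60 for h in hours]    # same X data on a different scale

r_hours = pearson_r(hours, score)
r_minutes = pearson_r(minutes, score)
print(round(r_hours, 4), round(r_minutes, 4))
```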

3
Q

List the various terms used to describe a bivariate correlation

A

Pearson’s r, Pearson’s correlation, zero-order correlation

4
Q

Define the coefficient of determination in the context of bivariate correlation

A

Within the context of bivariate correlation, the coefficient of determination (r^2) indicates the proportion of variance in one variable that can be explained by the other variable.

5
Q

Explain the relationship between the coefficient of determination and error/residual variance

A

r^2 can be used to calculate the proportion of error/residual variance (1 - r^2), which is the proportion of variance in one variable that cannot be explained by the other variable.
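
A quick worked example with a hypothetical correlation of .50:

```python
r = 0.50                      # hypothetical correlation between X and Y
r_squared = r ** 2            # proportion of variance explained
unexplained = 1 - r_squared   # proportion of error/residual variance
print(r_squared, unexplained)
```

Here a quarter of the variance is shared, so three quarters remains unexplained, even though r = .50 might look like a fairly strong relationship.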

6
Q

Explain what question is being addressed when we test r for significance and which statistical test is used for this purpose

A

When we test r for significance, we are asking whether r is likely to reflect an actual relationship in the population (i.e., a population correlation that differs from 0) or whether it could simply have emerged by chance/sampling error. A t-test with N - 2 degrees of freedom is used for this purpose.
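
A sketch of the standard t statistic for a correlation, t = r * sqrt(N - 2) / sqrt(1 - r^2). The values of r and N are hypothetical, and the p value lookup is left out because it needs a t distribution table or a statistics library.

```python
import math

r = 0.30    # hypothetical sample correlation
n = 102     # hypothetical sample size

df = n - 2
t = r * math.sqrt(df) / math.sqrt(1 - r ** 2)
print(df, round(t, 3))
```

The obtained t is then compared against the critical t for df = N - 2 to decide whether r differs significantly from 0.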

7
Q

Explain the difference between R^2 and adjusted R^2

A

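R^2 (the coefficient of multiple determination) is the proportion of variance in Y jointly accounted for by all predictors in the sample. Adjusted R^2 is a less biased estimate of the population R^2: it corrects the sample R^2 according to the number of predictors relative to the sample size, because the sample R^2 tends to overestimate the population value, especially with small samples and many predictors.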
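
The sketch below uses one commonly reported adjustment, adjusted R^2 = 1 - (1 - R^2)(N - 1)/(N - k - 1), where k is the number of predictors. This particular formula and all the numbers are assumptions for illustration; the cards do not say which adjustment is intended.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted R^2 for n cases and k predictors (one common formula)."""
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

r_squared = 0.40   # hypothetical sample R^2
k = 3              # hypothetical number of predictors

adj_small_sample = adjusted_r_squared(r_squared, n=50, k=k)
adj_large_sample = adjusted_r_squared(r_squared, n=500, k=k)
print(round(adj_small_sample, 4), round(adj_large_sample, 4))
```

The adjustment shrinks R^2 more in the small sample than in the large one, reflecting the greater bias of R^2 when N is small relative to the number of predictors.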

8
Q

Explain the link between correlation and bivariate regression

A

Bivariate regression estimates a score on one variable (Y) from scores on another variable (X). If we know the correlation between X and Y (together with their means and standard deviations), we can work out the line of best fit/regression line, which lets us make more accurate predictions through a linear model of the data.
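
A minimal sketch of that link, using hypothetical data: the slope is obtained from r as b = r * (SD of Y / SD of X), and the intercept as a = mean of Y - b * mean of X.

```python
import math

x = [1, 2, 3, 4, 5]   # hypothetical predictor scores
y = [2, 4, 5, 4, 5]   # hypothetical criterion scores

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n
ss_x = sum((xi - mean_x) ** 2 for xi in x)
ss_y = sum((yi - mean_y) ** 2 for yi in y)
sp_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

r = sp_xy / math.sqrt(ss_x * ss_y)        # Pearson's r
sd_x = math.sqrt(ss_x / (n - 1))
sd_y = math.sqrt(ss_y / (n - 1))

b = r * sd_y / sd_x                       # slope derived from the correlation
a = mean_y - b * mean_x                   # intercept
predicted_y_at_6 = a + b * 6              # prediction for a new X score of 6
print(round(b, 3), round(a, 3), round(predicted_y_at_6, 3))
```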

9
Q

Explain what the least squares criterion is in bivariate regression

A

The least squares criterion is the rule used to fit the regression line: the line is positioned so that the errors (deviations of actual scores from the scores predicted by the line) are as small as possible overall. Specifically, it minimises the sum of squared errors.
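
To illustrate, the sketch below fits the least squares line to hypothetical data and checks that nudging the intercept or slope in any direction makes the sum of squared errors larger. The size of the nudges is arbitrary.

```python
x = [1, 2, 3, 4, 5]   # hypothetical predictor scores
y = [2, 4, 5, 4, 5]   # hypothetical criterion scores

def sum_squared_errors(a, b):
    """Sum of squared differences between actual Y and predicted Y = a + b*X."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n
sp_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
ss_x = sum((xi - mean_x) ** 2 for xi in x)

b_ls = sp_xy / ss_x              # least squares slope
a_ls = mean_y - b_ls * mean_x    # least squares intercept
best_sse = sum_squared_errors(a_ls, b_ls)

# Try nearby lines: every alternative should have a larger SSE.
alternatives = [
    (a_ls + da, b_ls + db)
    for da in (-0.5, 0, 0.5)
    for db in (-0.2, 0, 0.2)
    if (da, db) != (0, 0)
]
all_worse = all(sum_squared_errors(a, b) > best_sse for a, b in alternatives)
print(round(best_sse, 3), all_worse)
```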

10
Q

Explain what the standard error of the estimate is (in words) and what it tells us in bivariate regression

A

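The standard error of the estimate is an estimate of the average amount of error associated with predictions based on the regression line, i.e., how far actual Y scores typically fall from their predicted scores. It is expressed in the units of Y, and smaller values indicate more accurate prediction.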
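
A sketch with hypothetical data, assuming the usual bivariate formula: the square root of SSresidual divided by N - 2.

```python
import math

x = [1, 2, 3, 4, 5]   # hypothetical predictor scores
y = [2, 4, 5, 4, 5]   # hypothetical criterion scores

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n
b = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
     / sum((xi - mean_x) ** 2 for xi in x))
a = mean_y - b * mean_x

residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]   # actual minus predicted
ss_residual = sum(e ** 2 for e in residuals)
standard_error_of_estimate = math.sqrt(ss_residual / (n - 2))
print(round(standard_error_of_estimate, 4))
```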

11
Q

In conceptual terms (i.e., in words), explain what is represented by SSY, SSregression, and SSresidual in bivariate regression

A

SSY represents the total variability in Y around its mean (sometimes called SStotal: the sum of squared deviations between participants’ actual scores and the mean score). It can be partitioned into SSregression (variability in Y that can be explained/predicted by X: the sum of squared deviations between participants’ predicted scores and the mean score) and SSresidual (variability in Y that cannot be explained/predicted by X: the sum of squared deviations between participants’ actual scores and their predicted scores).
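
The partition can be checked numerically. The data below are hypothetical; the point is only that SSregression and SSresidual add up to SSY.

```python
x = [1, 2, 3, 4, 5]   # hypothetical predictor scores
y = [2, 4, 5, 4, 5]   # hypothetical criterion scores

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n
b = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
     / sum((xi - mean_x) ** 2 for xi in x))
a = mean_y - b * mean_x
predicted = [a + b * xi for xi in x]

ss_y = sum((yi - mean_y) ** 2 for yi in y)                      # actual vs mean
ss_regression = sum((pi - mean_y) ** 2 for pi in predicted)     # predicted vs mean
ss_residual = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))  # actual vs predicted

r_squared = ss_regression / ss_y   # proportion of SSY explained by X
print(round(ss_y, 3), round(ss_regression, 3), round(ss_residual, 3))
```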

12
Q

Explain what the various components of the bivariate linear regression equation represent

A

The standardised regression equation takes the form: ZŶ = βZX
ZŶ: person’s predicted z-score on variable Y
ZX: person’s z-score on variable X
β: standardised regression coefficient (SD change in predicted Y with each 1 SD increase in X)
The intercept of the standardised regression equation is always 0
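
A sketch with hypothetical data: after both variables are converted to z-scores, the fitted slope is β and the fitted intercept works out to 0 (up to floating-point rounding).

```python
import math

def z_scores(values):
    """Convert raw scores to z-scores using the sample SD (N - 1)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

x = [1, 2, 3, 4, 5]   # hypothetical predictor scores
y = [2, 4, 5, 4, 5]   # hypothetical criterion scores
zx, zy = z_scores(x), z_scores(y)
n = len(zx)

# Least squares fit on the z-scores (both sets of z-scores have a mean of 0).
beta = sum(a * b for a, b in zip(zx, zy)) / sum(a * a for a in zx)
intercept = sum(zy) / n - beta * sum(zx) / n
predicted_zy = [beta * z for z in zx]   # predicted z-score on Y for each person
print(round(beta, 4), round(abs(intercept), 6))
```

In the bivariate case the resulting β has the same value as Pearson's r for these data.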

13
Q

Explain the difference between b and β (beta)

A

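b is the unstandardised regression coefficient (the change in predicted Y, in raw Y units, for each 1-unit increase in X), whereas β is the standardised regression coefficient (the change in predicted Y, in SD units, for each 1 SD increase in X).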
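
The two coefficients are related by the standard deviations of X and Y: β = b * (SD of X / SD of Y). A sketch with hypothetical data:

```python
import statistics

x = [1, 2, 3, 4, 5]   # hypothetical predictor scores
y = [2, 4, 5, 4, 5]   # hypothetical criterion scores

mean_x, mean_y = statistics.mean(x), statistics.mean(y)
b = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
     / sum((xi - mean_x) ** 2 for xi in x))             # unstandardised slope

beta = b * statistics.stdev(x) / statistics.stdev(y)   # standardised slope
print(round(b, 3), round(beta, 4))
```

Because β is expressed in SD units, it stays the same if X or Y is rescaled, whereas b changes with the measurement units.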

14
Q

Explain what question is being addressed when we test the regression slope (i.e., b or β) for significance, and which test is used for this purpose

A

When we test the regression slope for significance, we are asking whether the slope differs significantly from 0. The test used is a t-test.

15
Q

Explain how the F ratio is calculated in regression and what question it is used to test

A

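The F ratio is calculated by dividing MSregression by MSresidual (each MS is its SS divided by its df). It tests whether the regression model as a whole accounts for a significant proportion of the variance in Y, i.e., whether R^2 differs significantly from 0.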
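
A sketch of the calculation for a bivariate regression on hypothetical data, assuming df = 1 for regression (one predictor) and df = N - 2 for the residual.

```python
x = [1, 2, 3, 4, 5]   # hypothetical predictor scores
y = [2, 4, 5, 4, 5]   # hypothetical criterion scores

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n
b = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
     / sum((xi - mean_x) ** 2 for xi in x))
a = mean_y - b * mean_x
predicted = [a + b * xi for xi in x]

ss_regression = sum((pi - mean_y) ** 2 for pi in predicted)
ss_residual = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))

df_regression = 1        # number of predictors
df_residual = n - 2      # N minus number of predictors minus 1

ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_ratio = ms_regression / ms_residual
print(round(f_ratio, 3))
```

The obtained F is compared against the critical F for (df_regression, df_residual) degrees of freedom.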

16
Q

Explain how multiple regression extends bivariate regression

A

Multiple regression extends bivariate regression by allowing several predictor variables to be tested in relation to a single criterion variable, which can improve prediction of the criterion and account for more of its variance.

17
Q

Explain the different research questions that are addressed in bivariate regression and multiple regression

A

Bivariate regression addresses research questions that only have one predictor variable and one criterion variable. Multiple regression addresses research questions that have multiple predictor variables, and one criterion variable.

18
Q

Explain what R, R^2, and adjusted R^2 represent in the context of multiple regression

A

In multiple regression, R represents the multiple correlation coefficient (the overall relationship between Y and all predictors combined). R^2 represents the coefficient of multiple determination (the proportion of variance in Y jointly accounted for by all predictors). Adjusted R^2 represents an adjusted, less biased estimate of R^2.

19
Q

Explain the differences between the three types of individual predictor correlations:
- Zero-order (Pearson’s) correlation (r) and r^2
- Partial correlation (pr) and pr^2
- Semi-partial correlation (sr) and sr^2

A

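The zero-order correlation r, and by extension r^2, is unadjusted: it does not account for intercorrelations between predictors or their shared contributions in predicting Y. Both pr and sr are adjusted for the other predictors, so they reflect a predictor’s unique contribution. pr^2 is the proportion of the residual variance in Y (the variance not explained by the other predictors) that is uniquely explained by the predictor. sr^2 is the proportion of the total variance in Y that is uniquely explained by the predictor.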
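
For the two-predictor case, sr and pr can be computed directly from the three zero-order correlations using the standard formulas. The correlation values below are hypothetical.

```python
import math

r_y1 = 0.50   # hypothetical zero-order correlation between Y and predictor 1
r_y2 = 0.40   # hypothetical zero-order correlation between Y and predictor 2
r_12 = 0.30   # hypothetical correlation between the two predictors

# Numerator shared by both formulas: predictor 1's link with Y after
# removing what it shares with predictor 2.
unique_part = r_y1 - r_y2 * r_12

sr1 = unique_part / math.sqrt(1 - r_12 ** 2)                      # semi-partial
pr1 = unique_part / math.sqrt((1 - r_y2 ** 2) * (1 - r_12 ** 2))  # partial

sr1_sq = sr1 ** 2   # unique share of the TOTAL variance in Y
pr1_sq = pr1 ** 2   # unique share of the variance in Y left over by predictor 2

# R^2 for both predictors together, from the same three correlations.
r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
print(round(r_y1 ** 2, 4), round(sr1_sq, 4), round(pr1_sq, 4))
```

With these numbers sr^2 equals the gain in R^2 from adding predictor 1 to a model that already contains predictor 2, and pr^2 is larger than sr^2 because its denominator is only the variance that predictor 2 leaves unexplained.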

20
Q

Identify the effect sizes from ANOVA that are conceptually equivalent to pr^2 and sr^2

A

Conceptually, pr^2 is equivalent to partial eta-squared, and sr^2 is equivalent to eta-squared.