Correlation & Multiple Regression Flashcards

1
Q

What is correlation?

A

An association or dependency between two independently observed variables.

2
Q

What type of graph is used for correlations?

A

scatterplots

3
Q

A Pearson correlation coefficient of -1.0 means X and Y are ____

A

in a perfect negative (inverse) linear relationship: as one increases, the other decreases proportionally

4
Q

Which measure of association should be used when both variables are interval/ratio, e.g. temperature?

A

Pearson’s coefficient
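
As an illustration of the two cards above, here is a minimal sketch of Pearson's coefficient computed from its definition (sum of cross-products divided by the square root of the product of the sums of squares). The function name `pearson_r` and the data are hypothetical; the data were chosen so that the relationship is perfectly inverse.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of the deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: heating_use falls by exactly 2 for every 1-unit rise in temperature.
temperature = [10.0, 12.0, 14.0, 16.0, 18.0]
heating_use = [30.0, 26.0, 22.0, 18.0, 14.0]
r = pearson_r(temperature, heating_use)
print(r)  # -1.0, a perfect negative linear relationship
```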

5
Q

Which measure of association should be used when both variables are ordinal (ranked)?

A

Spearman’s or Kendall’s rank correlation coefficient
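
One way to see what a rank coefficient does: Spearman's coefficient can be computed as Pearson's coefficient applied to the ranks of the data. The sketch below assumes tied values receive the mean of their ranks; the function names and data are hypothetical.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical data: y rises with x, but not linearly (y = x cubed).
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
rho = spearman_rho(x, y)
print(rho)  # 1.0, because the ordering of x and y agrees exactly
```

Because only the ordering matters, the curved relationship still yields a coefficient of 1.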

6
Q

Which measure of association should be used when both variables are true dichotomies, e.g. male/female or yes/no?

A

Phi coefficient
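
For two dichotomous variables, the phi coefficient can be computed directly from the four cell counts of a 2x2 table. This is a minimal sketch; the function name, cell labels, and counts are hypothetical.

```python
import math

def phi_coefficient(a, b, c, d):
    """Phi coefficient for a 2x2 table of counts laid out as:

              Y=yes  Y=no
        X=yes   a      b
        X=no    c      d
    """
    return (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))

# Hypothetical counts: most cases fall on the yes/yes and no/no diagonal.
phi = phi_coefficient(20, 5, 5, 20)
print(phi)  # 0.6
```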

7
Q

Point-biserial coefficient is used when one variable is ____ and the other variable is _____

A

a true dichotomy; interval (or ratio)
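
A sketch of the point-biserial coefficient, assuming the dichotomy is coded 0/1 and using the usual formula based on the two group means, the population standard deviation of all scores, and the group proportions. The function name and data are hypothetical. The result equals Pearson's coefficient computed on the 0/1 codes.

```python
import math

def point_biserial(groups, scores):
    """Point-biserial correlation between a 0/1 variable and an interval variable."""
    n = len(scores)
    ones = [s for g, s in zip(groups, scores) if g == 1]
    zeros = [s for g, s in zip(groups, scores) if g == 0]
    mean_1 = sum(ones) / len(ones)
    mean_0 = sum(zeros) / len(zeros)
    mean_all = sum(scores) / n
    # Population standard deviation (divide by n) of all scores.
    sd = math.sqrt(sum((s - mean_all) ** 2 for s in scores) / n)
    p = len(ones) / n   # proportion coded 1
    q = len(zeros) / n  # proportion coded 0
    return (mean_1 - mean_0) / sd * math.sqrt(p * q)

# Hypothetical data: three cases per group, with group 1 scoring higher.
groups = [0, 0, 0, 1, 1, 1]
scores = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
r_pb = point_biserial(groups, scores)
print(round(r_pb, 4))  # approximately 0.8783
```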

8
Q

With more than two variables, you want to assess the relationship between one pair while controlling for the effect of one or more other variables. What type of correlation is this?

A

partial correlation
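
For the simplest case of a single control variable Z, the partial correlation between X and Y can be obtained from the three pairwise Pearson coefficients. The sketch below uses that first-order formula; the function name and the input correlations are hypothetical.

```python
import math

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between X and Y after removing the linear effect of Z from both."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical pairwise correlations: X and Y both correlate with Z.
r_xy_given_z = partial_correlation(r_xy=0.5, r_xz=0.6, r_yz=0.7)
print(round(r_xy_given_z, 3))  # approximately 0.14
```

Here most of the raw correlation of 0.5 is accounted for by the shared relationship with Z.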

9
Q

What is multiple linear regression?

A

It is conceptually similar to correlation.

It describes the relationship between two or more predictor variables and a single criterion variable.

10
Q

The goal of a regression model is to find the best fit between the model and the observations. This is done by adjusting the values of the _____________ until the prediction error is minimised.

A

regression coefficients

11
Q

What is the residual sum of squares?

A

The sum of the squared differences between the observed values and the values predicted by the regression model. It measures the variation in the data that the model does not explain.
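
To make the last two cards concrete, the sketch below fits a one-predictor least-squares line and computes its residual sum of squares. A single predictor is an assumption made to keep the example short; multiple regression applies the same idea with more coefficients. The function names and data are hypothetical.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for one predictor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def residual_sum_of_squares(x, y, intercept, slope):
    """Sum of squared differences between observed and predicted values."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

# Hypothetical data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

intercept, slope = fit_line(x, y)  # approximately 2.2 and 0.6
rss = residual_sum_of_squares(x, y, intercept, slope)
print(round(rss, 4))  # approximately 2.4

# Any other choice of coefficients gives a larger prediction error.
rss_other = residual_sum_of_squares(x, y, intercept, slope + 0.1)
```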

12
Q

You can assess the goodness of fit of a regression model by using a multiple correlation coefficient (R). What is this a correlation between?

A

A correlation between the predicted values and the observed values

13
Q

You can assess the goodness of fit of a regression model by using a coefficient of determination (R^2). This is simply the proportion of ______ explained by the ______.

A

variance; regression model. R^2 is the proportion of variance in the criterion variable that is explained by the regression model.
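
A small sketch tying R and R^2 together, assuming a least-squares fit with an intercept (in that case R^2 is the square of the correlation between observed and predicted values). The function names are hypothetical, and the predicted values come from a hypothetical fit y = 2.2 + 0.6x at x = 1..5.

```python
import math

def r_squared(observed, predicted):
    """Proportion of variance explained: 1 - RSS / TSS."""
    mean_obs = sum(observed) / len(observed)
    rss = sum((o - p) ** 2 for o, p in zip(observed, predicted))  # unexplained
    tss = sum((o - mean_obs) ** 2 for o in observed)              # total
    return 1 - rss / tss

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

observed = [2.0, 4.0, 5.0, 4.0, 5.0]
predicted = [2.8, 3.4, 4.0, 4.6, 5.2]  # hypothetical least-squares predictions

r2 = r_squared(observed, predicted)          # approximately 0.6
multiple_r = pearson_r(observed, predicted)  # approximately 0.775
print(round(r2, 4), round(multiple_r ** 2, 4))
```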

14
Q

F-ratios in ANOVA can be used to assess the goodness of fit of the linear regression model. What does a high F-ratio indicate?

A

A good model: the variance explained by the model is large relative to the residual (unexplained) variance, i.e. prediction error is low.
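
The F-ratio for the overall regression can be written in terms of R^2, the number of predictors k, and the sample size n. The sketch below uses that standard form; the function name and the input values are hypothetical.

```python
def regression_f_ratio(r_squared, n_cases, n_predictors):
    """F = (explained variance per predictor) / (unexplained variance per residual df)."""
    df_model = n_predictors
    df_residual = n_cases - n_predictors - 1
    return (r_squared / df_model) / ((1 - r_squared) / df_residual)

# Hypothetical model: R^2 = 0.6 from 5 cases and 1 predictor.
f_ratio = regression_f_ratio(0.6, n_cases=5, n_predictors=1)
print(round(f_ratio, 4))  # approximately 4.5
```

Holding n and k fixed, a larger R^2 gives a larger F-ratio.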

15
Q

A simultaneous (standard) multiple regression approach is used when ____

A

no a priori model is assumed and all predictor variables are fit together

16
Q

A stepwise approach is often discouraged because _____

A

it capitalises on chance variation in the sample and so tends to overfit the data

17
Q

If a relationship is already known and we want to account for it, which multiple regression approach is taken?

A

hierarchical

18
Q

Cook’s distance measures the influence of a potential _____ on the fitted model; values greater than 1 are cause for concern.

A

outlier
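
A sketch of Cook's distance for a one-predictor least-squares fit, using the usual form based on each case's residual and leverage. A single predictor is an assumption made so the leverage has a simple closed form; the function name and data are hypothetical.

```python
def cooks_distances(x, y):
    """Cook's distance for each case in a one-predictor least-squares fit."""
    n = len(x)
    p = 2  # fitted parameters: intercept and slope
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    mse = sum(e ** 2 for e in residuals) / (n - p)  # residual mean square
    distances = []
    for xi, e in zip(x, residuals):
        # Leverage: how far this case's predictor value lies from the mean.
        leverage = 1 / n + (xi - mean_x) ** 2 / sxx
        distances.append(e ** 2 / (p * mse) * leverage / (1 - leverage) ** 2)
    return distances

# Hypothetical data: the last case is extreme on x and breaks the trend.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 10.0]
y = [1.0, 2.0, 3.0, 4.0, 5.0, 2.0]
d = cooks_distances(x, y)
most_influential = d.index(max(d))  # index 5, the last case
print(most_influential, d[most_influential] > 1)
```

In this data set only the last case exceeds the threshold of 1 mentioned on the card.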

19
Q

Define scedasticity.

A

Scedasticity refers to how the residual error is distributed across the range of the predictor variable.

20
Q

Multiple linear regression assumes homo_______

A

homoscedasticity: the variance of the residuals stays relatively constant over the range of the predictor variable

21
Q

________ refers to a high correlation between two or more predictor variables.

A

multicollinearity

22
Q

Singularity refers to a redundant variable. This typically occurs when one variable ...

A

is a combination of two or more other variables, e.g. composite scores on intelligence scales

23
Q

What are we trying to detect if we look for high multivariate correlations?

A

singularity

24
Q

What are we trying to detect if we look for high bivariate correlations?

A

multicollinearity
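
A sketch of screening for high bivariate correlations among predictors. The cutoff of 0.9 is an assumption chosen for illustration, not a value given in the cards; the function names, predictor names, and data are hypothetical.

```python
import math
from itertools import combinations

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def highly_correlated_pairs(predictors, cutoff=0.9):
    """Return the pairs of predictor names whose |r| meets or exceeds the cutoff."""
    return [
        (name_a, name_b)
        for name_a, name_b in combinations(predictors, 2)
        if abs(pearson_r(predictors[name_a], predictors[name_b])) >= cutoff
    ]

# Hypothetical predictors: x2 is almost a multiple of x1; x3 is unrelated.
predictors = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "x2": [2.0, 4.0, 6.0, 8.0, 11.0],
    "x3": [5.0, 1.0, 4.0, 2.0, 3.0],
}
flagged = highly_correlated_pairs(predictors)
print(flagged)  # [('x1', 'x2')]
```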

25
Q

A small range of the predictor variable restricts _________

A

statistical power