Correlation and Linear Regression Flashcards Preview

Flashcards in Correlation and Linear Regression Deck (41)
1

What does a Pearson Correlation Coefficient measure?

The degree of linear association between two numerical variables

2

What is the formula to find the Pearson Correlation Coefficient?

r = Σ(x-x̄)(y-ȳ) / √(Σ(x-x̄)² Σ(y-ȳ)²)
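
As a rough illustration, the formula can be evaluated directly in plain Python. The function name `pearson_r` and the five data points are made up for this sketch and are not part of the deck.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: sum of cross-deviations over the root of the
    product of the two sums of squared deviations."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

# Hypothetical sample, chosen only so the arithmetic is easy to follow.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
print(round(pearson_r(xs, ys), 4))  # 0.7746
```

Here Σ(x-x̄)(y-ȳ) = 6, Σ(x-x̄)² = 10 and Σ(y-ȳ)² = 6, so r = 6/√60 ≈ 0.77.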

3

What are the units for the correlation coefficient?

It has no units

4

What is the range of possible values for the correlation coefficient?

-1 ≤ r ≤ 1

5

If two variables x and y are positively correlated, how will the data appear?

Large values of x will tend to be associated with large values of y, and small values of x with small values of y

6

If two variables x and y are negatively correlated, how will the data appear?

Large values of x will tend to be associated with small values of y, and small values of x with large values of y

7

If two variables are significantly correlated, can we conclude that one must be the cause of the other?

No. Correlation alone does not establish causation.

8

What two names are given to the equation of a ‘linear’ association line used to make predictions?

Regression equation

Regression line

9

What does a least squares regression do?

It finds the line that minimises the sum of squared residuals, where each residual is the vertical distance from a data point to the line.
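
A small sketch of the idea, assuming a made-up five-point sample: the helper `sum_squared_residuals` (a hypothetical name) scores any candidate line, and the least squares line for this sample, ŷ = 2.2 + 0.6x, scores lower than nearby alternatives.

```python
def sum_squared_residuals(xs, ys, b0, b1):
    """Sum of squared vertical distances between each point and the line
    y-hat = b0 + b1 * x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical sample; its least squares line is y-hat = 2.2 + 0.6x.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

best = sum_squared_residuals(xs, ys, 2.2, 0.6)             # about 2.4
shifted = sum_squared_residuals(xs, ys, 2.0, 0.6)          # lower intercept
steeper = sum_squared_residuals(xs, ys, 2.2, 0.7)          # steeper slope

# Any other line gives a larger sum of squares than the least squares line.
print(best < shifted, best < steeper)  # True True
```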

10

What is the formula for the estimated equation of the regression line?

ŷ = b0 + b1x

Where b0 = the y-intercept

and b1 = the slope

11

What is the formula for finding b1?

b1 = Σ[(xi-x̄)(yi-ȳ)] / Σ(xi-x̄)²

12

What is the formula for finding b0?

b0 = ȳ - b1x̄
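
The two estimates can be computed together, slope first and then intercept. This is a minimal sketch in plain Python; `fit_line` and the sample data are illustrative assumptions, not part of the deck.

```python
def fit_line(xs, ys):
    """Least squares estimates: b1 = Sxy / Sxx, then b0 = y_bar - b1 * x_bar."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Hypothetical sample: x_bar = 3, y_bar = 4, Sxy = 6, Sxx = 10.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b0, b1 = fit_line(xs, ys)
print(f"y-hat = {b0:.1f} + {b1:.1f}x")  # y-hat = 2.2 + 0.6x
```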

13

What is b1?

The slope: the change in y for a one-unit increase in x

14

What is b0?

The y-intercept

15

What does the y-intercept, b0, tell us?

The value of y at x=0

16

What is the slope b1?

The rate of change of y with respect to x.

17

What does the slope b1 tell us?

How much y will change when x increases by one unit.
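
To make the interpretation concrete, here is a short sketch with hypothetical coefficients (b0 = 2.2, b1 = 0.6, chosen only for illustration): moving x up by one unit moves the prediction by exactly the slope, and the prediction at x = 0 is the intercept.

```python
def predict(x, b0, b1):
    """Predicted response from the fitted line y-hat = b0 + b1 * x."""
    return b0 + b1 * x

# Hypothetical fitted coefficients, for illustration only.
b0, b1 = 2.2, 0.6

at_zero = predict(0, b0, b1)                       # equals the intercept b0
increase = predict(4, b0, b1) - predict(3, b0, b1)  # equals the slope b1

print(round(at_zero, 1), round(increase, 1))  # 2.2 0.6
```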

18

How can we determine how well a regression line fits the data?

Using the coefficient of determination R²

19

What is the formula for the coefficient of determination R²?

R² = (Sxy)² / (Sxx Syy)

20

What is the coefficient of determination R² in terms of the correlation coefficient r?

R² is the square of the correlation coefficient r
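
As a quick check on a made-up sample, the sketch below computes the coefficient of determination three ways: from the sums of squares, as one minus the residual sum of squares over the total sum of squares, and as the square of r. The function name `r_squared_three_ways` and the data are assumptions for illustration; in simple linear regression all three values should agree.

```python
import math

def r_squared_three_ways(xs, ys):
    """Return R^2 computed as Sxy^2/(Sxx*Syy), as 1 - SSres/Syy, and as r^2."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)

    # Least squares line, needed for the residual-based version.
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    ss_res = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

    from_sums = s_xy ** 2 / (s_xx * s_yy)
    from_residuals = 1 - ss_res / s_yy
    from_r = (s_xy / math.sqrt(s_xx * s_yy)) ** 2
    return from_sums, from_residuals, from_r

# Hypothetical sample: Sxy = 6, Sxx = 10, Syy = 6, so R^2 = 36/60 = 0.6.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
print([round(v, 4) for v in r_squared_three_ways(xs, ys)])  # [0.6, 0.6, 0.6]
```

An R² of 0.6 means about 60% of the variability in y in this sample is explained by x.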

21

How do we determine, using the coefficient of determination R², how well the line fits the data?

The closer the value of R² is to 1, the better the line fits the data.

22

What does the value of the coefficient of determination R² tell us?

How much of the variability in the dependent y-variable can be explained by the independent x-variable.

23

If the value of R² is >90%, what does this tell us about the strength of linear association and thus the quality of the simple linear regression model?

The strength of linear association is very strong and thus the quality of the simple linear regression model is excellent

24

If the value of R² is 75-90%, what does this tell us about the strength of linear association and thus the quality of the simple linear regression model?

The strength of linear association is strong and thus the quality of the simple linear regression model is very good

25

If the value of R² is 50-75%, what does this tell us about the strength of linear association and thus the quality of the simple linear regression model?

The strength of linear association is reasonable and thus the quality of the simple linear regression model is good

26

If the value of R² is 25-50%, what does this tell us about the strength of linear association and thus the quality of the simple linear regression model?

The strength of linear association is weak and thus the quality of the simple linear regression model is weak

27

If the value of R² is <25%, what does this tell us about the strength of linear association and thus the quality of the simple linear regression model?

The strength of linear association is very weak and thus the quality of the simple linear regression model is poor

28

What 4 assumptions are made in a linear regression?

The observations are independent (repeated observations on the same individual are not allowed)

The relationship is linear

The response varies Normally about the population regression line

The standard deviation (or variance) of the response about the population line is the same everywhere.

29

What is the problem with this data set?

The observations are not independent

30

What is the problem with this data set?

There is non-constant variance