lecture 8 - correlation Flashcards
(22 cards)
what data types are on each axis?
ratio or interval
what variables are on each axis?
y - dependent
x - independent
what happens if one factor increases?
if one increases the other does too:
COVARIANCE
Co-vary
or vary together
what is correlation?
correlation
+ or - and strength
scale independent of the variables, between -1 and 1
standardised - think z-score
what is covariance?
+ or -
scaled to the variable
no upper or lower limit
not standardised
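A minimal sketch of the contrast between the two cards above: covariance changes when a variable is rescaled, while correlation does not. The height and mass figures are hypothetical, and the helper names are illustrative only.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def covariance(x, y):
    """Sample covariance (divides by n - 1)."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)

def pearson_r(x, y):
    """Covariance divided by the product of the two standard deviations."""
    return covariance(x, y) / sqrt(covariance(x, x) * covariance(y, y))

# Hypothetical data: heights in metres and masses in kilograms.
height_m = [1.60, 1.70, 1.75, 1.80, 1.90]
mass_kg = [55.0, 65.0, 72.0, 80.0, 88.0]
height_cm = [h * 100 for h in height_m]  # same heights, different unit

cov_m = covariance(height_m, mass_kg)
cov_cm = covariance(height_cm, mass_kg)
r_m = pearson_r(height_m, mass_kg)
r_cm = pearson_r(height_cm, mass_kg)

print(cov_m, cov_cm)  # covariance grows with the unit change
print(r_m, r_cm)      # r stays the same and lies between -1 and 1
```

Dividing by the standard deviations is what standardises the value, much as a z-score standardises a single score.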
when are two variables said to be co-related?
“when the variation of the one is accompanied on the average by more or less variation of the other, and in the same direction”
what does Pearson’s correlation coefficient describe?
correlation coefficient (r) quantifies the strength and direction of a linear association between two variables
how are strengths classified for negative correlation?
(-1) - perfect
(-0.7) - (-0.9) - strong
(-0.4) - (-0.6) - moderate
(-0.1) - (-0.3) - weak
0 - zero
how are strengths classified for positive correlation?
(1) - perfect
(0.7) - (0.9) - strong
(0.4) - (0.6) - moderate
(0.1) - (0.3) - weak
0 - zero
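The bands on the two cards above could be sketched as a lookup. The cards leave gaps (for example between 0.3 and 0.4), so this sketch assumes a value in a gap belongs to the lower band; that rule and the function name are assumptions, not part of the lecture.

```python
def classify_strength(r):
    """Label r using the bands on the cards (gap handling is an assumption)."""
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and 1")
    size = abs(r)
    if size == 0:
        return "zero"
    direction = "positive" if r > 0 else "negative"
    if size == 1:
        band = "perfect"
    elif size >= 0.7:
        band = "strong"
    elif size >= 0.4:
        band = "moderate"
    else:
        band = "weak"  # assumed to cover everything above 0 and below 0.4
    return f"{band} {direction}"

print(classify_strength(-0.8))
print(classify_strength(0.106))
```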
are correlation and causation the same?
- correlation does not = causation
- third or confounding variable problem
- direction of causality
how are correlations often displayed?
correlation matrix
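A minimal sketch of a correlation matrix, assuming three hypothetical variables (triathlon leg times in minutes). Every variable is correlated with every other one, so the diagonal is 1 and the table is symmetric.

```python
from math import sqrt

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical data, for illustration only.
data = {
    "swim": [30, 32, 35, 31, 38],
    "bike": [70, 75, 72, 80, 78],
    "run": [45, 50, 48, 55, 52],
}
names = list(data)

# matrix[a][b] holds r between variable a and variable b.
matrix = {a: {b: pearson_r(data[a], data[b]) for b in names} for a in names}

print("      " + "".join(f"{name:>7}" for name in names))
for a in names:
    print(f"{a:<6}" + "".join(f"{matrix[a][b]:7.2f}" for b in names))
```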
what are the assumptions?
- ratio/interval
- normally distributed
- independent
- linear
- homoscedasticity (variation of data along line of best fit should not change (variation on each side should be the same))
- different from t-test
what is the null hypothesis for correlation?
H0 - there is no association between … and ….
any association found is simply a result of sampling error
what is the alternative hypothesis for correlation?
H1 - there is an association between the … and ….
which bits do you look at in the table?
Pearson’s correlation coefficient - direction and strength
sig. (2-tailed) - significance
N - sample size
what is the coefficient of determination?
correlation - they co-vary, sharing variance
if r = 0.106, then r^2 = 0.011
coefficient of determination - 1.1%
residual variance - 98.9%
what does the coefficient of determination mean?
1.1% of variability in total triathlon time can be explained by the time in the swimming leg
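The arithmetic behind the two cards above, as a short sketch using the r value given on the card. The variable names are illustrative.

```python
r = 0.106  # correlation coefficient from the card

r_squared = r ** 2                              # coefficient of determination
explained_pct = round(r_squared * 100, 1)       # shared variance, as a percentage
residual_pct = round(100 - r_squared * 100, 1)  # variance left unexplained

print(f"r^2 = {r_squared:.3f}")
print(f"explained: {explained_pct}%, residual: {residual_pct}%")
```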
how do we report correlation?
r(degrees of freedom (N-2)) = the r statistic, p=p value
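A small sketch of that reporting format. The function name, the sample size of 30, and the p value of 0.577 are hypothetical; the number of decimal places is also an assumption, since the card does not specify one.

```python
def report_pearson(r, n, p):
    """Format a Pearson result as r(df) = value, p = value, with df = N - 2."""
    df = n - 2
    return f"r({df}) = {r:.2f}, p = {p:.3f}"

# Hypothetical values, for illustration only.
print(report_pearson(0.106, 30, 0.577))
```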
when do we reject H0 in one and two-tailed tests?
- with a two-tailed test we would reject H0 if we found a positive or negative association
- with a one-tailed test, we only reject H0 if the association is in the direction we expected, and we halve the p value in the output
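The two decision rules above can be sketched as follows. The 0.05 significance level, the function names, and the example values are assumptions made for illustration.

```python
def reject_h0_two_tailed(two_tailed_p, alpha=0.05):
    """Two-tailed: either direction counts, so use the p value as given."""
    return two_tailed_p < alpha

def reject_h0_one_tailed(r, two_tailed_p, expected_direction, alpha=0.05):
    """One-tailed: the direction must match the prediction, then p is halved."""
    if r > 0:
        observed = "positive"
    elif r < 0:
        observed = "negative"
    else:
        observed = "none"
    if observed != expected_direction:
        return False
    return two_tailed_p / 2 < alpha

# Hypothetical output: r = 0.35 with a two-tailed p of 0.08.
print(reject_h0_two_tailed(0.08))                # two-tailed decision
print(reject_h0_one_tailed(0.35, 0.08, "positive"))  # predicted direction
print(reject_h0_one_tailed(0.35, 0.08, "negative"))  # opposite direction
```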
what is Spearman’s rho (Rs) rank correlation?
- non parametric equivalent of Pearson’s r
- one or both variables are ordinal
- ratio/interval but the parametric assumptions have been violated/breached
- Spearman’s rho ranks the scores for each variable and considers the association between the ranks
- minimum sample size of 20
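A minimal sketch of the ranking idea above: rank each variable, then apply Pearson's r to the ranks. Tied scores are given the mean of their ranks, which is a common convention but an assumption here. The data are hypothetical and deliberately tiny, well below the sample size the card recommends, so they only illustrate the calculation.

```python
from math import sqrt

def average_ranks(values):
    """Rank from 1 (smallest); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of 1-based positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def spearman_rho(x, y):
    """Pearson's r computed on the ranks rather than the raw scores."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical data: y always rises with x, but not in a straight line.
x = [1, 2, 3, 4, 5, 6]
y = [1, 4, 9, 16, 25, 400]

rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, r)  # the ranks agree perfectly; the raw scores are not linear
```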
what is Kendall’s Tau test?
- nonparametric; one or both variables are ordinal or ratio
- useful with N<20
- can deal with large number of tied ranks in the data
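A sketch of how tied ranks can be handled, assuming the tau-b variant (the card does not say which version of Kendall's tau the lecture uses). Each pair of cases is counted as concordant, discordant, or tied, and the ties adjust the denominator. The data are hypothetical.

```python
from math import sqrt

def kendall_tau_b(x, y):
    """Kendall's tau-b: pair counting with a tie-adjusted denominator."""
    concordant = discordant = ties_x = ties_y = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue          # tied on both variables: counted nowhere
            elif dx == 0:
                ties_x += 1       # tied on x only
            elif dy == 0:
                ties_y += 1       # tied on y only
            elif dx * dy > 0:
                concordant += 1   # pair ordered the same way on x and y
            else:
                discordant += 1   # pair ordered in opposite ways
    denom = sqrt((concordant + discordant + ties_x)
                 * (concordant + discordant + ties_y))
    return (concordant - discordant) / denom

# Hypothetical ordinal scores containing ties.
x = [1, 2, 2, 3]
y = [1, 2, 3, 3]
tau = kendall_tau_b(x, y)
print(tau)
```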