Correlation and Regression Flashcards
(51 cards)
what is correlation a form of?
bivariate analysis
- relationship between 2 variables
focus on direction and degree
what is a linear relationship?
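for every unit increase in x, y changes by a constant amount (the points fall along a straight line), eg in a positive linear relationship every increase in x comes with an increase in y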
what are some examples of non linear relationship?
eg practice and performance… when learning a musical instrument, you are likely to learn a lot in the first year, and your progress is likely to slow over time
T or F? Even when there is a non-linear relationship, it makes sense to use correlation measures. Why?
False
Because a linear correlation can misrepresent the data: you might get a correlation value close to zero when in fact there is a U-shaped relationship in the data
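A quick sketch of why this matters, assuming made-up data where y = x squared (a perfect U shape). The helper name `pearson_r` and the numbers are illustrative only, not from the cards.

```python
import math

# Illustrative data (an assumption, not from the cards): a perfect U shape.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

def pearson_r(xs, ys):
    """Pearson r = SP / sqrt(SS_x * SS_y), written out by hand."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

r_u_shape = pearson_r(xs, ys)
print(r_u_shape)  # 0.0: y is fully determined by x, yet r shows no linear trend
```

The left half of the U pulls the correlation down and the right half pulls it up, so the two cancel out.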
what are the rules of thumb on how big or small a correlation is?
small (.1 to .3)
medium (.3 to .5)
large (.5 to .7)
what is r squared? what is it used for?
the correlation coefficient, squared (also called the coefficient of determination)
squaring the correlation coefficient gives the proportion of variance accounted for by your model (multiply by 100 for a percentage), ie how much variance does your predictor account for?
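As a small arithmetic sketch (the value of r here is made up for illustration):

```python
r = 0.5                              # illustrative correlation value
r_squared = r ** 2                   # proportion of variance accounted for
percent_explained = r_squared * 100  # the same thing as a percentage
print(r_squared, percent_explained)  # 0.25 25.0
```

So even a correlation of .5 accounts for only a quarter of the variance.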
if your predictor accounts for 50% of the variance, what does this mean?
that 50% of the variation across subjects can be accounted for by the predictor you have
what is variability?
how much a given variable varies from observation to observation - eg how much height in the class varies
what is covariability?
how much two variables vary together, eg if we take the class's height and weight: as height increases (or decreases), what happens to weight? Is the relationship positive, negative, or absent? Do the two variables vary together or independently of each other?
what is the sum of squares used for? how is it calculated?
it gives a rough measure of variability: the sum of squared deviations from the mean
SS = Σ(X - X(mean))^2
take each individual’s height, subtract the mean height, and square the result… then sum these squared deviations across all observations
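The steps above can be sketched as follows, assuming a made-up list of five heights in cm (the variable names are illustrative):

```python
heights = [150, 160, 170, 180, 190]  # made-up heights in cm

mean_height = sum(heights) / len(heights)
deviations = [h - mean_height for h in heights]  # X - X(mean) for each person
ss = sum(d ** 2 for d in deviations)             # square each, then sum

print(mean_height, ss)  # 170.0 1000.0
```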
to measure the variability you use ____
to measure co variability you use ____
sum of squares; sum of products
how is the sum of products calculated?
SP = Σ[(X - X(mean)) × (Y - Y(mean))]
when will the sum of squares be identical to the sum of products?
when both variables are identical (every Y value equals its X value), so each product of deviations is just a squared deviation
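A minimal sketch of the sum of products, assuming made-up heights (cm) and weights (kg) for five people. The function name `sum_of_products` is illustrative. Passing the same variable twice shows SP collapsing into SS.

```python
def sum_of_products(xs, ys):
    """SP: sum of (X - mean of X) * (Y - mean of Y) over paired observations."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

heights = [150, 160, 170, 180, 190]  # made-up data
weights = [50, 55, 65, 70, 80]       # made-up data

sp = sum_of_products(heights, weights)
ss_height = sum_of_products(heights, heights)  # identical variables: SP equals SS

print(sp)         # 750.0 (positive: height and weight rise together)
print(ss_height)  # 1000.0 (the sum of squares for height)
```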
how do we calculate the pearson correlation coefficient?
r = SP / √(SS of X × SS of Y)
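Putting SP and SS together, here is a sketch of the calculation on made-up height and weight data (the data and variable names are assumptions for illustration):

```python
import math

heights = [150, 160, 170, 180, 190]  # made-up X values
weights = [50, 55, 65, 70, 80]       # made-up Y values

mean_x = sum(heights) / len(heights)
mean_y = sum(weights) / len(weights)

sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(heights, weights))
ss_x = sum((x - mean_x) ** 2 for x in heights)
ss_y = sum((y - mean_y) ** 2 for y in weights)

r = sp / math.sqrt(ss_x * ss_y)  # covariability over separate variability
print(sp, ss_x, ss_y)            # 750.0 1000.0 570.0
print(round(r, 3))               # 0.993
```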
what is the worded formula for calculating the pearson correlation coefficient?
r = covariability of X and Y / variability of X and Y separately
ie r is a ratio
what happens if we have relatively low co-variability of X and Y compared to variability of X and Y separately?
we have a weak correlation
what can drastically influence your correlation value?
extreme scores or outliers
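To see how much a single point can matter, here is a sketch with made-up data: five points with a clear positive trend, then the same points plus one extreme observation. The helper `pearson_r` is illustrative.

```python
import math

def pearson_r(xs, ys):
    """Pearson r = SP / sqrt(SS_x * SS_y)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

xs = [1, 2, 3, 4, 5]  # made-up data
ys = [2, 1, 4, 3, 5]

r_clean = pearson_r(xs, ys)
r_with_outlier = pearson_r(xs + [20], ys + [0])  # add one extreme point (20, 0)

print(round(r_clean, 2))         # 0.8
print(round(r_with_outlier, 2))  # -0.52
```

One outlier out of six observations is enough to flip a strong positive correlation into a negative one.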
what is regression towards the mean?
where an extreme score on one measure tends to be followed by a less extreme score on the other measure… extreme scores are often partly due to chance, so it is unlikely that the other value will be just as extreme, eg after an exceptionally rainy day, the following day is likely to be less rainy
what is an example of the regression towards the mean?
1 or 2 people in a group might guess 10 coin flips correctly, and 1 or 2 people might correctly guess a number between 1 and 50, but it is highly unlikely to be the same people: those with an extreme score on the coin flips are likely to score closer to the mean on the next task
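A rough simulation of this idea, under simplifying assumptions: instead of a coin-flip task followed by a number-guessing task, it uses two rounds of pure-chance coin-flip guessing for 1000 imaginary people. All names and numbers are illustrative.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

def coin_flip_score(n_flips=10):
    """Number of correct guesses out of n_flips, by pure chance."""
    return sum(random.random() < 0.5 for _ in range(n_flips))

n_people = 1000
round_1 = [coin_flip_score() for _ in range(n_people)]
round_2 = [coin_flip_score() for _ in range(n_people)]

# Pick the people with extreme scores (8 or more correct) in round 1.
top = [i for i, score in enumerate(round_1) if score >= 8]
top_mean_1 = sum(round_1[i] for i in top) / len(top)
top_mean_2 = sum(round_2[i] for i in top) / len(top)

print(top_mean_1, top_mean_2)  # round-2 mean falls back towards 5 (chance level)
```

The top scorers were selected because they were lucky in round 1. Luck does not carry over, so as a group they land near the overall mean in round 2.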
what is the null hypothesis for correlation?
that the correlation in the population is zero
what is asked when determining whether the null hypothesis is to be rejected or retained?
once the r value has been calculated, we ask: what is the probability of finding an r value this big (or bigger) if the real association in the population is zero? If this probability is small (conventionally below .05), we reject the null hypothesis
what are the degrees of freedom for a correlation?
the number of participants (N) minus 2
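One common way to run this test (not stated on the cards, so treat it as an assumption) is to convert r into a t statistic with N - 2 degrees of freedom and compare it with a critical value from a t table. The values of r and N below are made up, and 2.060 is the standard two-tailed .05 critical value for df = 25.

```python
import math

r = 0.6  # illustrative sample correlation
n = 27   # illustrative number of participants

df = n - 2                              # degrees of freedom: N minus 2
t = r * math.sqrt(df / (1 - r ** 2))    # t statistic for testing r against zero

critical_t = 2.060                      # two-tailed, alpha = .05, df = 25 (t table)
reject_null = abs(t) > critical_t

print(df, round(t, 2), reject_null)     # 25 3.75 True
```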
how is spearman’s correlation used? when is it used?
convert the data to ranks, then calculate the correlation on the ranks…
used when asking the question: are values that are high on one variable also high on the other variable?
why would you use spearman’s correlation?
when you have non-linear (but consistently increasing or decreasing) data… and when you want to reduce the influence of outliers, as values such as 15, 20, 23232 become ranks of 1, 2, 3.
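A sketch of the rank-then-correlate idea, assuming made-up data with one extreme value. The helper names `ranks`, `pearson_r` and `spearman_r` are illustrative; tied values get the average of their ranks, which is the usual convention.

```python
import math

def ranks(values):
    """Rank values from 1 (smallest) upward; ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def pearson_r(xs, ys):
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

def spearman_r(xs, ys):
    """Spearman correlation: Pearson r computed on the ranks."""
    return pearson_r(ranks(xs), ranks(ys))

xs = [15, 20, 25, 23232]  # made-up data with one extreme value
ys = [3, 5, 6, 7]

print(ranks([15, 20, 23232]))       # [1.0, 2.0, 3.0]
print(round(pearson_r(xs, ys), 2))  # 0.68
print(spearman_r(xs, ys))           # 1.0
```

Higher x always goes with higher y here, so the rank correlation is perfect even though the extreme value drags the Pearson correlation down.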