Chapter 7 Flashcards
(21 cards)
relationship between correlation and prediction
- if two scores are correlated, we can do a better job of making a prediction of a person’s score on one variable based on the other variable
- The greater the correlation, the more accurate our prediction will be
how does correlation help when it comes to tests?
- Can establish reliability (do people get similar scores if they re-take it?)
- Can establish validity (does it measure what it’s supposed to?)
bivariate distribution
distribution that shows the relation between 2 variables
positive vs. negative vs. no correlation
- Positive correlation: linear relationship; high scores on one variable are paired with high scores on the other (and vice versa)
- Negative correlation: linear relationship; high scores on one variable are paired with low scores on the other (and vice versa)
- No correlation: both high and low values on the first variable are equally likely to be paired with high and low values on the second variable
values of r
range from –1 (perfect negative correlation) to 0 (no correlation) to +1 (perfect positive correlation)
what correlations are not
- Causation (a correlation between x and y does not, by itself, show that x causes variation in y)
- Percentages (a .50 correlation does not mean they’re 50% correlated, and it is not twice as strong as a .25 correlation)
what do correlations really mean?
- A +1 correlation indicates that, among cases above the mean on the first variable, 100% of the scores on the second variable will be above the mean (or below the mean with a –1 correlation)
- A 0 correlation indicates that, among cases above the mean on the first variable, 50% of the scores on the second variable will be above the mean and 50% will be below
- The correlation coefficient is really an index of how similar paired z scores are
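To make the "paired z scores" idea concrete, here is a minimal sketch that computes r as the mean of the products of paired z scores. The function names and the data are made up for illustration, and the sketch assumes the standard deviation is computed by dividing by n (not n - 1).

```python
from math import sqrt

def z_scores(values):
    """Convert raw scores to z scores using the SD that divides by n."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]

def pearson_r(x, y):
    """r as the mean of the products of paired z scores."""
    zx, zy = z_scores(x), z_scores(y)
    return sum(a * b for a, b in zip(zx, zy)) / len(x)

# Hypothetical paired scores, invented for this example only.
hours = [1, 2, 3, 4, 5]
grade = [2, 4, 5, 4, 5]
r_example = pearson_r(hours, grade)
print(round(r_example, 3))
```

When the paired z scores are identical, every product is a squared z score and the mean product is 1; the less alike the pairs are, the closer the mean product falls toward 0.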
n
in formulas for correlation coefficient, note that n stands for the number of pairs of scores
Spearman’s rank-order correlation coefficient (rs)
- Used when aspects of a measure have been rank-ordered (e.g., employees rank the best aspects of the job, employers rank the same aspects as they think employees would, and we want the correlation between the two sets of rankings)
- Also used when measures are in score form (the scores are converted to ranks first)
- Used when n is relatively small
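As a rough sketch of the rankings example, the code below uses the common shortcut formula rs = 1 - 6Σd² / (n(n² - 1)), where d is the difference between paired ranks. This formula is not stated on the card and it assumes there are no tied ranks; the function name and the rankings are hypothetical.

```python
def spearman_rs(rank_a, rank_b):
    """Spearman's rs from two sets of ranks, assuming no tied ranks."""
    n = len(rank_a)  # n is the number of pairs of ranks
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))

# Hypothetical rankings of five job aspects (1 = best).
employee_ranks = [1, 2, 3, 4, 5]
employer_ranks = [2, 1, 4, 3, 5]
rs_example = spearman_rs(employee_ranks, employer_ranks)
print(rs_example)
```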
why can’t correlation equal causation?
- X may cause y, but…
- Y might be causing X
- A third variable may be influencing X and Y
- A complex set of interrelated variables may be influencing X and Y
- X and Y might influence each other
r and transformations
- r is unaffected by any positive linear transformations of raw scores
- Ex. You could add/subtract 100 from all scores or multiply/divide all scores by 100, and the shape of the distribution won’t change -> r will remain the same
- Spearman’s rs is also unchanged, because these transformations preserve the rank order of the scores
- r will also remain the same whether it’s computed using raw scores, z-scores, etc. (mean changes and sd may change, but r won’t)
- exponents, square roots, and other non-linear transformations will change the shape of the distribution and, therefore, the correlation
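A quick way to check these claims is to transform one variable and recompute r. The sketch below is illustrative only: the data are invented, and `pearson_r` is a helper written for this example using the deviation-score formula.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson r from sums of squares and cross-products of deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / sqrt(ss_x * ss_y)

# Hypothetical scores, invented for this example only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r_raw = pearson_r(x, y)
r_shifted = pearson_r([v + 100 for v in x], y)        # add a constant
r_scaled = pearson_r([v * 100 for v in x], y)         # multiply by a positive constant
r_squared_scores = pearson_r([v ** 2 for v in x], y)  # non-linear transformation

print(round(r_raw, 3), round(r_shifted, 3), round(r_scaled, 3), round(r_squared_scores, 3))
```

With these numbers the shifted and scaled versions reproduce the original r, while squaring the x scores changes it.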
hugging principle
the more closely the scores hug the line of best fit, the higher the value of r
characteristics of r
- used for linear relationships only
- r is sensitive to the range of talent (variability) in the distribution of scores (e.g., if we restrict the range of scores, the correlation will likely change)
- r is subject to sampling variation (r may be different if you’d gotten a different sample from the population) -> r will fluctuate more from sample to sample when the samples are smaller
- Pooling samples can change the correlation depending on where the samples lie relative to each other -> If the pooled samples hug the regression line, the correlation increases (and vice versa)
- There is no such thing as “the” correlation coefficient for variables
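The pooling point can be illustrated with two invented samples that each show no correlation on their own but sit in different regions of the scatterplot. This is only a sketch under those made-up data; `pearson_r` is a helper defined here for the example.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson r from sums of squares and cross-products of deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / sqrt(ss_x * ss_y)

# Two hypothetical samples: low scorers (A) and high scorers (B).
x_a, y_a = [1, 2, 1, 2], [1, 1, 2, 2]
x_b, y_b = [5, 6, 5, 6], [5, 5, 6, 6]

r_a = pearson_r(x_a, y_a)
r_b = pearson_r(x_b, y_b)
r_pooled = pearson_r(x_a + x_b, y_a + y_b)

print(r_a, r_b, round(r_pooled, 3))
```

Each sample alone gives r = 0, but the pooled scores line up along a rising line, so the pooled r is high. This is one reason there is no single correlation coefficient for a pair of variables.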
r
- Pearson product-moment coefficient of correlation
- sample statistic -> population parameter equivalent = rho
when correlation is 0 regardless of score on 1 variable, what’s the best prediction for the other score?
the mean
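One way to see why the mean is the best guess is the standard linear prediction formula Y' = mean_Y + r(sd_Y / sd_X)(X - mean_X). The formula is not stated on the card, so treat it as an assumption brought in for illustration; the function name and the numbers are hypothetical.

```python
def predict_y(x_score, mean_x, sd_x, mean_y, sd_y, r):
    """Predicted Y from X using the standard linear prediction formula."""
    return mean_y + r * (sd_y / sd_x) * (x_score - mean_x)

# Hypothetical means and SDs: X has mean 100, SD 15; Y has mean 50, SD 10.
for r in (0.0, 0.5, 1.0):
    print(r, predict_y(130, 100, 15, 50, 10, r))
```

With r = 0 the second term vanishes, so the prediction is the mean of Y no matter what the X score is. As r grows, the prediction moves further from the mean.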
types of correlations on a scatterplot
- fencing sword: perfect correlation (1.00)
- hot dog: high correlation (0.7-0.8)
- football: moderate correlation (0.5)
- soccer ball: no correlation (0)
does a 0 correlation rule out the possibility of causation?
No
r as an inferential statistic
- r is a consistent estimator -> as size of random sample increases, the absolute difference between r and parameter (rho) decreases
- r is not an unbiased estimate of rho -> its expected value is slightly closer to 0 than rho (slightly less than rho when rho is positive), but the amount is negligible unless n is quite small
effect of measurement error on r
- true relationship between 2 variables will be stronger than the observed relationship because of measurement error
- the greater the measurement error (and lower the reliability), the lower the value of r (lower correlation)
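The size of this effect is often described with the classical attenuation formula: observed r = true r × sqrt(reliability of X × reliability of Y). The formula is not given on the card, so it is an assumption brought in for illustration, and the function name and values below are hypothetical.

```python
from math import sqrt

def attenuated_r(true_r, reliability_x, reliability_y):
    """Observed correlation expected under the classical attenuation formula."""
    return true_r * sqrt(reliability_x * reliability_y)

# Hypothetical true correlation of .60 measured with imperfect tests.
print(attenuated_r(0.60, 1.00, 1.00))            # perfectly reliable measures
print(round(attenuated_r(0.60, 0.81, 0.64), 3))  # less reliable measures
```

With perfectly reliable measures the observed r equals the true r; as either reliability drops, the observed r shrinks toward 0.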
correlation, range, and reliability
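- narrower range of scores = lower reliability coefficient (reliability is itself a correlation, so restricting the range lowers it)
- a test given to a group with a wider range of scores correlates better with itself when the same people take it repeatedly -> if a test can’t correlate with itself, it won’t correlate with anything else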
effect of heterogeneity on correlation
- value of r influenced by heterogeneity (dissimilarity) of sample
- more homogeneous -> lower value of correlation coefficient (and vice versa)
- restricting range of scores will diminish correlation coefficient
- between-groups variance (differences between the groups that make up the sample) can inflate, decrease, or even reverse the sign of the correlation coefficient
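The effect of restricting the range can be checked on a small invented data set by computing r for everyone and then again for only the low scorers. The data, the cutoff, and the `pearson_r` helper are all hypothetical choices made for this sketch.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson r from sums of squares and cross-products of deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / sqrt(ss_x * ss_y)

# Hypothetical heterogeneous sample: x spans the full range 1-10.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2, 1, 4, 3, 6, 5, 8, 7, 10, 9]
r_full = pearson_r(x, y)

# Restrict the range: keep only the pairs with x of 4 or less (a more homogeneous group).
restricted = [(a, b) for a, b in zip(x, y) if a <= 4]
r_restricted = pearson_r([a for a, _ in restricted], [b for _, b in restricted])

print(round(r_full, 3), round(r_restricted, 3))
```

The same amount of scatter around the line looks small against a wide range of scores but large against a narrow one, so the restricted sample yields the lower r.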