Lecture 8 Flashcards
(22 cards)
correlation
- tests whether two continuous variables have overlapping variance (i.e., whether they are related)
- captures whether change in one continuous variable is associated with change in another continuous variable
overlapping variance
- how the distributions of scores compare for two populations of data
- can you predict the variability of one set of scores from another set of scores?
perfect relationship
r values range from -1 (perfect negative relationship) to +1 (perfect positive relationship); a value of 0 indicates no linear relationship
covariance
whether two variables co-vary with one another
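A minimal sketch of how covariance relates to r, assuming the sample (n - 1) formulas; the function names and the hours/scores data are made up for illustration and are not from the lecture.

```python
from math import sqrt

def covariance(xs, ys):
    """Sample covariance: summed co-deviations from the means, divided by n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def pearson_r(xs, ys):
    """Pearson r: the covariance rescaled by both standard deviations."""
    sd_x = sqrt(covariance(xs, xs))  # covariance of a variable with itself is its variance
    sd_y = sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (sd_x * sd_y)

# Hypothetical data: hours studied and exam scores
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 70, 72]
print(covariance(hours, scores))  # positive: the variables rise together
print(pearson_r(hours, scores))   # same sign as the covariance, bounded by -1 and +1
```

Covariance depends on the units of X and Y, which is why r (the standardized version) is easier to interpret.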
r^2
- effect size
- % of variance explained by the relationship
- simply tells you the percent of the variance overlapping between the two variables
unexplained variance
- 1 - r^2
- nonoverlapping variance between the X and Y variables
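A small sketch of the arithmetic behind r^2 and 1 - r^2; the function name and the example r value are illustrative assumptions.

```python
def variance_explained(r):
    """Return (r^2, 1 - r^2): explained and unexplained variance as proportions."""
    r_squared = r ** 2
    return r_squared, 1 - r_squared

# Hypothetical correlation of r = 0.5
explained, unexplained = variance_explained(0.5)
print(f"explained: {explained:.0%}, unexplained: {unexplained:.0%}")
# prints "explained: 25%, unexplained: 75%"
```

Because r is squared, r = 0.5 and r = -0.5 explain the same share of the variance.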
what do you do if r is significant/not significant
- if r is significant, you can predict a Y value from a new X value
- if r is not significant, the best estimate of Y for any value of X is the mean of Y, so you can only use the average
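The decision rule above can be sketched as follows. This assumes the slope, intercept, and significance decision have already been worked out elsewhere; all names and numbers here are hypothetical.

```python
def best_estimate(x_new, slope, intercept, mean_y, r_is_significant):
    """Use the regression line only when r is significant; otherwise fall back to the mean of Y."""
    if r_is_significant:
        return intercept + slope * x_new
    return mean_y

# Hypothetical model: y-hat = 50 + 2x, with a mean Y of 62
print(best_estimate(4, 2.0, 50.0, 62.0, r_is_significant=True))   # uses the line
print(best_estimate(4, 2.0, 50.0, 62.0, r_is_significant=False))  # uses the mean of Y
```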
regression
- predicting a value of Y from a new value of X based on your model of the relationship
- needs a line of best fit
ŷ (y-hat)
- a predicted value of y from a value of x
y-intercept
y value when x=0
linear regression
- a function for the line that runs through the plot
- used to predict a y value from a new x value
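A minimal sketch of fitting a line of best fit and using it to get ŷ, assuming ordinary least squares (the usual choice, though the cards do not name the method). Function names and data are made up for illustration.

```python
def fit_line(xs, ys):
    """Least-squares slope and y-intercept for the line y-hat = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x  # the y value when x = 0
    return slope, intercept

def predict(x_new, slope, intercept):
    """Predicted value (y-hat) for a new x."""
    return intercept + slope * x_new

# Hypothetical data: hours studied and exam scores
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 70, 72]
slope, intercept = fit_line(hours, scores)
print(slope, intercept)
print(predict(3.5, slope, intercept))  # y-hat for a new x inside the observed range
```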
third variable problem
- a third, unmeasured variable that can directly cause both X and Y
moderator variable
- the third variable makes X and Y correlated but not causally related
mediator variable
- X indirectly causes Y through the third variable
directionality
- we cannot tell which variable is causing the change in the other
- this is why we use predictor and criterion for correlation, not IV and DV
issues with r
- influenced by outliers: a bivariate outlier can produce a correlation driven by a single score, and an outlier can also reverse the correlation
- not an unbiased estimator: tends to overestimate ρ (rho, the population correlation)
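A quick sketch of the outlier issue: one extreme bivariate point can flip the sign of r. The data are invented to make the effect obvious, and `pearson_r` is an illustrative helper, not a library call.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation computed from sums of deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Five hypothetical scores with a perfect negative relationship
xs = [1, 2, 3, 4, 5]
ys = [5, 4, 3, 2, 1]
r_without = pearson_r(xs, ys)

# Add one extreme bivariate outlier at (20, 20)
r_with = pearson_r(xs + [20], ys + [20])

print(r_without)  # negative
print(r_with)     # positive: the single outlier now drives the correlation
```

Plotting the scatter plot before trusting r is the usual way to catch this.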
unbiased estimator
- equally likely to underestimate and overestimate the true population parameter
- this means that the average of repeated sample estimates would be expected to equal the population parameter
how to fix r overestimating ρ
the adjusted r statistic (r_adjusted), which modifies r to account for the bias and likely overestimation
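The cards do not give the formula, so the sketch below assumes one common textbook version of the adjustment: r_adjusted = sqrt(1 - (1 - r^2)(n - 1)/(n - 2)). Keeping the sign of r and clamping negative values to 0 are choices made here for illustration, not something stated in the lecture.

```python
from math import sqrt, copysign

def adjusted_r(r, n):
    """Shrink r toward 0 based on sample size n (assumed textbook formula)."""
    if n <= 2:
        raise ValueError("need more than 2 pairs of scores")
    shrunk = 1 - (1 - r ** 2) * (n - 1) / (n - 2)
    # Small r with small n can push the value below 0; treat that as 0 (a choice made here)
    magnitude = sqrt(max(shrunk, 0.0))
    return copysign(magnitude, r)  # keep the direction of the original r

# Hypothetical sample: r = 0.5 from n = 10 pairs
print(adjusted_r(0.5, 10))   # smaller than 0.5
print(adjusted_r(0.5, 100))  # closer to 0.5: less adjustment with a larger sample
```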
extrapolation
- when you predict a ŷ score from an X score outside the range of the other X scores
- leads to overgeneralization
- some extrapolation is safe, but the further out of range you go, the more likely it is to cause overgeneralization
overgeneralization
using a small section of a scatter plot to try and predict a larger dataset
interpolation
- predicting a ŷ score from an X score within the range of the other X scores
- avoids overgeneralization
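A tiny sketch of the interpolation/extrapolation distinction: check whether a new X falls inside the range of the observed X scores before predicting. The function name and data are illustrative.

```python
def prediction_type(x_new, xs):
    """Label a prediction by whether x_new lies within the observed range of X."""
    if min(xs) <= x_new <= max(xs):
        return "interpolation"
    return "extrapolation"

# Hypothetical observed X scores
observed_x = [1, 2, 3, 4, 5]
print(prediction_type(3.5, observed_x))  # prints "interpolation"
print(prediction_type(12, observed_x))   # prints "extrapolation"
```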
dummy coding
forcing a categorical variable to act like a continuous variable. Here, we can assign the values of 1 or 2 to the group names
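A minimal sketch of dummy coding with the values 1 and 2 mentioned above; the group names "control" and "treatment" and the helper name are hypothetical.

```python
def dummy_code(labels, codes):
    """Replace each group name with its assigned number."""
    return [codes[label] for label in labels]

# Hypothetical group membership for six participants
groups = ["control", "treatment", "treatment", "control", "treatment", "control"]
codes = {"control": 1, "treatment": 2}

coded = dummy_code(groups, codes)
print(coded)  # prints [1, 2, 2, 1, 2, 1]
```

Once coded, the group variable can be entered as X in a correlation or regression alongside a continuous Y.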