Correlation Flashcards
What is correlation?
a statistic that measures the relationship between two variables
What are the different characteristics of correlation?
- direction (positive or negative)
- form (linear or non-linear)
- strength or consistency (magnitude)
What is the form of the relationship in a correlation?
whether the data fit a linear or non-linear pattern
What is the consistency or strength of the relationship?
measured by the numerical value of the correlation
What does a higher absolute value of the correlation indicate?
the closer the absolute value is to 1.00, the stronger and more consistent the relationship between the variables
What is perfect correlation?
identified by a correlation of +1.00 or -1.00; every data point falls exactly on a straight line
What are the different components of a scatterplot?
- direction (positive or negative)
- strength (weak, moderate, strong)
- linearity (linear or nonlinear)
What does the value of r^2 mean?
the coefficient of determination, which measures the proportion of variability in one variable that can be determined from its relationship with the other variable
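A minimal sketch of what "proportion of variability" means, assuming made-up data and a least-squares line fitted to it. The variable names (sp, ss_x, ss_y, ss_residual) are illustrative choices, not part of the cards. It computes r^2 two ways: as 1 minus the leftover (residual) variability in Y divided by the total variability in Y, and as the square of r.

```python
# Hypothetical data, made up for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Sum of products and sums of squared deviations.
sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
ss_x = sum((a - mean_x) ** 2 for a in x)
ss_y = sum((b - mean_y) ** 2 for b in y)

# Least-squares line predicting Y from X.
slope = sp / ss_x
intercept = mean_y - slope * mean_x
predicted = [intercept + slope * a for a in x]

# Variability in Y that the line does NOT account for.
ss_residual = sum((b - p) ** 2 for b, p in zip(y, predicted))

# Proportion of Y's variability accounted for by the relationship with X.
r_squared_from_fit = 1 - ss_residual / ss_y

# The same quantity obtained by squaring r = SP / sqrt(SS_X * SS_Y).
r_squared_from_r = sp ** 2 / (ss_x * ss_y)

print(round(r_squared_from_fit, 3), round(r_squared_from_r, 3))
```

For this data both routes give 0.60, meaning 60% of the variability in Y can be accounted for by its linear relationship with X.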
What are outliers?
an individual with X and/or Y values that are substantially different from the values obtained for the other individuals in the data set
What are the different types of correlation?
- Pearson
- Spearman rho
- Kendall’s Tau
- Point biserial
- Biserial
- Phi
When do you use Pearson?
both variables are continuous (at least interval or ratio level)
When do you use Spearman rho?
- skewed data, or non-linear relationships that are still monotonic (consistently increasing or decreasing)
- ordinal data, the “Pearson of ranked data”
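The "Pearson of ranked data" idea can be sketched as follows: convert each variable to ranks (tied values share the average rank), then apply the Pearson formula to the ranks. The function names and the data are hypothetical, chosen only for illustration.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1..n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Pearson correlation: SP / sqrt(SS_X * SS_Y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / sqrt(ss_x * ss_y)

def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical data: Y always increases with X, but along a curve (Y = X squared).
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

r = pearson_r(x, y)
rho = spearman_rho(x, y)
print(round(r, 3), round(rho, 3))
```

Because the relationship is perfectly monotonic but not linear, Spearman's rho is 1.0 while Pearson's r falls a little below 1.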
When do you use Kendall’s Tau?
- ordinal data, better than Spearman for small samples
- better when there are many ties among ranks
When do you use Point biserial?
one continuous variable (interval or ratio data) and one naturally binary variable (e.g., yes/no coded as 0 and 1)
When do you use biserial?
one continuous variable (interval or ratio data) and one binary variable with underlying continuity (e.g., a test score converted to pass/fail)
When do you use Phi?
two binary (dichotomous nominal) variables
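A small sketch of phi computed from a 2x2 table of counts, using the standard formula (ad - bc) divided by the square root of the product of the four row and column totals. The function name, cell labels, and counts are hypothetical and serve only as an illustration.

```python
from math import sqrt

def phi_coefficient(a, b, c, d):
    """Phi for a 2x2 table of counts laid out as:

               Y = 1   Y = 0
        X = 1    a       b
        X = 0    c       d
    """
    denominator = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        raise ValueError("phi is undefined when a row or column total is zero")
    return (a * d - b * c) / denominator

# Hypothetical counts, made up for illustration.
phi = phi_coefficient(20, 10, 5, 25)
print(round(phi, 3))
```

Phi computed this way matches the Pearson correlation obtained when both binary variables are coded 0 and 1.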
What is the Pearson Correlation?
measures the degree and direction of the linear relationship between two continuous variables
What does “r” represent?
correlation as a sample statistic
What does “ρ” (rho) represent?
correlation as a population parameter
What is the sum of products (SP)?
- determines whether a correlation coefficient is positive or negative
- measures the amount of covariability between two variables
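A minimal sketch of SP, assuming the usual definitional formula (sum of the products of deviations from each mean) and the equivalent computational formula (sum of XY minus the product of the sums divided by n). The function names and data are hypothetical. The sign of SP is what makes the correlation positive or negative.

```python
def sum_of_products(x, y):
    """Definitional formula: sum of (X - mean of X) * (Y - mean of Y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))

def sum_of_products_computational(x, y):
    """Computational formula: sum of XY minus (sum of X)(sum of Y) / n."""
    n = len(x)
    return sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n

# Hypothetical data, made up for illustration.
x = [1, 2, 3, 4, 5]
y_rising = [2, 4, 5, 4, 5]   # Y tends to increase as X increases
y_falling = [5, 4, 5, 4, 2]  # Y tends to decrease as X increases

sp_rising = sum_of_products(x, y_rising)
sp_falling = sum_of_products(x, y_falling)
print(sp_rising, sp_falling)
```

Here SP is +6 for the rising pattern and -6 for the falling pattern, and both formulas agree.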
What happens as the covariance gets larger?
the closer the data points fall to the regression line (for a given amount of variability in X and Y separately)
What happens when all data points for X and Y fall exactly on a regression line?
the magnitude of the covariability of X and Y equals the variability of X and Y separately, so the formula for r gives +1.0 or -1.0
What is the denominator of the formula for r?
the variability of X and Y separately (the “total” variability), computed as the square root of SS_X × SS_Y
What is the numerator of the formula for r?
the covariability of X and Y (the sum of products, SP): the degree to which X and Y vary together
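The numerator and denominator can be put together in one common way of writing the formula for r, given here as a sketch. The symbols SS_X, SS_Y, M_X, and M_Y (sums of squared deviations and means) are notation assumed for illustration.

```latex
r = \frac{SP}{\sqrt{SS_X \, SS_Y}}, \qquad
SP = \sum (X - M_X)(Y - M_Y), \quad
SS_X = \sum (X - M_X)^2, \quad
SS_Y = \sum (Y - M_Y)^2
```

If every point falls on a line Y = bX + a with b not equal to 0, then SP = b SS_X and SS_Y = b^2 SS_X, so r = b / |b|, which is +1 when the line slopes up and -1 when it slopes down.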