Correlation and Linear Regression Flashcards
(38 cards)
Which correlation and regression methods were discussed in the lecture?
- Pearson’s correlation/Spearman rank correlation
- Linear regression
When do you use Pearson’s correlation and Spearman rank correlation?
- Pearson = normally distributed variables
- Spearman = non-normally distributed variables
What does correlation quantify?
Quantifies the strength and direction of association between two numerical variables
Should correlations be interpreted with caution?
Yes, correlations should be interpreted with great care because they do not necessarily indicate causation.
What are some possible reasons for correlations between variables?
Correlations between variables can result from:
- a causal relationship
- shared dependency on some third unmeasured variable
- coincidence
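The second reason (a shared dependency on a third variable) can be illustrated with a small simulation. This is only a sketch in R: the variable names (`temperature`, `ice_cream`, `sunburn`) and all numbers are invented for illustration and do not come from the lecture.

```r
# Hypothetical illustration: two variables driven by a shared third variable.
set.seed(1)                                          # reproducible simulation
n <- 200
temperature <- rnorm(n, mean = 20, sd = 5)           # the "unmeasured" third variable
ice_cream   <- 2.0 * temperature + rnorm(n, sd = 3)  # depends on temperature only
sunburn     <- 1.5 * temperature + rnorm(n, sd = 3)  # depends on temperature only

# The two outcomes are strongly correlated although neither causes the other
r_raw <- cor(ice_cream, sunburn)

# After removing the effect of temperature from both (correlating the residuals),
# the association largely disappears
r_adjusted <- cor(resid(lm(ice_cream ~ temperature)),
                  resid(lm(sunburn ~ temperature)))
```

Comparing `r_raw` with `r_adjusted` shows how a strong correlation can come entirely from the shared dependency.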
Why are correlated time series unreliable indicators of causal relationships?
Correlated time series are unreliable indicators of causal relationships because over time a variable can only follow four possible trajectories (steady state, increase, decrease, or fluctuation), and there are bound to be many coincidences.
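A minimal sketch of this point in R, using two simulated series that are generated independently but both increase over time. The series names and all numbers are made up for illustration.

```r
# Hypothetical illustration: two unrelated series that both rise over time.
set.seed(2)
year     <- 1:30
series_a <- 100 + 5.0 * year + rnorm(30, sd = 10)  # some quantity rising over time
series_b <-  20 + 0.8 * year + rnorm(30, sd = 2)   # an unrelated quantity, also rising

# High correlation, produced only by the shared upward trajectory
r_trend <- cor(series_a, series_b)

# Correlating the year-to-year changes removes the common trend,
# leaving only the (independent) noise
r_changes <- cor(diff(series_a), diff(series_b))
```

Here `r_trend` is high even though the series were generated with no link between them, which is why a correlation between two trending time series says little about causation.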
What is Pearson’s product-moment correlation (r)?
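Pearson’s product-moment correlation (r) is a statistic that measures the strength and direction of the linear relationship between two continuous numerical variables. It ranges in value from -1 (perfect negative correlation) through 0 (no correlation) to +1 (perfect positive correlation).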
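As a rough sketch, r can be computed by hand in R from the deviations of each variable from its mean and compared with the built-in `cor()`. The data values are made up for illustration.

```r
# Made-up data (illustration only)
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)

# r = sum of cross-products of deviations /
#     sqrt(sum of squared x deviations * sum of squared y deviations)
dx <- x - mean(x)
dy <- y - mean(y)
r_manual <- sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))

# Built-in equivalent; the default method is "pearson"
r_builtin <- cor(x, y)
```

Both values agree, and because y increases almost perfectly linearly with x in this example, r is close to +1.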
What rules must be followed when using Pearson’s product-moment correlation?
- the first action should be to draw a scatterplot
- both variables must be continuous & normally distributed (check for normality)
- if these assumptions are not met, a Spearman’s rank-order correlation (non-parametric correlation) should be used
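The rules above could be sketched in R roughly as follows. The data are made up, and using the Shapiro-Wilk test (`shapiro.test()`) with a 0.05 cut-off is an assumption: the card only says to check for normality, not how.

```r
# Made-up paired measurements (hypothetical data)
height <- c(150, 155, 160, 162, 165, 168, 170, 172, 175, 180)
weight <- c(52, 55, 60, 59, 63, 66, 68, 70, 74, 79)

# Step 1: draw a scatterplot first
plot(height, weight)

# Step 2: check both variables for normality
# (Shapiro-Wilk is one common option; this choice is an assumption)
both_normal <- shapiro.test(height)$p.value > 0.05 &&
               shapiro.test(weight)$p.value > 0.05

# Step 3: use Pearson if both look normal, otherwise fall back to Spearman
method <- if (both_normal) "pearson" else "spearman"
result <- cor.test(height, weight, method = method)
```

With small samples a normality test has little power, so the scatterplot (and a histogram or Q-Q plot) remains worth inspecting alongside it.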
What is H0 in Pearson’s product-moment correlation (r)?
H0 in Pearson’s product-moment correlation (r) states that the two variables are not correlated.
What is H1 in Pearson’s product-moment correlation (r)?
H1 in Pearson’s product-moment correlation (r) states that the two variables are correlated.
What are the two ways to calculate Pearson’s product-moment correlation (r) in Excel?
The two ways to calculate Pearson’s product-moment correlation (r) in Excel are:
- through the Analysis Toolpak (“Correlation”)
- by using the worksheet function (“=CORREL”).
What statement is included if a Pearson correlation test is carried out?
If H0 is rejected: “There was a significant correlation between ‘variable 1’ and ‘variable 2’ (r = ___ , df = __, p < 0.05).”
If H0 is not rejected: “There was no significant correlation between ‘variable 1’ and ‘variable 2’ (r = ___ , df = __, p > 0.05).”
NOTE: do not write “reject H0” or “accept H0” in the statement itself.
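One possible way to fill in this template from an R test result is sketched below. `report_correlation` is a hypothetical helper written for this card (it is not a base R function), the data are made up, and the 0.05 threshold follows the template.

```r
# Hypothetical helper (not part of base R) that fills in the reporting template
report_correlation <- function(x, y, name_x, name_y, alpha = 0.05) {
  test <- cor.test(x, y)                  # Pearson correlation test by default
  significant <- test$p.value < alpha
  sprintf(
    "There was %s significant correlation between %s and %s (r = %.3f, df = %d, p %s %.2f).",
    if (significant) "a" else "no",
    name_x, name_y,
    unname(test$estimate),                # the correlation coefficient r
    as.integer(test$parameter),           # degrees of freedom, n - 2
    if (significant) "<" else ">",
    alpha
  )
}

# Made-up data for illustration
x <- c(1, 2, 3, 4, 5, 6)
y <- c(2.0, 4.1, 5.9, 8.2, 9.8, 12.1)
statement <- report_correlation(x, y, "variable 1", "variable 2")
```

Printing `statement` gives a sentence in the format above, with df equal to the number of pairs minus 2.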
What are 2 ways we can do the Pearson correlation test in RStudio?
> cor()
> cor.test()
(slide 7)
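A short usage sketch of the two functions, with made-up data. `cor()` returns only the coefficient, while `cor.test()` also returns the degrees of freedom and the p-value needed for the written statement.

```r
# Made-up paired measurements (hypothetical data)
x <- c(10, 12, 15, 17, 20, 22, 25)
y <- c(31, 35, 36, 42, 45, 49, 52)

r    <- cor(x, y)        # correlation coefficient only
test <- cor.test(x, y)   # coefficient plus significance test (Pearson by default)

test$estimate    # r, the same value as cor(x, y)
test$parameter   # degrees of freedom (n - 2)
test$p.value     # p-value for H0: the two variables are not correlated
```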
How much of the variation in one variable can be explained by the other variable if we express the correlation coefficient as r^2?
r^2 indicates the proportion of variance in the dependent variable that can be explained by the independent variable.
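As a small sketch with made-up numbers, r^2 can be obtained by squaring the output of `cor()`. It matches the R-squared value reported for a simple linear regression fitted with `lm()`.

```r
# Made-up data (illustration only)
x <- c(1, 2, 3, 4, 5, 6, 7, 8)
y <- c(3, 1, 4, 6, 5, 9, 7, 10)

r         <- cor(x, y)
r_squared <- r^2     # about 0.81: roughly 81% of the variance is explained

# The same quantity as reported by a simple linear regression
r_squared_lm <- summary(lm(y ~ x))$r.squared
```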
What caution should be taken when interpreting a significant correlation coefficient with a big sample size?
With a big sample size, a correlation can be highly significant (very small p-value) and yet explain only a very small percentage of the variation (small r^2). Therefore, it is important to evaluate the practical significance of the relationship between the variables, not just the p-value.
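This effect can be reproduced with a simulation. The sketch below uses invented settings (100,000 observations and a deliberately weak relationship) and is only meant to illustrate the point.

```r
# Hypothetical illustration: a huge sample with a very weak relationship
set.seed(3)
n <- 100000
x <- rnorm(n)
y <- 0.05 * x + rnorm(n)   # y depends only very slightly on x

test      <- cor.test(x, y)
p_value   <- test$p.value              # far below 0.05 because n is so large
r_squared <- unname(test$estimate)^2   # yet well under 1% of the variance explained
```

The test is statistically significant, but `r_squared` shows that x explains almost none of the variation in y.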
What method can be used to model and explain the relationship between two variables once a significant correlation is found?
Regression analysis, such as linear regression, can be used to model and explain the relationship between two variables once a significant correlation is found.
What is Spearman’s rank-order correlation (rs)?
Spearman’s rank-order correlation (rs) is a non-parametric statistical measure that describes the strength and direction of the monotonic relationship between two variables when the data is ordinal or not normally distributed.
How is the rank-order correlation coefficient calculated in Spearman’s method?
The rank-order correlation coefficient is based on comparing the rank order of the two variables. It ranks the data in each variable and then calculates the Pearson correlation coefficient on the ranks.
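The description above can be checked directly in R: ranking both variables and taking the Pearson correlation of the ranks gives the same value as asking `cor()` for the Spearman method. The data are made up for illustration.

```r
# Made-up data with a mostly increasing, non-linear relationship
x <- c(1, 2, 3, 4, 5, 6)
y <- c(2, 1, 9, 16, 30, 25)

# Step 1: rank each variable; step 2: Pearson correlation on the ranks
rs_manual <- cor(rank(x), rank(y))

# Built-in equivalent
rs_builtin <- cor(x, y, method = "spearman")
```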
How can the RANK function be used in Excel to calculate the rank-order correlation coefficient?
The RANK function can be used in Excel to rank the data in each variable. Then, the Analysis Toolpak (“Correlation”) or the worksheet function “=CORREL” can be applied to the ranks to calculate the rank-order correlation coefficient.
What do you include if a Spearman’s rank-order correlation test is carried out?
If H0 is rejected: “There was a significant correlation between ‘variable 1’ and ‘variable 2’ (rs = ___ , df = __, p < 0.05).”
If H0 is not rejected: “There was no significant correlation between ‘variable 1’ and ‘variable 2’ (rs = ___ , df = __, p > 0.05).”
What are 2 ways we can do the Spearman’s rank-order correlation test in RStudio?
> cor(comp.dat$Female, comp.dat$Male, method = "spearman")
> cor.test(comp.dat$Female, comp.dat$Male, method = "spearman")
(slide 10)
What is causal inference?
The process of drawing conclusions about the causal relationship between two variables, based on the observed data.
What is linear regression?
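Linear regression is a simple way of modeling cause and effect: it fits a straight line that describes how a dependent (response) variable changes with an independent (predictor) variable.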
What is the cause in linear regression?
The cause in linear regression is the independent or predictor variable.
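A minimal sketch of a simple linear regression in R, with the predictor (cause) on the right-hand side of the formula. The variable names `dose` and `response` and the data values are made up for illustration.

```r
# Made-up data: predictor (cause) and response (effect)
dose     <- c(1, 2, 3, 4, 5)
response <- c(2.1, 3.9, 6.2, 7.8, 10.1)

# Fit the straight line: response = intercept + slope * dose
model <- lm(response ~ dose)

coef(model)                # fitted intercept and slope
summary(model)$r.squared   # proportion of variance in response explained by dose
```

For these values the fitted slope is 1.99 and the intercept is 0.05, so each unit increase in `dose` is associated with an increase of about 2 in `response`.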