Module 2 Flashcards
(29 cards)
What is correlation?
Correlation is when a change in one variable is associated with a change in another variable
What is multiple regression?
Multiple regression is about trying to predict scores based on what we know about other (predictor) variables
What type of questions do moderation analyses attempt to answer?
What/when questions
e.g. “under what circumstances?”, “for what type of people?”, “when does the effect occur?”
What type of questions do mediation analyses attempt to answer?
How/why questions
e.g. “How does X influence Y?”, “Why does X influence Y?”
Provide 3 examples of moderation questions.
“Is the relationship between attitude towards university and where a student sits influenced by the age of the student?” - age is an inherent aspect of the individual
“Does playing violent video games for more than 6 hours a week make people more aggressive?” - 6 hours a week is a circumstance
“If employee satisfaction is high, is job turnover reduced for both male and female employees?”
Provide 3 examples of mediation questions.
“Is the relationship between attitude to university and where students sit explained by IQ?”
“Does playing violent video games that involve realistic interpersonal violence make people more aggressive?”
“If employee satisfaction is high amongst workers with high autonomy, is job turnover reduced?”
Define moderation
Moderation looks at how a third variable changes the relationship between a predictor variable and outcome variable, based on the interaction between the predictor variable and third (moderator) variable.
You can conceptualise it as a multiple regression with 3 independent variables: the original predictor variable, the moderator variable and their interaction.
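A minimal sketch of that regression, assuming ordinary least squares and made-up, noise-free data. The helper names (centre, solve, fit_moderation) and all data values are hypothetical and only illustrate the idea of regressing the outcome on the predictor, the moderator and their product:

```python
def centre(values):
    """Subtract the mean so the variable has a mean of zero."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def solve(a, b):
    """Solve a * coef = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_moderation(x, m, y):
    """Return [intercept, b_x, b_m, b_xm] from least squares on centred x and m."""
    xc, mc = centre(x), centre(m)
    # Design matrix columns: intercept, predictor, moderator, interaction.
    rows = [[1.0, xi, mi, xi * mi] for xi, mi in zip(xc, mc)]
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical data: y is built from known coefficients so the fit can be checked.
x = [1, 2, 3, 4, 5, 6]
m = [2, 1, 4, 3, 6, 5]
y = [2 + 1.5 * a + 0.5 * b + 0.8 * a * b for a, b in zip(centre(x), centre(m))]

coefs = fit_moderation(x, m, y)
print(coefs)  # the last coefficient (b_xm) is the moderation (interaction) effect
```

If the interaction coefficient is reliably different from zero, the relationship between the predictor and the outcome depends on the level of the moderator.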
Why do we need to centre variables?
To reduce multicollinearity between the interaction term and the variables it is built from (the predictor and the moderator)
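As a rough illustration, the sketch below compares how strongly a predictor correlates with its interaction term before and after centring. The data and the pearson helper are hypothetical. In real data centring typically lowers this correlation rather than removing it entirely:

```python
def pearson(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / (var_a * var_b) ** 0.5


def centre(values):
    """Subtract the mean so the variable has a mean of zero."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


# Hypothetical raw scores for a predictor (x) and a moderator (m).
x = [1, 2, 3, 4, 5, 6]
m = [2, 1, 4, 3, 6, 5]

# Correlation between the predictor and the raw product term.
r_raw = pearson(x, [xi * mi for xi, mi in zip(x, m)])

# Same correlation after centring both variables first.
xc, mc = centre(x), centre(m)
r_centred = pearson(xc, [xi * mi for xi, mi in zip(xc, mc)])

print(round(r_raw, 3), round(r_centred, 3))
```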
What should we do after we find we have a significant interaction (moderation) effect?
Perform a simple slopes analysis to find exactly where the interaction is occurring
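A minimal sketch of the arithmetic behind simple slopes, assuming a centred moderator and the common convention of probing at 1 SD below the mean, at the mean, and 1 SD above the mean. The function name and the coefficient values are hypothetical:

```python
def simple_slopes(b_x, b_xm, moderator_sd):
    """Slope of the predictor at low, mean and high levels of a centred moderator.

    With outcome = b0 + b_x*X + b_m*M + b_xm*X*M, the slope of X at a given
    moderator value M is b_x + b_xm * M.
    """
    levels = {"low": -moderator_sd, "mean": 0.0, "high": moderator_sd}
    return {name: b_x + b_xm * value for name, value in levels.items()}


# Hypothetical coefficients from a moderation regression.
slopes = simple_slopes(b_x=1.5, b_xm=0.8, moderator_sd=2.0)
for level, slope in slopes.items():
    print(level, round(slope, 2))
```

Comparing the three slopes shows at which levels of the moderator the predictor's effect is strong, weak or reversed. A full analysis would also test each slope for significance, which this sketch leaves out.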
What is the role of a mediator variable?
A mediator variable explains part or all of the relationship between two variables.
In mediation, what are the 2 antecedent variables? What are the 2 consequent variables?
Antecedent = X and M
Consequent = M and Y
Distinguish between direct effect, total effect, direct pathway and indirect pathway
Direct effect = c’ (X > Y, controlling for M)
Total effect = c (X > Y, without M in the model)
Direct pathway = X > Y
Indirect pathway = X > M > Y
How will c relate to c’ if partial mediation has occurred? How will they relate if perfect mediation has occurred?
Partial mediation - c’ will be smaller than c, but not 0
Perfect mediation - c’ = 0
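To make the relationship between c and c’ concrete, here is a small sketch that estimates the paths with ordinary least squares on made-up data. The path labels a (X > M) and b (M > Y, controlling for X) follow the usual convention but are an assumption here, as are the function name and the data. For least-squares estimates, the total effect splits exactly into direct plus indirect: c = c’ + a*b.

```python
def mediation_paths(x, m, y):
    """Estimate a, b, c and c' for a simple mediation model using least squares."""
    n = len(x)
    mean_x, mean_m, mean_y = sum(x) / n, sum(m) / n, sum(y) / n
    # Sums of squares and cross-products of the mean-centred variables.
    sxx = sum((v - mean_x) ** 2 for v in x)
    smm = sum((v - mean_m) ** 2 for v in m)
    sxm = sum((p - mean_x) * (q - mean_m) for p, q in zip(x, m))
    sxy = sum((p - mean_x) * (q - mean_y) for p, q in zip(x, y))
    smy = sum((p - mean_m) * (q - mean_y) for p, q in zip(m, y))

    a = sxm / sxx                              # X > M
    c = sxy / sxx                              # total effect: X > Y without M
    det = sxx * smm - sxm ** 2
    c_prime = (sxy * smm - smy * sxm) / det    # direct effect: X > Y controlling for M
    b = (smy * sxx - sxy * sxm) / det          # M > Y controlling for X
    return {"a": a, "b": b, "c": c, "c_prime": c_prime, "indirect": a * b}


# Hypothetical scores for predictor, mediator and outcome.
x = [1, 2, 3, 4, 5, 6]
m = [2, 1, 4, 3, 6, 5]
y = [3, 2, 6, 5, 9, 9]

paths = mediation_paths(x, m, y)
print(paths)
```

If the indirect part (a*b) accounts for some of c, c’ shrinks towards 0. If it accounts for all of c, c’ is 0.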
What are the 4 requirements that need to be fulfilled to confirm mediation has occurred? Which of them is still debated? Why?
1) There is a significant relationship between X and Y
2) There is a significant relationship between X and M
3) M still predicts Y after controlling for X
4) The strength of relationship between X and Y is reduced when M is in the equation
1) is still debated. On its own it is just a correlation between X and Y, and in the same way that correlation doesn’t imply causation, a lack of correlation doesn’t rule causation out. On this view, a significant relationship between X and Y shouldn’t be required.
What are the 5 steps of data screening and assumption testing?
1) Check data entry for accuracy
2) Evaluate missing data
3) Outliers and normality
4) Linearity, homoscedasticity, and independence
5) Multicollinearity and singularity
What are the 2 different views on how big the sample size should be?
Tabachnick and Fidell: N = 50 + (8 x number of IVs)
Stevens: N = 15 x number of IVs
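The two rules of thumb are easy to compare side by side. A small sketch (the function names are hypothetical):

```python
def n_tabachnick_fidell(num_ivs):
    """Minimum sample size under the 50 + 8 per IV rule of thumb."""
    return 50 + 8 * num_ivs


def n_stevens(num_ivs):
    """Minimum sample size under the 15 cases per IV rule of thumb."""
    return 15 * num_ivs


# Compare the two recommendations for 1 to 8 predictors.
for k in range(1, 9):
    print(k, n_tabachnick_fidell(k), n_stevens(k))
```

The first rule asks for more cases when there are few predictors, while the second becomes the stricter one once there are 8 or more.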
What would our skewness be if our distribution was normal?
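0 for a perfectly normal distribution. In practice, skewness with an absolute value of less than 1 is usually treated as acceptable.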
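For a feel of what the skewness statistic measures, here is a minimal sketch using the simple moment-based formula (third central moment divided by the variance raised to the power 1.5). Statistical packages often apply a small-sample correction, so their values can differ slightly. The data are made up:

```python
def skewness(values):
    """Moment-based skewness: 0 for symmetric data, positive for a long right tail."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n   # second central moment (variance)
    m3 = sum((v - mean) ** 3 for v in values) / n   # third central moment
    return m3 / m2 ** 1.5


symmetric = [1, 2, 3, 4, 5]       # hypothetical symmetric scores
right_skewed = [1, 1, 1, 2, 10]   # hypothetical scores with one extreme high value

print(skewness(symmetric))
print(skewness(right_skewed))
```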
What are the two plots we inspect for normality and outliers?
Boxplots and histograms
What are the two statistical tests of normality? How do we use them to know if our distribution is normal?
Kolmogorov-Smirnov (K-S) and Shapiro-Wilk tests - if they are non-significant, this is good and we have normality. If they are significant, this is a problem and we don’t have normality.
However, minor breaches in normality can often be overlooked if we have a large enough sample size.
In terms of the output, what are we looking for to determine if we have any univariate outliers?
‘Casewise Diagnostics’
What two statistics do we look at to determine if we have any multivariate outliers?
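Mahalanobis distance and Cook’s distance - if the maximum Mahalanobis distance is below the critical value (13.24 here; the cut-off depends on the number of predictors), there are no multivariate outliers. Cook’s distance needs to be less than 1 and flags any overly influential cases.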
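A minimal sketch of how the Mahalanobis check works for two predictors, assuming the sample covariance matrix (dividing by n - 1) and squared distances compared against a critical value. The function name and data are hypothetical, and the default cut-off is simply the 13.24 quoted on the card:

```python
def mahalanobis_sq(x, m):
    """Squared Mahalanobis distance of each case from the centroid of (x, m)."""
    n = len(x)
    mean_x, mean_m = sum(x) / n, sum(m) / n
    dx = [v - mean_x for v in x]
    dm = [v - mean_m for v in m]
    # Sample variances and covariance (n - 1 in the denominator).
    sxx = sum(d * d for d in dx) / (n - 1)
    smm = sum(d * d for d in dm) / (n - 1)
    sxm = sum(p * q for p, q in zip(dx, dm)) / (n - 1)
    det = sxx * smm - sxm ** 2
    # d' S^-1 d written out for the 2 x 2 case.
    return [(p * p * smm - 2 * p * q * sxm + q * q * sxx) / det
            for p, q in zip(dx, dm)]


def flag_outliers(distances, critical=13.24):
    """Indices of cases whose squared distance exceeds the critical value."""
    return [i for i, d in enumerate(distances) if d > critical]


# Hypothetical predictor scores; the last case sits far from the rest.
x = [1, 2, 3, 4, 5, 20]
m = [2, 1, 4, 3, 6, 0]

d2 = mahalanobis_sq(x, m)
most_extreme = d2.index(max(d2))
print([round(d, 2) for d in d2], most_extreme, flag_outliers(d2))
```

With only six cases no distance can get near the cut-off, so nothing is flagged here. The point is that the unusual case still has the largest distance.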
Which two plots tell us about homoscedasticity?
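P-P plots and scatterplots of the residuals.
P-P: points should be hugging the diagonal line (this mainly shows whether the residuals are normally distributed)
Scatterplot: points should be evenly distributed around 0 across the whole range, with no funnel shape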