4 introduction to mediation analysis Flashcards
What is mediation analysis?
Mediation Analysis relies on the principles of regression to investigate if the relationship between variable X and variable Y is in any way mediated by a third variable (M).
what does X do in the context of Y and how does an underlying mechanism M interact?
mediation: a change in variable X leads to a change in a mediator variable M, which subsequently changes the outcome variable Y
the mediator is itself affected by X and in turn affects Y
What is the difference between a mediator variable and confounding variables?
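a confound influences both X and Y
-> the X-Y association can appear even when X has no causal effect on Y
-> no causal chain X-M-Y: a confound is not caused by X, whereas a mediator is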
Give two examples of possible mediator paths.
e.g. A therapeutic method (X) might affect symptoms experienced after the termination of therapy (Y) because the method influences how people interpret negative events that occur in life (M), and those interpretations then influence the extent to which symptoms are manifested.
e.g. traumatic experiences (X) might negatively influence happiness one gets from interpersonal interactions (Y) because traumatic experiences result in the manifestation of certain behaviors that others find uncomfortable to witness (M), and this in turn produces less pleasant interactions
What are the pathways of a simple mediation model?
two pathways:
indirect effect through M
direct effect X on Y
Does M have a causal influence on Y?
Yes, M causally influences Y and so accounts for part of the variation in Y
however, the causal influence does not eliminate the association between X and Y
M = mediator variable, intermediary variable, surrogate variable, intermediate endpoint
Do X and Y need to be associated for a possible mediation?
“lack of correlation does not disprove causation”
“correlation is neither a necessary nor a sufficient condition of causality”
-> no longer a precondition that X and Y have simple association
EXAMPLE: Consider a scenario where a new educational program (X) is designed to improve students’ test scores (Y) by increasing their motivation (M). If the program doesn’t directly improve test scores but significantly boosts motivation, which in turn leads to better scores, a direct correlation between X (program) and Y (scores) might be weak or absent. However, the program still has a causal effect on the scores through the mediator (motivation).
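One situation in which the simple X-Y association disappears completely although X still affects Y through M is when the direct and the indirect effect have opposite signs and cancel. The sketch below illustrates this with a tiny constructed data set; the numbers, the variable names and the helper ols are assumptions made for illustration only, not data from the notes.

```python
import numpy as np

def ols(y, *predictors):
    """Return OLS coefficients [intercept, slopes...] for y on the predictors."""
    design = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

# Constructed (hypothetical) data: e_m is uncorrelated with X by design.
X = np.array([-1.0, -1.0, 1.0, 1.0])
e_m = np.array([-1.0, 1.0, -1.0, 1.0])
M = 2.0 * X + e_m           # X -> M with a = 2
Y = -3.0 * X + 1.5 * M      # direct effect c' = -3, b = 1.5, so ab = 3

a = ols(M, X)[1]            # effect of X on M
_, c_prime, b = ols(Y, X, M)  # direct effect of X and effect of M
c = ols(Y, X)[1]            # total effect of X on Y
r_xy = np.corrcoef(X, Y)[0, 1]

print(f"ab = {a * b:.2f}, c' = {c_prime:.2f}, c = {c:.2f}, r(X, Y) = {r_xy:.2f}")
```

Here the indirect effect ab is clearly nonzero, yet the total effect c and the correlation between X and Y are zero up to rounding, because c´ and ab cancel.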
What if X and M interact with each other? Does it change the statistical analysis?
if effect of M on Y is not straightforward
-> changes depending on X
-> this needs to be accounted for
-> include an interaction term XM (like in moderation analysis)
-> coefficient b needs to be reconsidered
-> direct effect of X on Y is affected
-> there is no longer a simple direct effect, because this changes depending on M
(key difference of mediation to moderation analysis!)
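A minimal sketch of what including the XM term does, assuming noise-free made-up data generated from Y = 1 + 0.5X + 2M + 0.3XM. The names c1, b1, b3 and the two helper functions are hypothetical labels chosen for this example. It shows that the effect of X on Y and the effect of M on Y are no longer single numbers once the interaction is in the model.

```python
import numpy as np

# Hypothetical noise-free data generated from Y = 1 + 0.5*X + 2*M + 0.3*X*M.
X = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
M = np.array([1.0, 2.0, 3.0, 2.0, 4.0, 5.0])
Y = 1.0 + 0.5 * X + 2.0 * M + 0.3 * X * M

# Model for Y with the interaction term XM added as an extra predictor.
design = np.column_stack([np.ones(len(Y)), X, M, X * M])
coef, *_ = np.linalg.lstsq(design, Y, rcond=None)
i_y, c1, b1, b3 = coef

def direct_effect_of_x(m):
    """Effect of a one-unit change in X on Y when M is fixed at m."""
    return c1 + b3 * m

def effect_of_m(x):
    """Effect of a one-unit change in M on Y when X is fixed at x."""
    return b1 + b3 * x

for m in (1.0, 2.0, 3.0):
    print(f"direct effect of X at M = {m}: {direct_effect_of_x(m):.2f}")
for x in (0.0, 1.0):
    print(f"effect of M on Y at X = {x}: {effect_of_m(x):.2f}")
```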
Should there be testing for a possible interaction XM?
No
selective testing
evidence-based decision
no reason for prioritisation
-> equal possibility for correlations!
overfitting a model is unnecessary
What is sufficient to conclude an indirect effect/mediation of X-M-Y?
A rejection of the null hypothesis that the indirect effect is zero (or an interval estimate that doesn’t include zero) is sufficient to support a claim of mediation of the effect of X on Y through M.
tests of significance for the individual paths a and b are not required to determine whether M mediates the effect of X on Y, contrary to the causal steps logic which requires that both a and b are statistically significant.
Indeed, one does not even need to establish that the total effect of X as quantified by c is different from zero, since the size of c does not determine or constrain the size of ab.
⇒ Rather, all that matters is whether ab is different from zero by some kind of inferential standard such as a null hypothesis test or confidence interval.
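One common way to obtain an interval estimate for ab is a percentile bootstrap. The notes do not name a specific method, so the sketch below is an assumption about how such an interval could be computed. The data are simulated, and the sample size, effect sizes, seed and number of resamples are arbitrary illustrative choices.

```python
import numpy as np

def indirect_effect(x, m, y):
    """Estimate ab from the two OLS models M ~ X and Y ~ X + M."""
    ones = np.ones(len(x))
    a = np.linalg.lstsq(np.column_stack([ones, x]), m, rcond=None)[0][1]
    b = np.linalg.lstsq(np.column_stack([ones, x, m]), y, rcond=None)[0][2]
    return a * b

# Simulated (hypothetical) data with a true indirect effect of 0.5 * 0.7.
rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
m = 0.5 * x + rng.normal(size=n)
y = 0.7 * m + rng.normal(size=n)

ab_hat = indirect_effect(x, m, y)

# Percentile bootstrap: resample cases with replacement, re-estimate ab each time.
boot = np.empty(2000)
for k in range(boot.size):
    idx = rng.integers(0, n, size=n)
    boot[k] = indirect_effect(x[idx], m[idx], y[idx])
ci_low, ci_high = np.percentile(boot, [2.5, 97.5])

print(f"ab = {ab_hat:.3f}, 95% bootstrap CI = [{ci_low:.3f}, {ci_high:.3f}]")
print("interval excludes zero:", ci_low > 0 or ci_high < 0)
```

Only one inferential statement is made here, about ab itself. No separate tests of a, b or c are needed.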
What are three principles of mediation analysis?
- empirical claims should be based on a quantification of the effect most directly relevant to that claim
- if ab quantifies the movement of Y by X through M, measure that
not a and b separately - separate tests showing that a and b each differ from zero are not a test of whether ab does
- a claim should be based on as few inferential tests as required in order to support it
- fallible by nature
- why require three, when you can do one for ab
- convey information about the uncertainty attached to estimates of quantities
- an interval estimate says more than a dichotomous decision about whether M mediates
What is evidence of an existing mediation effect?
⇒ if the effect of X on Y when M is held constant (coefficient c’ in equation (3), called the direct effect of X) is closer to zero than is X’s effect without controlling for M (coefficient c in equation (1), the total effect of X), then M can be deemed a mediator of X’s effect on Y.
⇒ if M is held constant, the magnitude of the direct effect of X on Y diminishes
What is partial mediation, what is complete mediation?
partial mediation = pattern of findings in which mediation is established, the total effect of X is significant, and the direct effect of X (c´) is still different from zero
effect of X-Y is not fully explained by X-M-Y
complete/full mediation = all of the effect of X on Y is carried through the mediation process, meaning ab=c and c´=0
Which two linear models are required for a mediation analysis?
see notes.
M = im + aX + em
Y = iy + c´X + bM + ey
a = X on M
b = M on Y
c´= X on Y
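A minimal sketch of how the two models can be estimated with OLS. The data are constructed by hand so that the error terms are exactly uncorrelated with the predictors, which lets the true coefficients be recovered exactly. The numbers and the helper ols are assumptions for illustration.

```python
import numpy as np

def ols(y, *predictors):
    """Return OLS coefficients [intercept, slopes...] for y on the predictors."""
    design = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

# Constructed (hypothetical) data; e_m and e_y are orthogonal to the predictors.
X = np.array([0.0, 0.0, 1.0, 1.0, 2.0, 2.0])
e_m = np.array([-1.0, 1.0, -1.0, 1.0, -1.0, 1.0])
e_y = np.array([1.0, 1.0, -2.0, -2.0, 1.0, 1.0])
M = 1.0 + 2.0 * X + e_m                    # M = im + aX + em
Y = 3.0 + 0.5 * X + 1.5 * M + 0.2 * e_y    # Y = iy + c'X + bM + ey

i_m, a = ols(M, X)               # model 1: M regressed on X
i_y, c_prime, b = ols(Y, X, M)   # model 2: Y regressed on X and M

print(f"a = {a:.2f}, b = {b:.2f}, c' = {c_prime:.2f}, ab = {a * b:.2f}")
```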
What is OLS regression analysis?
fundamental statistical method used to estimate the relationships between a dependent variable and one or more independent variables
it finds the linear equation that best predicts the dependent variable from the independent variables, i.e. the one that minimises the sum of squared residuals
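A small sketch of what "best predicts" means in OLS, using five made-up data points. The slope and intercept come from the closed-form least-squares formulas and are compared with NumPy's least-squares solver. The helper ssr (sum of squared residuals) is a hypothetical name used only here.

```python
import numpy as np

# Made-up data for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Closed-form least-squares estimates for one predictor.
slope = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
intercept = y.mean() - slope * x.mean()

def ssr(i, s):
    """Sum of squared residuals for the line y = i + s * x."""
    return float(np.sum((y - (i + s * x)) ** 2))

# The same estimates from a general least-squares solver.
design = np.column_stack([np.ones(len(x)), x])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)

print(f"intercept = {intercept:.2f}, slope = {slope:.2f}")
print("solver agrees:", np.allclose(coef, [intercept, slope]))
print("SSR at OLS line:", round(ssr(intercept, slope), 4))
print("SSR with slope shifted by 0.1:", round(ssr(intercept, slope + 0.1), 4))
```

Any other line, such as the one with the shifted slope, has a larger sum of squared residuals than the OLS line.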
What are some assumptions the OLS regression analysis makes?
- Linearity: The relationship between the independent and dependent variables is linear.
- Independence: The residuals (errors) are independent of each other.
- Homoscedasticity: The variance of the error terms is constant across all levels of the independent variables.
- Normality: The residuals are normally distributed (particularly important for hypothesis testing regarding coefficients).
What is the direct effect of X on Y?
c´ = adjusted mean difference
two cases that differ by one unit on X but are equal on M are estimated to differ c´ units on Y
-> adjusted for M (held constant)
Why is M held constant in the estimation of the direct effect of X on Y?
Keeping M constant (or controlling for M) ensures that the direct effect of X on Y is isolated. This way, we can see how X influences Y directly, not through its effect on M. It’s like holding all other variables steady to focus solely on the relationship between X and Y.
Does X have to be dichotomous in a simple mediation analysis?
In a simple mediation model, X can be any of the following:
- Continuous: For instance, X could represent hours of study, dosage of a medication, or levels of stress, where X takes on a range of values.
- Dichotomous: X has two categories, for instance a treatment group versus a control group.
- Categorical with More than Two Levels: For example, X could represent types of diets (vegetarian, vegan, omnivore) or levels of education (high school, bachelor’s, master’s, PhD).
When X is continuous, the mediation effect explains how a change in X (e.g., an increase in one unit of X) is associated with a change in Y, mediated by M.
When X is categorical, the mediation effect explains how being in one category of X compared to another (or others) is associated with changes in Y, mediated by M.
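For a categorical X with three levels, one possible approach is to represent X by dummy variables with one level as the reference, so that each dummy gets its own a path while M keeps a single b path. The sketch below assumes this dummy coding and uses made-up, noise-free data. The group labels and the names D1, D2, a1, a2 are hypothetical.

```python
import numpy as np

# Hypothetical data: three groups (0 = reference), two cases per group.
group = np.array([0, 0, 1, 1, 2, 2])
D1 = (group == 1).astype(float)   # dummy: group 1 vs reference
D2 = (group == 2).astype(float)   # dummy: group 2 vs reference
M = np.array([1.0, 3.0, 4.0, 6.0, 9.0, 11.0])
Y = 0.5 * M + 1.0 * D1 + 2.0 * D2   # b = 0.5, direct effects 1 and 2

ones = np.ones(len(group))

# Model 1: M regressed on the dummies gives one a path per comparison.
_, a1, a2 = np.linalg.lstsq(np.column_stack([ones, D1, D2]), M, rcond=None)[0]

# Model 2: Y regressed on the dummies and M gives the direct effects and b.
_, c1, c2, b = np.linalg.lstsq(np.column_stack([ones, D1, D2, M]), Y, rcond=None)[0]

print(f"group 1 vs reference: indirect = {a1 * b:.2f}, direct = {c1:.2f}")
print(f"group 2 vs reference: indirect = {a2 * b:.2f}, direct = {c2:.2f}")
```

Each indirect effect here is relative to the reference category, matching the idea of comparing one category of X with another.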
What is the indirect effect of X on Y through M?
the combined effect of X on M and M on Y
ab
a = how much do two cases that differ by one unit on X, differ on M?
b = how much do two cases, that differ by one unit on M but are equal on X, differ on Y?
What is the total effect of X on Y?
c´+ ab
the combined effect of X on M and M on Y added to the effect of X on Y
Y = iy + c´X + bM + ey
substituting M = im + aX + em gives:
Y = (iy + bim) + (ab + c´)X + (ey + bem)
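The decomposition of the total effect can be checked numerically. In the sketch below, the simulated data, seed and effect sizes are arbitrary assumptions. The total effect c from regressing Y on X alone equals c´ + ab from the two mediation models, and the intercept of that regression equals iy + b·im, when all models are fitted by OLS on the same cases.

```python
import numpy as np

def ols(y, *predictors):
    """Return OLS coefficients [intercept, slopes...] for y on the predictors."""
    design = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

# Simulated (hypothetical) data with noise in both M and Y.
rng = np.random.default_rng(42)
n = 100
X = rng.normal(size=n)
M = 0.4 * X + rng.normal(size=n)
Y = 0.3 * X + 0.6 * M + rng.normal(size=n)

i_m, a = ols(M, X)               # M = im + aX + em
i_y, c_prime, b = ols(Y, X, M)   # Y = iy + c'X + bM + ey
i_total, c = ols(Y, X)           # Y regressed on X only (total effect model)

print(f"c = {c:.4f}, c' + ab = {c_prime + a * b:.4f}")
print(f"intercept = {i_total:.4f}, iy + b*im = {i_y + b * i_m:.4f}")
```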
What would mediation analysis look like if multiple covariates are included?
Y = iy + cX + c1U1 + c2U2 + …
U = covariates
multiple independent variables, that all have a direct effect cx on Y
all Us are mediated by M
so there are multiple indirect effects
M solely influences Y
(only one b, but multiple a)
What is the concept of epiphenomenal associations in the context of multiple mediator models?
epiphenomenal = explanation of an association between two variables
refers to a phenomenon that occurs alongside or in parallel to another process but does not directly influence or contribute to the primary process
Applied to mediation analysis, if your proposed mediator M1 is not actually mediating the effect of X on Y yet is correlated with M2, which is a mediator of the effect of X on Y, a mediation analysis with M1 but not M2 in the model may nevertheless reveal a significant indirect effect of X on Y through M1.
What are recommendations to account for these epiphenomenal relations?
investigators interested in mediation through more than one mediator do so by estimating all the indirect effects in one multiple mediator model
→ maximizing correspondence between theory and model
→ one indirect effect may be epiphenomenal (explanation for association between two variables, merely correlated, no causation)
→ it is possible to compare the size of indirect effects through different mediators
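A simulation sketch of an epiphenomenal association and of why estimating both mediators in one model helps. All values are assumptions made for illustration. Only M2 affects Y, while M1 is merely correlated with M2 through X and through a shared source of variation called shared.

```python
import numpy as np

def slopes(y, *predictors):
    """Return the OLS slopes (without the intercept) for y on the predictors."""
    design = np.column_stack([np.ones(len(y)), *predictors])
    return np.linalg.lstsq(design, y, rcond=None)[0][1:]

# Simulated (hypothetical) data.
rng = np.random.default_rng(1)
n = 5000
x = rng.normal(size=n)
shared = rng.normal(size=n)                 # variation common to M1 and M2
m2 = 0.8 * x + shared                       # true mediator
m1 = 0.8 * x + 0.6 * shared + rng.normal(size=n)  # correlated with M2 only
y = 0.7 * m2 + rng.normal(size=n)           # only M2 affects Y

a1 = slopes(m1, x)[0]
a2 = slopes(m2, x)[0]

# Model with M1 only: M1 picks up the effect that really runs through M2.
b1_alone = slopes(y, x, m1)[1]
indirect_m1_alone = a1 * b1_alone

# Model with both mediators: each indirect effect is adjusted for the other.
_, b1_full, b2_full = slopes(y, x, m1, m2)
indirect_m1_full = a1 * b1_full
indirect_m2_full = a2 * b2_full

print(f"M1 only model:   indirect via M1 = {indirect_m1_alone:.3f}")
print(f"both mediators:  indirect via M1 = {indirect_m1_full:.3f}")
print(f"both mediators:  indirect via M2 = {indirect_m2_full:.3f}")
```

With M2 left out, the indirect effect through M1 looks substantial. With both mediators in the model, it shrinks towards zero and the two indirect effects can be compared directly.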
How can statistical inference be made from this simple mediation model?
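before: c = c´ + ab
→ these are sample-specific instantiations (estimates)
true values: tc, tc´, tatb
→ the estimates describe associations between variables in the data available
→ inference asks about generalisability beyond the sample to the true values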