SEM- structural equation modelling Flashcards Preview


Flashcards in SEM- structural equation modelling Deck (55)

What is path analysis?

• Path analysis is a very simple form of Structural Equation Modeling (SEM).
• We would typically use the term ‘path analysis’ when we are modeling observed variables.
• This means we have a single measure of the construct e.g. word vocabulary test.
• The term SEM is more often used when we have multiple indicators of a construct and we create latent variables.
• In its most basic form, path analysis is a simple extension of multiple regression.
• Path analysis is typically used to:
o Examine the size and direction of direct and indirect effects between multiple variables
o Examine the goodness of model fit between the researcher’s hypothesised model and the observed data
o Compare the observed model fit of competing theoretical models


Software used for SEM?

AMOS: simplest program to begin with; has a graphical module which allows relatively easy specification of models


What is a mediated multiple regression?

• In a mediation model, the relationship between an IV and outcome is accounted for or ‘mediated’ by a third variable i.e. a mediator variable.
• Mediation implies a ‘causal chain’ series of relationships between the three variables i.e. IV – Mediator - DV.
• The researcher must have clear theoretical or logical grounds for choosing the mediator and IV variables.


What are the requirements for a mediation?

1. predictor (X) must predict mediator (Z)
2. mediator (Z) must predict criterion (Y)
3. predictor (X) must predict criterion (Y)
4. the X,Y relationship must ‘shrink’ in the presence of Z


How to assess mediation?

• The predictor --> outcome beta weight should be 0 (or at least nonsignificant) for full mediation when the mediator is in the model, i.e. the relationship between IV and DV should be fully accounted for by the indirect effect via the mediator.
• Researchers often make a case for partial mediation if the beta weight drops substantively but does not reach 0.
• Sobel test of the indirect effect (more on this later).
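
As a rough sketch of the 'shrink' check above, the standardised paths of a one-mediator model can be computed directly from three correlations. The correlation values below are hypothetical, chosen only so the paths come out as round numbers; `standardized_paths` is an illustrative helper, not part of any SEM package.

```python
def standardized_paths(r_xz, r_zy, r_xy):
    """Standardised paths for a one-mediator model X -> Z -> Y.

    Uses the standard two-predictor formulas for standardised betas.
    """
    a = r_xz                       # X -> Z (one predictor, so beta equals r)
    c_total = r_xy                 # X -> Y with the mediator left out
    denom = 1.0 - r_xz ** 2
    b = (r_zy - r_xz * r_xy) / denom         # Z -> Y, controlling for X
    c_direct = (r_xy - r_xz * r_zy) / denom  # X -> Y, controlling for Z
    return a, b, c_total, c_direct


# Hypothetical correlations (assumed for illustration only)
a, b, c_total, c_direct = standardized_paths(r_xz=0.34, r_zy=0.5748, r_xy=0.39)
indirect = a * b

print("X -> Y without mediator:", round(c_total, 2))
print("X -> Y with mediator:   ", round(c_direct, 2))
print("indirect effect (a * b):", round(indirect, 2))
```

If the X --> Y weight with the mediator in the model is smaller than without it but still clearly above 0, that pattern is what would usually be described as partial rather than full mediation.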


What are the advantages of SEM over running multiple ordinary least squares regressions?

• While we can run a series of OLS (ordinary least squares) regression models to examine structural path models, the analysis can become very complex when testing a large number of effects.
• Modern SEM software programs use maximum likelihood based methods to calculate effects simultaneously.
o Simpler and quicker estimation of model effects
o Obtain global model fit indices that can confirm or disconfirm whether your model fits the data.
o Encourage researcher to specify causal relations between variables beforehand
o More direct and easier tests of alternative theoretical models (model trimming and building) and their fit
o Parameter estimates are better estimated in one go if possible, rather than estimating in multiple steps as bias is introduced in unnecessary multiple step estimation.


Steps in path analysis

1. Specify the model
2. Model identification
3. Model estimation
4. Evaluate model fit
5. Interpret model effects
6. Modifying the model
• Examining alternative model


How should you specify a path model

• You should use theory and/or previous research, as well as logical relations between variables, to justify your path model
• (so this is a confirmatory rather than an exploratory technique).


Circles, squares, single-headed arrows and double-headed arrows in path diagrams

o In path diagrams, observed variables are typically represented by squares.
o Latent (unobserved) variables are typically represented by circles (or an ellipse).
o Single-headed arrows represent causal relationships between variables.
o Curved double-headed arrows represent correlations between two or more exogenous variables.


What are recursive and non-recursive models?

o Recursive models
 Models where all causal pathways are moving in the same direction i.e. effects are uni-directional.
 The most common form of model
 Always identified
o Non-recursive models
 Models where there are reciprocal relationships between variables (not referring to correlations!)
 complex to analyse
 Identification issues can be very problematic in complex nonrecursive models
 Not as common in the psychology literature


What are exogenous variables?

 These variables are considered as IVs in the model
 They have no specified predicted cause in the model, hence they have no single-headed arrow input.
 You can have multiple exogenous variables in the model; these are usually free to correlate with each other, although you can specify that they be uncorrelated.
 Correlations between two or more exogenous variables are represented by a curved double-headed arrow between variables.


What are endogenous variables?

o Endogenous variables
 These variables are considered DVs in the model
 Will have a directional arrow coming in, and may also have one or more directional arrows moving away if it is a mediator variable.
 Downstream variables caused by the exogenous variables
 Each endogenous variable will also typically have an error or disturbance term associated with it.
 This reflects that there are also unmeasured and unspecified causal effects on these variables


What are latent variables?

o Latent variables
 Each endogenous variable will also typically have an error or disturbance term associated with it.
 This reflects that there are also unmeasured and unspecified causal effects on these variables
 These disturbance terms are usually modeled as latent variables, hence they are represented by circles


What is model identification?

• In SEM a model is specified, then parameters (variances and covariances of IVs and regression coefficients) for the model are estimated using sample data, and the parameters are used to produce the estimated population covariance matrix.
• However, in order to be estimated, a path model must be ‘identified’.
• This means there needs to be sufficient unique pieces of information (i.e. correlations in the observed data) to allow mathematical estimation of the model that has been specified.
• A model is said to be identified if it is possible to estimate each of the unknown parameters, i.e. there must be at least as many known values as unknown parameters.


How can you check model identification for observed variable path models?

o Calculate the number of ‘data points’ in the SEM, i.e. the number of non-redundant sample variances and covariances (all possible pathways between variables).
o Simple formula: v(v + 1)/2, where v = no. of observed variables
o Count all of the model pathways (the parameters to be estimated)
o Compare these two numbers
o The difference between them (data points minus estimated parameters) is the df of the model
o Basic rule: the maximum number of possible pathways between observed variables must equal or exceed the number of paths specified (drawn/included) in the model. This is the same as saying that df must be greater than or equal to 0.
When explaining, always state what just identified, over identified and under identified mean.
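
A minimal sketch of the counting rule above. It assumes that 'model pathways' means all freely estimated parameters (paths plus variances and covariances), which is the usual convention; the function names and the three-variable examples are illustrative only.

```python
def data_points(v):
    """Number of non-redundant sample variances and covariances."""
    return v * (v + 1) // 2


def model_df(v, free_parameters):
    """Degrees of freedom: known values minus estimated parameters."""
    return data_points(v) - free_parameters


def identification_status(df):
    if df < 0:
        return "under-identified"   # more unknowns than knowns: cannot be estimated
    if df == 0:
        return "just-identified"    # saturated: reproduces the data perfectly
    return "over-identified"        # fit can be tested


# Hypothetical 3-variable model X -> Z -> Y with no direct X -> Y path:
# 2 paths + 1 exogenous variance + 2 disturbance variances = 5 parameters.
df_reduced = model_df(v=3, free_parameters=5)
# Adding the direct X -> Y path gives 6 parameters.
df_full = model_df(v=3, free_parameters=6)

print(df_reduced, identification_status(df_reduced))
print(df_full, identification_status(df_full))
```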


What is model estimation?

1) Once you have specified and constructed your model, you are ready to estimate your model
2) We are primarily interested in two facets of the model:
1) The direct and indirect effects between variables
2) Global model fit
(The analogue to OLS regression would be interest in: 1) Regression coefficients for individual predictors 2) Test of overall regression model fit i.e. ANOVA for R²)


What are the direct effects in a path model?

o Can test the significance of (unstandardised) direct effects
o Should consider the magnitude of direct effects, not just significance. Use past research as a guide
o Use rules of thumb e.g. Cohen: .10 small, 0.30 medium, 0.50 large.
o The path regression coefficients (the standardised coefficients) reflect direct relations between one variable and another (controlling for the effect of any other variable also affecting the endogenous variable).
o These are the same as beta weights in normal MR. We can obtain these by simply running separate OLS regression models.


What are the indirect effects in a path model?

o Effects of one variable on another variable via a mediator variable.
o In a standard one-mediator mediated regression there is one indirect effect – the effect of the IV on the DV via the mediator.
o The strength of an indirect effect is obtained by multiplying the constituent direct paths (or the two direct paths that make up the indirect path) i.e. N --> Avoid = .34 and Avoid --> Dep = .5, so the indirect effect of N --> Dep via Avoid is .34 * .5 = .17


How can we fully interpret indirect effects in a path model?

 Imagine an indirect effect of Neuroticism on Depression via avoid is (0.34 * 0.50) = 0.17.
 What this means is that N has a 0.34 direct effect on Avoid, but only 0.5 of this is transmitted to Dep via Avoid i.e. 0.17.
 This means we can expect an increase in Dep of 0.17 SD units for every 1 SD unit increase in N, via the effects on Avoid.
We would also mention how much the direct effect shrinks when the indirect effect is taken into account: is it a full or partial mediation?
 Can be difficult to calculate statistical tests with two or more mediators


What is the sobel test

The Sobel test is often used to test the significance of the indirect effect with one mediator
 It is a z test on the ratio of the unstandardised indirect effect to its standard error, and is only useful with fairly large samples
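
A minimal sketch of the Sobel z test, using the commonly cited first-order standard error sqrt(b² * se_a² + a² * se_b²). The coefficients and standard errors below are hypothetical unstandardised values, not taken from any output in this deck.

```python
import math


def sobel_test(a, se_a, b, se_b):
    """Sobel z test for the indirect effect a * b (one mediator).

    a, se_a: unstandardised IV -> mediator path and its standard error
    b, se_b: unstandardised mediator -> DV path and its standard error
    """
    indirect = a * b
    se_indirect = math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)
    z = indirect / se_indirect
    # Two-tailed p value from the standard normal distribution
    p = math.erfc(abs(z) / math.sqrt(2))
    return indirect, se_indirect, z, p


# Hypothetical values (assumed for illustration only)
indirect, se_indirect, z, p = sobel_test(a=0.34, se_a=0.05, b=0.50, se_b=0.06)
print("indirect effect:", round(indirect, 3))
print("z:", round(z, 2), "p:", p)
```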


How do we calculate total effects in a path model?

o Total effects represent the total causal effect of one variable on another.
o This is calculated by summing all of the direct and indirect effects
o In our earlier example: the total effect of N on Dep
 = direct effect + indirect effect
 = 0.22 + 0.17 = 0.39
o Total effect of Avoid on Dep is simply -0.03, as there are no indirect effects in this pathway.
(If you have AMOS output, you also get a total effects table here)


Where are the unstandardised regression weight in AMOS output?

o In the regression weights table, the ‘estimate’ column gives you the unstandardised regression weights, corresponding to the ‘B’ column in SPSS


Where are the standardised regression weights in AMOS output?

o The standardised regression weights table gives you the beta values that go on the diagram. You also get a different table called standardised direct effects, which gives you the same numbers.
p values are not given in these tables; they are in the regression weights table with the unstandardised weights.


Indirect effects in more complex models- tracing rules

o There are several ways N can affect Dep indirectly in this model – we need to trace these through the model
o Tracing rules:
 You cannot enter and exit a variable on an arrowhead (I think this means, for example, that when you go ‘N --> Avoid --> Cog Inflex --> Dep’, you can’t then go ‘Trait Anx --> Cog Inflex --> Avoid --> Dep’, because you would be reversing the way that you analyse the relationship between Avoid and Cog Inflex, i.e. in the first one it’s ‘Avoid --> Cog Inflex’ and in the second one it’s ‘Cog Inflex --> Avoid’).


Total effects in more complex models

1 direct effect = .30
2 indirect effects
 N – Avoid – Dep: 0.34 * -0.03 = -0.01
 N – Avoid – Cog Inflex – Dep: 0.34 * 0.15 * .45 = 0.02

o The total causal effect for N on Dep is the sum of the direct and indirect effects
 Direct effect = 0.30
 Sum of indirect effects = (-0.01 + 0.02) = 0.01
 Total effect = 0.31
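
The arithmetic above can be sketched as a small path tracer that follows single-headed arrows forward and multiplies the coefficients along each route. The coefficients are the ones listed on this card; the helper `directed_routes` is illustrative and assumes a recursive model (no loops), otherwise it would not terminate.

```python
# Standardised direct paths from the example above
paths = {
    ("N", "Avoid"): 0.34,
    ("N", "Dep"): 0.30,
    ("Avoid", "Dep"): -0.03,
    ("Avoid", "CogInflex"): 0.15,
    ("CogInflex", "Dep"): 0.45,
}


def directed_routes(paths, start, end):
    """Yield (route, product of coefficients) for every directed route."""
    def walk(node, route, product):
        if node == end:
            yield route, product
            return
        for (src, dst), coef in paths.items():
            if src == node:
                yield from walk(dst, route + [dst], product * coef)
    yield from walk(start, [start], 1.0)


routes = list(directed_routes(paths, "N", "Dep"))
direct = sum(p for route, p in routes if len(route) == 2)
indirect = sum(p for route, p in routes if len(route) > 2)
total = direct + indirect

for route, p in routes:
    print(" --> ".join(route), round(p, 2))
print("total effect:", round(total, 2))
```

This only traces causal (single-headed) routes, so it gives total effects; it does not handle correlations between exogenous variables.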


What is the best support for a causal path/SEM model?

4 empirical conditions must be met to support causal inference (e.g. Kline, 2005/11)
o Relationship: X should be correlated with Y
o Temporal precedence: X must precede Y in time
o Non-spuriousness: X-Y relationship should hold after controlling for other variables experimentally or statistically e.g. third variable issue
o Correct effect priority: there are no reciprocal relationships between X and Y, or reversals of this relationship

 The more well-specified a model is in terms of theory, logic etc., the more persuasive the case for it being a real model if it turns out to ‘fit’ the data. Specifying the direction of relationships a priori strengthens the plausibility of the model. Path weights can also be specified a priori.

Omitting paths from model can also help with plausibility- see parsimony


What is a parsimonious model? Why is it good?

• df is the difference between the full and the reduced model
• The reduced model is simpler and more parsimonious
• Parsimonious models (if plausible) have several advantages:
o (i) Simplest (but sufficient) models preferred in science
 Occam's razor - ‘all other things being equal, the simplest model is the most preferred’
o (ii) It is easy for a reduced model to be a statistically worse fit than the full model, so if it survives this test of fit it has more credibility as a plausible model


What is meant by model fit?

o Bad model fit – model doesn’t explain data as well as other models might (e.g. a model with paths dropped/added) - refine or discard model
o Good model fit – fails to disconfirm your model, the data is well explained by the paths specified in the model.
But remember, even if you have a good model, ‘fit’ is only with reference to the variables in your model:
(i) an alternative model with different specification of paths might be even better – still worth testing alternative models
(ii) maybe there is a more complete model (more variables)


How is model fit typically assessed?

o Remember that correlation = direct+indirect+unanalysed effects; i.e. summing all effects will give original sample correlation
o Sample correlation - If all possible paths (i.e. effects) are included in model (saturated model) they will sum to original sample correlation
o Implied correlation - if only some paths estimated (reduced model), sum of effects will not automatically equal sample correlation – but give a predicted or implied correlation
o Most measures of model fit are based on the discrepancy between sample and implied correlations (residual correlations)
• If the correlations from the saturated model are not that different from those of the reduced model, then you have a ‘good’ model
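
A minimal sketch of the sample versus implied correlation idea, for a reduced model X --> Z --> Y that omits the direct X --> Y path. In that model the implied X-Y correlation is just the product of the two standardised paths. The sample correlations below are hypothetical.

```python
def implied_r_xy_full_mediation(r_xz, r_zy):
    """Implied X-Y correlation when X affects Y only through Z.

    With a single predictor per equation, each standardised path
    equals the corresponding sample correlation.
    """
    return r_xz * r_zy


# Hypothetical sample correlations (assumed for illustration only)
sample = {"r_xz": 0.34, "r_zy": 0.5748, "r_xy": 0.39}

implied = implied_r_xy_full_mediation(sample["r_xz"], sample["r_zy"])
residual = sample["r_xy"] - implied  # residual correlation

print("sample r_xy: ", sample["r_xy"])
print("implied r_xy:", round(implied, 3))
print("residual:    ", round(residual, 3))
```

A residual this far from 0 would suggest the reduced model leaves out something important (here, the direct path); a residual near 0 would count in the reduced model's favour.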


What are reduced/saturated models?

• Saturated model
o all paths are estimated in the model (0 df)
o the sample correlation matrix can therefore be reproduced perfectly (by adding up all effects)
• Reduced model (called ‘default’ in AMOS)
o not all paths are included (e.g. earlier example)
o implied or predicted correlations therefore usually different from sample correlations