SEM Flashcards

1
Q

What is path analysis typically used for?

A

Examine the size and direction of direct and indirect effects between multiple variables

Examine the goodness of model fit between the researcher's hypothesised model and the observed data

Compare the observed model fit of competing theoretical models

2
Q

Path analysis is what to structural equation modelling?

A

Path analysis is a very simple form of structural equation modelling

3
Q

When is the term ‘path analysis’ typically used?

A

When we are modelling observed variables

This means we have a single measure of the construct, e.g. a word vocabulary test

4
Q

When is the term SEM used?

A

When we have multiple indicators of a construct and we create latent variables

5
Q

How is confirmatory factor analysis used in SEM?

A

Confirmatory factor analysis is used to create a measurement model

In SEM we then examine the relationship between these latent variables

6
Q

Outline a full SEM model

A

A full SEM is simply a combination of a measurement model (confirmatory factor analysis) and a structural model (path analysis)

7
Q

Give an example of a simple path model

A

A mediated regression is a simple path model

In a mediated model, the relationship between an IV and an outcome is accounted for, or ‘mediated’, by a third variable

Mediation implies a causal chain of relationships between the three variables. (The researcher must have clear theoretical or logical grounds for choosing the mediator and the IV.)

8
Q

What are the requirements for mediation?

A
  1. Predictor (X) must predict mediator (Z)
  2. Mediator (Z) must predict criterion (Y)
  3. Predictor (X) must predict criterion (Y)
  4. The X and Y relationship must shrink in the presence of (Z)
9
Q

When assessing a mediation effect, if the relationship between the predictor and the criterion shrinks (the beta weight gets smaller) but is still significant, what does this mean?

A

Possibly partial mediation

10
Q

What needs to happen in order to conclude that full mediation has occurred?

A

The X-Y beta weight should be 0 (or at least non-significant) for full mediation

11
Q

When do researchers argue for partial mediation?

A

If the beta weight drops substantively but does not reach 0

The indirect effect can then be tested with a Sobel test
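The decision rules on the mediation cards can be sketched as a small Python function. This is only an illustration of the logic, not a statistical procedure: the function name, its arguments and the example beta weights are hypothetical, and the significance decisions are assumed to come from regressions run elsewhere.

```python
def classify_mediation(x_predicts_z, z_predicts_y, x_predicts_y,
                       beta_xy_alone, beta_xy_with_z, direct_still_significant):
    """Label a mediation result using the four requirements on the cards.

    All inputs are assumed to come from regressions run elsewhere;
    this helper only encodes the decision rules.
    """
    # Requirements 1-3: X predicts Z, Z predicts Y, X predicts Y.
    if not (x_predicts_z and z_predicts_y and x_predicts_y):
        return "no mediation"
    # Requirement 4: the X-Y beta weight must shrink once Z is included.
    if abs(beta_xy_with_z) >= abs(beta_xy_alone):
        return "no mediation"
    # Shrunk but still significant: possibly partial. Otherwise: full.
    if direct_still_significant:
        return "possibly partial mediation"
    return "full mediation"


# Hypothetical beta weights, chosen only for illustration.
print(classify_mediation(True, True, True, 0.39, 0.22, True))
```

With these made-up values the X-Y weight shrinks from .39 to .22 but stays significant, so the function returns "possibly partial mediation".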

12
Q

State the 6 steps that should be undertaken when conducting a path analysis

A

Specify the model

Model identification

Model estimation

Interpret model effects

Evaluate model fit

Modifying the model (examining alternative models)

(So If Emma Interviewed Everyone’s Mum)

13
Q

What are the advantages of running an SEM over multiple ordinary least squares regressions?

A

When testing a large number of effects, the analysis of multiple regressions can become very complex, whereas SEM uses maximum likelihood based methods to calculate the effects simultaneously.

Advantages of path analysis

  • simpler and quicker estimation of model effects
  • obtain global model fit indices
  • encourages researcher to specify causal relationships between variables beforehand
  • more direct and easier tests of alternative theoretical models and their fit
14
Q

The first step in path analysis is to specify the model, how should this be done?

A

Use theory and/or previous research, as well as logical relations between variables, to justify your path model

The path model can then be drawn using a path diagram

15
Q

In a path diagram what are typically represented by squares?

A

Observed variables

16
Q

In a path diagram what are typically represented by circles or ellipses?

A

Latent (unobserved) variables

17
Q

What do single-headed arrows represent in a path diagram?

A

Causal relationships between variables

18
Q

What is an exogenous variable?

A

These variables are considered IVs in the model

They have no specified predicted cause in the model, hence they have no single-headed arrow going into them

You can have multiple exogenous variables in the model; these are usually free to correlate with each other, although you can specify that they be uncorrelated (correlations between two or more exogenous variables are represented by a double-headed arrow between the variables)

19
Q

In a path diagram what does a double-headed arrow represent?

A

Correlations

20
Q

What are endogenous variables?

A

These variables are considered DVs in the model

They will have a directional arrow coming into them and may also have one or more directional arrows moving away if they are mediator variables

Basically these are downstream variables caused by exogenous variables

21
Q

Which variables typically have an error or disturbance term associated with them?

A

Endogenous variables

This reflects that there are also unmeasured and unspecified causal effects on these variables

These disturbance terms are usually modelled as latent variables, hence they are represented by circles

22
Q

What does a path model need to be in order to be analysed?

A

It needs to be identified

There need to be sufficient unique pieces of information (i.e. correlations in the observed data) to allow mathematical estimation of the model that has been specified.

Identification can become tricky when dealing with complex latent variables and non-recursive models, but there are some shorthand methods for checking identification in observed variable path models.

23
Q

What is the basic rule for model identification?

A

Maximum number of single connections between observed variables must equal or exceed the number of paths specified in the model

24
Q

What is the formula to calculate the maximum number of single connections between observed variables?

A

V*(V-1)/2

Where V = number of variables

25
Using the formula to calculate the maximum number of single connections between observed variables, what do you compare this number to in order to check model identification?
Count all of the model pathways (ignoring disturbance/error terms) and then compare these two numbers. The maximum number must equal or exceed the number of paths counted in the model.
26
There are three outcomes when checking model identification. Name and explain these, and state which outcomes enable models to then be estimated.
  • Over-identified model: more correlations than free paths in the model
  • Just-identified (saturated) model: the number of correlations equals the number of free paths in the model
  • Under-identified model: fewer correlations than free paths, so the model cannot be estimated
Only over-identified or just-identified models can be estimated.
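The counting rule for observed-variable path models can be sketched in Python as follows. It assumes the maximum number of single connections is V(V-1)/2 for V observed variables and that disturbance/error terms are not counted as paths; the function names and the example figures are illustrative only.

```python
def max_connections(n_variables):
    """Maximum number of single connections between V observed variables."""
    return n_variables * (n_variables - 1) // 2


def identification_status(n_variables, n_paths):
    """Return (status, degrees of freedom) for a simple path model.

    n_paths counts the model pathways, ignoring disturbance/error terms.
    """
    available = max_connections(n_variables)
    df = available - n_paths
    if df > 0:
        return "over-identified", df
    if df == 0:
        return "just-identified", df
    return "under-identified", df  # cannot be estimated


# Illustrative case: 5 observed variables allow 10 connections.
print(identification_status(5, 8))   # ('over-identified', 2)
print(identification_status(3, 3))   # ('just-identified', 0)
```

The difference between the two counts is also the model's degrees of freedom, so a negative value signals a model that needs to be re-specified.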
27
What is a recursive model?
This is a model where all causal pathways move in the same direction, i.e. effects are uni-directional. (This is the most common form of model and is always identified.)
28
What is a non-recursive model?
This is where there are reciprocal relationships between variables. These models are more complex to analyse, identification issues can be very problematic in complex non-recursive models, and they are not as common in the psychology literature.
29
After the model has been specified and constructed what happens?
Model estimation
30
There are two primary interests estimated by the model, what are they?
The direct and indirect effects between variables, and global model fit. In the context of regression these would be the regression coefficients for individual predictors and the test of overall regression model fit, i.e. the ANOVA for R squared.
31
Paths in models can be decomposed into what?
Direct and indirect effects (& error)
32
Explain direct paths
The path regression coefficients reflect direct relations between one variable and another (controlling for the effect of any other variable also affecting the endogenous variable). These are the same as the beta weights in normal MR (we can obtain these by simply running separate OLS regression models).
33
The number next to a path in a path diagram is what?
It is a standardised regression coefficient, i.e. the beta weight from a regression output
34
How do you interpret direct effect results from a path diagram?
You can test the significance of the (unstandardised) direct effects. However, you should consider the magnitude of direct effects, not just their significance (use past research as a guide, consider the substantive real-world meaning of effects, and use Cohen's rule of thumb: .1 = small, .3 = medium and .5 = large).
35
What are indirect effects in model estimation?
These are the effects of one variable on another variable via a mediator variable. In a standard one-mediator mediated regression there is one indirect effect: the effect of the IV on the DV via the mediator.
36
How do you calculate an indirect effect from a path diagram which shows the standardised regression coefficients?
By multiplying the constituent paths. This number is then compared with the direct effect pathway: the relationship between the two variables should shrink in the presence of the mediator. So the indirect path should be lower than the direct path.
37
Take two variables, neuroticism and depression. Say they have a direct pathway with a standardised regression coefficient of .22 and there is a mediator, avoid, where neuroticism to avoid = .34 and avoid to depression = .5. Calculate the indirect pathway, explain what these pathways show and how the indirect pathway can be interpreted.
.34 * .5 = .17 is the indirect path from neuroticism to depression. Neuroticism has a .34 direct effect on avoid, but only .5 of this is transmitted to depression via avoid. The indirect pathway means that there is an increase in depression of .17 SD units for every 1 SD unit increase in neuroticism via the effects of avoid.
38
What does total effects refer to in model estimation?
Total effects represent the total causal effect of one variable on another. This is calculated by summing all of the direct and indirect effects.
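The arithmetic for indirect and total effects can be checked with a few lines of Python. The coefficients are the ones from the neuroticism/avoid/depression card; the function and variable names are made up for this sketch.

```python
from math import prod


def indirect_effect(path_coefficients):
    """Indirect effect = product of the standardised paths along the chain."""
    return prod(path_coefficients)


direct = 0.22                             # neuroticism to depression
indirect = indirect_effect([0.34, 0.5])   # neuroticism to avoid to depression
total = direct + indirect                 # sum of direct and indirect effects

print(round(indirect, 2), round(total, 2))  # 0.17 0.39
```

With several mediators, each indirect route would be multiplied out separately and all of the products added to the direct effect.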
39
What are the two tracing rules? (NB given in the lecture slides)
  • You cannot enter and exit a variable on an arrowhead
  • You cannot enter a variable twice on the same trace
40
What is the primary goal of constructing a causal model?
To express relationships between variables in terms of direct and indirect effects, based on a causal model assumed to be correct (to qualify degrees of causality)
41
Discuss model fit statistics in the context of plausibility
A plausible model is constructed independently of analysis using non-statistical means. Model fit statistics can give some indication of model plausibility. NB correlation does not equal causation and we cannot determine causal direction statistically.
42
List 4 processes that can be used to help specify causal direction of paths (& thus construct a plausible model...prior to data collection)
  1. Time precedence
  2. Theory
  3. Previous research
  4. Logic/sound rationale
43
Kline (2005) discussed 4 empirical conditions that must be met to support causal inference. Discuss these.
  1. Relationship: X should be correlated with Y
  2. Temporal precedence: X must precede Y in time
  3. Non-spuriousness: the X-Y relationship should hold after controlling for other variables experimentally or statistically (e.g. the third variable issue)
  4. Correct effect priority: there are no reciprocal relationships between X and Y, or reversals of this relationship
44
How do you work out the degrees of freedom in a path model?
The difference between the full saturated model and the reduced model, e.g. if the full model could have 10 pathways and 8 were specified, DF = 2.
45
What is path analysis?
In its most basic form it is a simple extension of multiple regression
46
When would you use a disturbance term and an error term in SEM?
Disturbance terms point towards latent factors and error terms to measured variables
47
What is the major question asked by SEM?
Does the model produce an estimated population covariance matrix that is consistent with the sample (observed) covariance matrix? Basically, is the constrained model consistent with the saturated model?
48
What will the chi-square statistic and degrees of freedom be for a saturated model?
Both will be 0
49
If the chi-square statistic for the reduced (default) model is significant, what can you conclude?
Bad fit of the reduced model to the data
50
What is a problem with the chi-square test?
Sample size: with large samples, your model is likely to fit significantly worse even when differences in fit are substantively small.
51
What is the independence model?
It is a model that specifies that all of the relationships between the variables are 0, so it will always be a bad fit to the data. It is used because fit indices sometimes compare the default model to the independence model: 'how much better is it?'
52
Describe the standardised root mean square residual (SRMR) and say what a good fit would be
A residual correlation is the difference between a sample correlation and the implied correlation. The SRMR is based on the average absolute values of the residual correlations. An SRMR of zero would equal perfect fit (no residual). SRMR
53
Describe the root mean square error of approximation (RMSEA) and Browne and Cudeck's (1993) suggested fit levels
A popular fit measure, designed to assess the approximate fit of a model while rewarding parsimony. Of two models with similar explanatory power, the simpler model, i.e. the one with fewer paths (more DF), will be favoured.
54
Describe the goodness of fit index (GFI) and the Hu and Bentler (1999) guidelines for fit
A different approach to model fit. It compares the researcher's model with the independence model (the independence model predicts that all variables are independent, i.e. zero correlations). It is analogous to R squared: it estimates the total variance accounted for by our model. GFI > .95 = good fit; GFI > .90 = adequate fit.
55
What is the purpose of model fitting and what limitations does it have?
The purpose is to rule out bad models. The limitation is that it cannot prove a good model. Bad model fit means that the model doesn't explain the data as well as others might. Good model fit fails to disconfirm your model: you may have a good model, but 'fit' is with reference to the variables in your model. Alternative models with different specified paths might be even better, so it is still worth testing alternative models, and it may be that there is a more complete model (with more variables).
56
What is full SEM?
Extends observed variable path analysis by creating a latent variable measurement model, and then examining relationships between these latent variable factors
57
What is the two-step process for SEM?
Specify and estimate a candidate measurement model (aka confirmatory factor analysis). Once you have a viable measurement model, you re-specify the model as a structural model and examine the relationships between latent factors.
58
Confirmatory factor analysis (CFA) models test measurement models. How are they used and what do they do?
They are used to test theoretically derived models of psychological measures. They are often used in the development of psychological measures, after EFA (exploratory factor analysis) has been used to initially develop and refine the measure. Once we have an EFA-derived measure we can administer it to a new sample and see if we can confirm the original measurement model. This can tell us important information about how a measurement tool is structured and/or how latent factors relate to each other.
59
Discuss CFA vs EFA
The principles underlying CFA are largely the same as those in EFA. Before undertaking a CFA we should use the same assumption checks and data screening as for EFA. The typical difference between the two is that in CFA we constrain factor loadings (usually to be 0), i.e. we do not allow all observed items/indicators to load freely on all of the factors. So the CFA model is a more constrained version of the EFA model.
60
What is an indicator variable and how is it represented in SEM?
These are the measured (observed) variables, and they are represented by a square.
61
What do the factor loadings do?
They estimate the relationship between the factor and the observed indicator. They can be thought of as the correlation between the factor and the indicator in standard CFA models. We typically like these to be > .50.
62
What are the factor covariances in CFA?
They estimate the relationship between latent factors. We can use this information to examine the convergent and discriminant validity of the factors.
63
What are error terms in CFA?
These model variation in the indicator variable not accounted for by the factor, e.g. anything else that accounts for variance in the indicator variable (other influences and error). These error terms are usually uncorrelated with each other, but you could model error correlations if you expected that responses across indicators would be caused by something other than the factors, e.g. method effects.
64
Describe the 8 steps of designing a CFA model
  1. Refer to theory/previous research to ascertain an appropriate model
  2. Specify the model
  3. Model identification
  4. Model estimation
  5. Testing model fit
  6. Interpret model effects
  7. Modifying models
  8. Reporting results
65
When designing a CFA model you first need to specify the model. What should you fix the error terms to?
As you cannot know the variance of unmeasured variables, fix the error variances to 1 in model specification, or fix the raw error loadings to 1 (the AMOS default), which sets the error variance based on the indicator variance. This is important for model identification.
66
When designing a CFA model you first need to specify the model. What should you fix the factor variance to?
Factors are unmeasured so their variance is unknown. Fix the factor variance to 1, or set raw factor loadings to 1 (only one factor loading per factor needs to be set to 1). This is important for identification of the model.
67
What does CFA use to estimate unknown values, e.g. factor loadings, in the variance/covariance matrix?
Known values. Number of knowns = V*(V+1)/2, where V equals the number of variables.
68
How can you find out whether the model is identified?
Calculate the knowns, V*(V+1)/2, and the unknowns (count up the number of free paths and variances). Subtract the unknowns from the knowns to get the DF. If the model DF is greater than or equal to 0 then proceed, i.e. the model is identified. If not, you need to re-specify your model.
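The knowns-minus-unknowns check can be sketched in Python. The function names are illustrative, and the parameter counts in the examples assume standard CFA models with uncorrelated errors, each indicator loading on one factor, and factor variances fixed to 1.

```python
def cfa_degrees_of_freedom(n_indicators, n_free_parameters):
    """DF = knowns - unknowns, with knowns = V*(V+1)/2 for V indicators."""
    knowns = n_indicators * (n_indicators + 1) // 2
    return knowns - n_free_parameters


def is_identified(n_indicators, n_free_parameters):
    """The counting rule: proceed only if DF is 0 or greater."""
    return cfa_degrees_of_freedom(n_indicators, n_free_parameters) >= 0


# One factor, 3 indicators: 3 loadings + 3 error variances = 6 unknowns.
print(cfa_degrees_of_freedom(3, 6))    # 0

# Two factors, 3 indicators each: 6 loadings + 6 error variances
# + 1 factor covariance = 13 unknowns.
print(cfa_degrees_of_freedom(6, 13))   # 8
```

The single-factor case comes out at exactly 0 DF, which fits the heuristic that one factor with 3 indicators is identified.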
69
There is a simple heuristic for standard CFA models, e.g. models with uncorrelated error terms and where each indicator loads on just one factor. What is it?
If a model with a single factor has 3 or more indicators it will be identified. If a model with 2 or more factors has 2 or more indicators per factor it will be identified.
70
In CFA what are we typically looking to do? (Model estimation)
Estimate model parameters, e.g. factor loadings and factor covariances. Test global model fit. (We can also then compare the fit of competing measurement models, specify alternative models, etc.)
71
In a CFA model what rule of thumb is used to suggest that two factors may be redundant?
If the factor correlations are > .75-.80 then this may suggest that the model is 'over-factored' or that one of the factors is redundant. A more plausible model might involve collapsing the factors into one and re-estimating (this is where you would also need to rely on what theory and previous research suggest).
72
In a CFA model, if the factor loading on an indicator is low, what may this suggest?
That you should possibly remove this indicator from your measure in the future (i.e. if, in a questionnaire, your factor does not load highly on item 6, maybe this item is not really tapping into the factor that you want, so remove it as it just adds noise to your data).
73
In CFA what model fit indices should you use to evaluate the global model fit?
The same as in path analysis:
  • Residual correlations (sample correlations minus implied correlations): sample correlations are the observed correlations; implied correlations are calculated from the model loadings. Smaller residual correlations = better fitting model; larger specific residual correlations may indicate that part of the model is misspecified.
  • Chi-square: examines the fit of an individual model by comparing the model with the observed data, so we want a non-significant chi-square value, i.e. no significant difference between model and data. We can also directly test differences between the chi-squares of nested (hierarchical) models, using the difference between the model DFs to look up the critical chi-square value.
  • RMSEA and GFI as well
  • SRMR: the average absolute value of the residual correlations, so the closer to 0 the closer to perfect fit. SRMR
74
Where do you look in Amos to check the chi-square and GFI for a CFA?
Check the default model in Amos (non significant = good)
75
What does the RMSEA need to be to be a good fit?
Very close to 0
76
Explain model building in CFA context
Start with a bare bones model and then add path(s). If extra paths significantly improve fit, these are added to the model.
77
Explain model trimming in the context of CFA
Typically start with a saturated model and simplify it by eliminating paths. If the model fit does not significantly deteriorate then paths can be removed (the model is no worse but simpler, i.e. more parsimonious).
78
Discuss a model building example
Calculate chi-square for the first model and then the second. Calculate the difference between the chi-square statistics for the two models. If the chi-square difference is significant then the model is significantly improved by adding paths, and these can be retained in your refined model. NB when checking if the chi-square difference is significant, you use the difference between the models' DF to look up the critical value in the table.
79
What are modification indices (MI)?
MIs can be used to add individual paths to the model. These are an output from Amos. The larger the MI, the greater the improvement in model fit. The usual convention is that MI > 4 suggests an improvement in model fit and that the path should be added.
80
If you have 2 models, one is saturated so has 0 DF and the second model has a DF of 2. The difference between the chi-square statistics for the 2 models is 4.03. Look up in a table the chi-square for DF 2 and you get a critical value of 5.99. Is the new model a better fit to the data or not?
Yes. As 4.03 is smaller than 5.99, there is not a significant difference, so the new model does not have a significantly worse fit than the saturated model; as it is more parsimonious, it is accepted.
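The table look-up in this example can be reproduced with a short Python sketch. It uses a closed-form expression for the chi-square upper-tail probability that only holds for even degrees of freedom, which covers the DF = 2 case here; the function name is made up, and a statistics package would normally be used for the general case.

```python
import math


def chi_square_p_value_even_df(statistic, df):
    """Upper-tail p-value of a chi-square statistic, valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers positive, even df")
    half = statistic / 2.0
    term, series = 1.0, 0.0
    # Sum (x/2)^k / k! for k = 0 .. df/2 - 1, building each term from the last.
    for k in range(df // 2):
        series += term
        term *= half / (k + 1)
    return math.exp(-half) * series


# Figures from the card: chi-square difference of 4.03 on 2 DF.
p_value = chi_square_p_value_even_df(4.03, 2)
trimmed_model_fits_worse = p_value < 0.05
print(round(p_value, 3), trimmed_model_fits_worse)
```

The p-value comes out at roughly .13, above .05, which agrees with 4.03 falling below the critical value of 5.99: the more parsimonious model is not significantly worse.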
81
To use a chi-square test for differences between models the models must be nested, what model indices do you use if the models are not nested?
AIC and BIC
82
When model trimming/building is guided by theoretical a priori considerations what is this approach called?
Theoretical approach
83
Explain the empirical approach of respecification
Paths are added or deleted from the model purely based on statistical criteria. In model building, MIs for all paths are examined to see which ones significantly improve the model. This can capitalise on chance correlations. This type of SEM is more exploratory. The credibility of the model improves if the model structure is replicated in another sample.
84
Describe the extension to SEM that looks at multiple-group SEM
Tests an SEM across a categorical variable, e.g. gender. We might want to look at model estimates in different groups, or see whether a particular model holds across groups, i.e. it is invariant across groups. This can be done for a CFA model or a full SEM. It uses the principle of iteratively constraining parameters in the model to equality across the groups (implying they are the same in each group), and then looking to see if this produces a significant decrement in model fit. If a significant decrease in model fit occurs, you then have to identify which parameters have caused this problem, i.e. you can iteratively free parameters to identify the source of the misfit.
85
Multiple-group SEM is often undertaken across a series of steps, what are these?
Estimate the model simultaneously in the groups, freely estimating all of the model parameters. This is often referred to as a test of configural invariance. If the above model shows good fit, you could then test a further model that constrains the factor loadings and/or path coefficients to equality across the groups. If this model shows good fit, you can then assume the model parameters are consistent across groups. If not, you can iteratively free paths to diagnose the ill-fit and establish what is referred to as partial invariance.
86
What are the assumptions for SEM?
The assumptions that apply to correlation/regression:
  • Linearity (dependent (endogenous) variables should be linearly related to IVs (exogenous))
  • SEM programmes can handle continuous and categorical variables, but check the coding of categorical variables and make sure the programme knows what codes are being used
  • Normality (residuals should be normally distributed and homoscedastic)
  • Disturbances uncorrelated with endogenous variables
  • No multicollinearity
  • Exogenous variables are reliably measured
Additional assumptions:
  • Identification (models cannot be under-identified)
  • Adequate sample size (Kline recommends at least 10 times as many cases as parameters (paths), ideally 20)
  • Proper model specification (specification errors occur when common causal variables are left out of the model)