module 9 Path analysis Flashcards
(33 cards)
collab (comparing cfa and efa)
HOW IS FACTORIAL ANALYSIS USED?
-provide evidence for the existence of psychological constructs & ways to measure them. eg scale development
-provide psychometric info on validity of psychological measures (esp Factorial validity)
HOW DOES FACTOR ANALYSIS WORK?
-Factor analysis looks for patterns of relationships b/n continuous variables. eg some items may not clump together, so maybe we should get rid of them or place them on a different Factor etc.
-the relationships are assessed using correlation coefficients (r)
-correlations are a measure of how much 2 variables vary together (shared variance)
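A minimal Python sketch of the shared-variance idea, using made-up scores on two items (the data and the function name are assumptions for illustration only). Squaring r gives the proportion of variance the two items share.

```python
def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of the deviations from the means
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / (ss_x * ss_y) ** 0.5


# Hypothetical responses of five people to two items (not real data)
item_a = [1, 2, 3, 4, 5]
item_b = [2, 1, 4, 3, 5]

r = pearson_r(item_a, item_b)
shared_variance = r ** 2  # proportion of variance the two items have in common
print(f"r = {r:.2f}, shared variance = {shared_variance:.2f}")
```

With these made-up scores r is .80, so the two items share about 64% of their variance.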
EXPLORATORY FACTOR ANALYSIS
-aims to reduce variables/items through extracting factors from the data, and revealing underlying patterns of the items across factors.
-the researcher/statistician actively obtains simple structure through various statistically-driven rules.
-uses SPSS, eigenvalues, scree plots, extraction methods etc.
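A rough Python sketch of what an eigenvalue is telling us, assuming a made-up correlation matrix for three items. It uses simple power iteration to find the largest eigenvalue only. This is an illustration of the idea, not the extraction procedure SPSS actually runs.

```python
def largest_eigenvalue(matrix, iterations=200):
    """Approximate the largest eigenvalue of a symmetric matrix by power iteration."""
    n = len(matrix)
    vector = [1.0] * n
    eigenvalue = 0.0
    for _ in range(iterations):
        # Multiply the matrix by the current vector, then rescale to unit length
        product = [sum(matrix[i][j] * vector[j] for j in range(n)) for i in range(n)]
        norm = sum(value * value for value in product) ** 0.5
        vector = [value / norm for value in product]
        # Rayleigh quotient v'Rv (v has unit length) estimates the eigenvalue
        eigenvalue = sum(
            vector[i] * matrix[i][j] * vector[j] for i in range(n) for j in range(n)
        )
    return eigenvalue


# Hypothetical correlations among three items (not from the cards)
R = [
    [1.0, 0.5, 0.5],
    [0.5, 1.0, 0.5],
    [0.5, 0.5, 1.0],
]

first = largest_eigenvalue(R)
# Eigenvalues of a correlation matrix sum to the number of items,
# so dividing by the number of items gives the proportion of variance.
proportion = first / len(R)
print(f"first eigenvalue = {first:.2f}, proportion of variance = {proportion:.2f}")
```

Here the first eigenvalue is 2.0 out of a possible 3, ie about two-thirds of the total variance, which is why one factor would be a sensible summary of these three made-up items.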
collab2
EFA lets us see how many factors and which items load on what factor.
Rotation is kind of like fine tuning spectacles in terms of does the item fit best on this factor or another factor.
CFA;
-we specify where each item should load according to theory (ie may well have already done an EFA)
-confirms factorial validity in a new population, setting, or purpose.
-is one type of Structural Equation Model (SEM) (Path analysis is another type of SEM)
-uses AMOS
collab3
BASIC DIFFERENCES BETWEEN CFA & EFA
1. Efa;
-there are no specifications given as to which item should load on which factors
-used when we are exploring how many factors are present in our data
-statistically driven
-continuous variables (no DV’s)
-efa finds the amount of shared variance in all variables first and defines this as the first factor. Hence the eigenvalue or amount of shared variance in the first factor is often the largest
2. Cfa;
-there are direct specifications given as to which items should load onto which factors and;
-how each of the factors should be correlated/uncorrelated
-theoretically driven
-in cfa, all items only load on their specified factor, not for all factors as a comparison
-in cfa, this overlap (cross loading) is not allowed for
-in cfa, shared variance of items is only for each factor.
collab4
AMOS
In CFA circles represent unobserved or latent variables (in cfa these are our proposed Factors), and squares represent measured or observed variables.
Double-headed arrows represent correlation (r)
One-way arrows are similar to factor loadings (the factor predicts the item, so the arrow points from the circle to the square). In Amos they are called regression weights, and we use standardised regression weights (SRW) to represent the strength of the path, usually b/n 0 and 1 in absolute value (similar to factor loadings in efa).
If you fail to draw an arrow where you want one, AMOS will not estimate that path.
In CFA, items with an SRW below .50 are considered unacceptable (not loading strongly enough). (In efa, often accept .30 or above.)
collab 5
Three asterisks (***) denote p < .001, and all are sig at that level.
Do not get C.R. for efa, only for cfa.
Tend to look at the standardised regression weights more than the unstandardised. The SRW give the important info: how much does the item load on the factor (want to be > .5).
collab6
s11r SRW=.63 from previous slide
collab7
In Efa, a communality is the amount of variance in an item that is captured by the factors, and generally want those values to be above .20. Because Efa is exploratory, the communality is adding up all the variance that every factor can explain in the item, big or small.
But in CFA, the SMC’s are telling us a similar thing, how much variance in an item is explained by the factor. In this case it is not all factors, but only one factor (as we told the CFA this item belongs on just the one factor).
Ultimately in both cases, the bigger the value, the better. In CFA, want the values above .25 (this links in with an SRW of .50 or more, so 25% of the item variance is explained by the factor we reckon it belongs to).
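The link between the .50 and .25 cut-offs can be sketched in Python. When an item loads on one factor only, its SMC is its SRW squared. The function name is an assumption for illustration.

```python
def squared_multiple_correlation(srw):
    """SMC for an item that loads on one factor only: the SRW squared."""
    return srw ** 2


# .50 is the cut-off from the cards; .63 is the SRW noted for item s11r
for srw in (0.50, 0.63):
    smc = squared_multiple_correlation(srw)
    print(f"SRW = {srw:.2f} -> SMC = {smc:.2f} ({smc:.0%} of item variance explained)")
```

So an SRW of .50 sits exactly on the .25 line, and the s11r value of .63 corresponds to roughly 40% of that item's variance being explained by its factor.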
collab8
collab9
So are our factors correlated? No, because r = -.02, p = .812, which is not sig. So the factors could be called orthogonal as they are not correlated.
collab10
Because the chi-squared statistic is often sig when we wish it were not, there are many other ways of assessing model fit. AMOS also gives numerous Modification Indices, which suggest extra paths that may help to reduce the chi-squared value, eg adding a path connecting an item to a different factor (ie allowing cross-loading). The larger the chi-squared value is, the more likely it is to be sig. Each modification index says by roughly how much the chi-squared value would drop if we made the suggested modification.
collab11
The model with the lower chi-squared statistic is the better fit.
further amos you tube video
Exogenous variable has no predictors associated with it.
Endogenous variables- are predicted due to upstream variables.
Can also draw eg mediation/moderation models etc.
If have missing data, must click on estimate means & intercepts. (If no missing data, don’t necessarily need to click on this.)
Have many options, eg can get indirect and direct effects, normality tests, modification indices etc.
If get a p value of ***, it actually means p<.001
Be aware that if are shown some possible model improvements, they should make theoretical sense!
Path analysis
Path analysis is a way of constructing mediation models, which you have already examined. It is another type of structural equation modelling that assesses paths between variables, and it is all about prediction. Similar research questions to those that you can try answering with multiple regression would be answered by using a path analysis; however, there will be more research questions you can answer when you use path analysis.
Path analysis is also about looking at more complex models with more than one dependent variable and lots of different pathways connecting variables to each other, hence the name. A path is similar to a regression equation with beta weights. In Amos, you get regression weights. They have the same name as the factor loadings last week in confirmatory factor analysis.
You can test this in SPSS; however, the downside in SPSS is that you have to do many different equations, and you can’t do everything all at the same time. With path analysis in the structural equation modelling framework and in packages like Amos, options are limitless.
Below is the path analysis which seeks to answer;
1. Is post-natal depression a result of parental stress?
2. Is parent stress a result of having a difficult child and stressful life events?
path analysis 2
What you are doing is testing a path, which is just like testing a theory about a process. What leads to what, and then what leads to what else? These are the questions you are trying to answer with path analysis.
Here is a second path model to analyse. In this second model, you will notice that there is a direct relationship, with a beta weight assigned, between having a difficult child and having post-natal depression.
The main question of this second model is: does a difficult child influence their mother’s level of post-natal depression, regardless of how stressed the mother feels?
Thus, path analysis allows you to test different research questions with the same variables. Using the models shown on this page, you can also test other types of research questions, such as:
Does a difficult child influence their mother’s level of post-natal depression, independent of other factors? This would be represented by the direct connection between a difficult child and post-natal depression shown in ‘Path model 2’.
Do life events and the level of parent stress contribute directly or indirectly to post-natal depression?
Here you can investigate two separate pathways to post-natal depression. One pathway would be having a more difficult child, which increases the severity of post-natal depression, regardless of how much stress is felt as a parent and in life events.
There is another pathway in which a stressful life leads to feeling more stressed as a parent, which will also lead to an increase in the severity of post-natal depression. Path analysis allows you to test different theories and different pathways towards different variables.
path analysis 3
In ‘Path model 3’, another example might be whether a difficult child influences their mother’s level of depression because they make parenting more stressful, which is a classic mediation hypothesis.
In this model, the hypothesis is that having a difficult child leads to higher severity of post-natal depression because having a difficult child increases the stress in the parenting role. In other words, parenting stress mediates the relationship between having a difficult child and post-natal depression.
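A minimal Python sketch of how the standardised paths in a simple mediation model (X -> M -> Y, plus a direct X -> Y path) follow from the three correlations. The correlations below are made up for illustration and are not taken from the path models in these cards. Amos estimates the paths for you, so this is only to show the logic.

```python
def mediation_paths(r_xm, r_xy, r_my):
    """Standardised paths for X -> M -> Y with a direct X -> Y path."""
    a = r_xm  # X -> M: with a single predictor, beta equals the correlation
    denominator = 1 - r_xm ** 2
    b = (r_my - r_xy * r_xm) / denominator        # M -> Y, controlling for X
    c_prime = (r_xy - r_my * r_xm) / denominator  # X -> Y, controlling for M
    return a, b, c_prime


# Hypothetical correlations: X = difficult child, M = parent stress, Y = post-natal depression
r_xm, r_xy, r_my = 0.30, 0.35, 0.50

a, b, c_prime = mediation_paths(r_xm, r_xy, r_my)
print(f"a = {a:.2f}, b = {b:.2f}, direct (c') = {c_prime:.2f}")
print(f"indirect (a x b) = {a * b:.2f}, total = {a * b + c_prime:.2f}")
```

In this three-variable model the total effect (indirect plus direct) comes back to the plain correlation between X and Y, which is a handy check on the arithmetic.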
indirect and direct pathways
Another main aim of path analysis is to look at direct and indirect pathways.
DIRECT EFFECTS
The values directly connecting two variables are known as direct effects. For example, in ‘Path model 3’, parent stress and difficult child each have a direct effect on post-natal depression. Beta weights are given for each path.
INDIRECT EFFECTS
You can find out indirect effects by multiplying the beta weights for each relevant path. In ‘Path model 3’, the indirect effect of a difficult child, through stress, on post-natal depression is calculated by multiplying the beta weights for difficult child and parent stress (i.e. 0.33 x 0.48 = 0.16). You can also calculate the total effect of having a difficult child on post-natal depression by adding the indirect and direct effects together (i.e. 0.16 + 0.22 = 0.38).
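The arithmetic above can be sketched in Python. The three beta weights are the ones quoted for ‘Path model 3’. The function names are assumptions for illustration.

```python
def indirect_effect(*betas):
    """Multiply the beta weights along one indirect pathway."""
    product = 1.0
    for beta in betas:
        product *= beta
    return product


def total_effect(direct, indirect):
    """Total effect = direct effect + indirect effect."""
    return direct + indirect


indirect = indirect_effect(0.33, 0.48)  # the two paths that run through parent stress
total = total_effect(0.22, indirect)    # 0.22 is the direct path quoted in the cards
print(f"indirect = {indirect:.2f}, total = {total:.2f}")
```

Because `indirect_effect` accepts any number of betas, the same helper would work for a longer chain with more than one mediator.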
Remember that you cannot test actual causality unless your research design lends itself to such a conclusion.
structural equation modelling
In structural equation modelling (SEM) programs, you can simultaneously test a number of hypotheses relating to the measurement of observed variables and the relationships between unobserved or observed theoretical constructs. SEM programs consist of two major components:
1. Measurement model (CFA): A measure of how the observed measures are generated by the theoretical constructs (or latent variables).
2. Structural model (path analysis): A test of the hypothesised relationships between our constructs (measured by observed or unobserved variables).
There are several benefits of using SEM to calculate the values of beta:
-You can get everything from SPSS multiple regression and more.
-You get a measure of the overall fit of the model to the data.
-You can test various path models to see which best fits your data.
-You can also test the same model across different groups.
-The order of the variables does not matter, as SEM calculates the values simultaneously.
-You can test relationships between error variables.
sem 2
eg how many prescription drugs do people take?
We have measured their attitude to prescription drugs. The model may not be great conceptually, but is just an eg.
identification of models
Sometimes models will not run, for a variety of reasons. The main reason is that a model is “not identified”. The example below shows “1’s”. These are necessary for the model to be identified.
In order to empirically evaluate a model using SEM (ie calculate the estimates of a model), the model needs to be identified. In other words, there needs to be enough information in the data to estimate every parameter. One of the requirements for identification is that the scale of every error variable and of each latent variable has to be set, usually by fixing a path (or the variance) at 1.
Identification of models2
Each error variable needs a scale (the path from the error to its observed variable is fixed at 1), and each factor needs a scale (at least one path from the factor to an observed variable is fixed at 1). That is what the “1’s” are below. This puts the unobserved variable on the same scale as the observed one. We must set this because we don’t actually know how the unobserved variable is measured. Without the 1’s, it will not work!
Also, need to have enough data for the model to be able to work. Need to have, at the bare minimum, more data points than distinct parameters to be estimated.
identification of models 3
All the covariances and the variances are data points.
identification of models 4
What is a parameter? Anything that needs to be calculated, eg:
Correlations
SRW’s,
Value of the variances of the latent variables etc
Below shows all the parameters to be calculated (ie each circle and each arrow). Technically, if we have assigned a “1” on the scales, these do not need calculating (they will be labelled as “fixed” in the parameter summary output).
ie Small models are better (= fewer parameters to be calculated).
“Unlabelled”=number of one way arrows that are not fixed to 1, in the parameter summary.
“Distinct parameter” is any parameter that does not have the 1 label on it.
ie Need more data points than distinct parameters.
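The counting rule can be sketched in Python. With p observed variables there are p(p + 1)/2 variances and covariances, ie data points (this assumes means and intercepts are not being estimated). The example counts below are hypothetical and are not read off the Amos output in these cards.

```python
def data_points(n_observed):
    """Number of variances and covariances among the observed variables."""
    return n_observed * (n_observed + 1) // 2


def degrees_of_freedom(n_observed, n_distinct_parameters):
    """Data points minus distinct parameters to be estimated."""
    return data_points(n_observed) - n_distinct_parameters


n_observed = 4    # hypothetical: four observed variables
n_parameters = 8  # hypothetical: eight distinct (not fixed) parameters

df = degrees_of_freedom(n_observed, n_parameters)
print(f"data points = {data_points(n_observed)}, distinct parameters = {n_parameters}, df = {df}")
# A positive df means there are more data points than distinct parameters
print("enough data points" if df > 0 else "not enough data points")
```

Each parameter fixed at 1 is left out of the distinct-parameter count, so fixing scales and keeping the model small both push the df up.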
worked eg of path analysis
Shows standardised regression weights; the squared multiple correlations are shown in blue.
So Physical health and Stress are only accounting for approx 2% of the variation in Attitude to drugs. And Attitude is explaining approx 9% of the variation in Drug use.
worked eg path analysis 2
Shows Goodness of Fit. Big chi square number and highly sig (not good).
But the chi-squared statistic is sensitive to sample size and is often sig in large samples. It is more often sig than not. The model and the data might be very similar overall (this is what chi squared is assessing) but differ in one small area, and this is often sufficient to make the chi squared sig. Not recommended to use only the chi-squared stat to say if a model is a good fit or not.
Various (50+) comparative Goodness of Fit Indices have been developed to try to overcome this.
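A rough Python sketch of why chi squared is so sensitive to sample size. Under the usual maximum-likelihood set-up the statistic is (N - 1) times the minimised discrepancy between model and data, so the same small misfit gives a bigger chi squared as N grows. The discrepancy value and the choice of 2 degrees of freedom are assumptions for illustration (with exactly 2 df the p value has the simple closed form exp(-chi squared / 2)).

```python
import math


def chi_square_statistic(f_min, n):
    """Model chi squared from the minimised discrepancy and the sample size."""
    return (n - 1) * f_min


def p_value_df2(chi_square):
    """Exact upper-tail p value for a chi-squared variable with 2 degrees of freedom."""
    return math.exp(-chi_square / 2)


F_MIN = 0.02  # hypothetical: the same small amount of misfit in every sample

for n in (100, 500, 2000):
    chi_square = chi_square_statistic(F_MIN, n)
    p = p_value_df2(chi_square)
    verdict = "sig" if p < 0.05 else "not sig"
    print(f"N = {n}: chi squared = {chi_square:.2f}, p = {p:.4f} ({verdict})")
```

The misfit is identical in all three runs; only N changes, yet the result goes from not sig to highly sig.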