FA & PCA Flashcards

Flashcards in FA & PCA Deck (50)
1
Q

What is exploratory FA?

A

Explaining the relationships between observed variables with a smaller number of factors: latent factors that theoretically underpin the data. Exploring.

Exploratory FA is an exploratory technique: decision making during the analysis is guided by the outcome of the analysis. It reduces the number of variables, explores the number of factors in the dataset, and helps discover the meaning and importance of factors by examining the variance in the observed variables that they account for.

2
Q

What is PCA?

A

Not theoretically driven: PCA just reduces the data down to a smaller number of components. It doesn't care about latent factors and does not assume there is something latent out there driving the relationships; it just reduces the data down to a simpler, lower-dimensional set of scores. E.g. there are too many variables to satisfy the assumptions of a regression (too many predictors, or highly correlated ones), so PCA is used to reduce them down to a set of uncorrelated components. Some people label the components, but remember: they are not latent factors!

3
Q

What does having more items or variables in FA achieve?

A

More items on a questionnaire is better: the more times you measure the same thing, the more accurate and reliable the measurement gets.

4
Q

What is confirmatory FA?

A

An explicit, theory-driven model is tested by placing various constraints on the relationships between observed variables and latent factors, and between the latent factors themselves. You can derive estimates of particular parameters and of overall model fit (e.g. SEM in AMOS). Uses: reducing the number of variables, theory testing, and testing the replicability of factors across time.

5
Q

How does exploratory FA work, generally?

A

It takes a correlation matrix and uses the observed relationships between variables to derive the factors that best explain the observed variance. Loadings link items to factors; the strength of the loadings determines the factors.
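A minimal numpy sketch of the iterative idea (the 4-item correlation matrix, the one-factor choice, and the tolerance are assumptions for illustration, not from the deck):

```python
import numpy as np

# Hypothetical correlation matrix for 4 items (assumed for illustration)
R = np.array([[1.00, 0.60, 0.55, 0.10],
              [0.60, 1.00, 0.50, 0.15],
              [0.55, 0.50, 1.00, 0.05],
              [0.10, 0.15, 0.05, 1.00]])

n_factors = 1
# Initial communality estimates: squared multiple correlations (SMCs)
h2 = 1.0 - 1.0 / np.diag(np.linalg.inv(R))

# Iterated principal axis factoring: eigendecompose the "reduced"
# correlation matrix (communalities on the diagonal), then update
# the communalities from the loadings until they stabilise.
for _ in range(100):
    R_reduced = R.copy()
    np.fill_diagonal(R_reduced, h2)
    eigvals, eigvecs = np.linalg.eigh(R_reduced)
    idx = np.argsort(eigvals)[::-1][:n_factors]   # largest eigenvalues first
    loadings = eigvecs[:, idx] * np.sqrt(np.maximum(eigvals[idx], 0))
    new_h2 = (loadings ** 2).sum(axis=1)
    if np.allclose(new_h2, h2, atol=1e-6):
        break
    h2 = new_h2

print("Factor loadings:\n", loadings)
print("Communalities:", h2)
```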

6
Q

How does labelling factors work in FA?

A

Choose clear labels for the factors. FA relies on the decision making of the researcher, plus pragmatism and knowledge of the research area. FA is sometimes called a 'last ditch attempt' to save a dataset.

7
Q

What is coverage?

A

Item coverage: if you don't have proper coverage of all topics (e.g. the Big Three personality factors, later expanded to five), then you can't claim to have comprehensive coverage. Garbage in, garbage out.

8
Q

What are the three types of variance in FA/PCA?

A

In FA, items are influenced by variation in various things across people. Three things, to be exact. What is it that is causing people to vary?
1) Shared variance between items.
2) Unique variance for one item (variance that is stable over time, meaningful, and independent of the other items).
3) Error variance (random in the ideal case; your score will vary across occasions of measurement).
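In standard notation (a summary sketch using conventional symbols, not taken from the deck), for a standardised item \(x_i\):

```latex
\operatorname{Var}(x_i) \;=\; 1 \;=\; \underbrace{h_i^2}_{\text{shared (communality)}}
\;+\; \underbrace{s_i^2}_{\text{specific (unique)}}
\;+\; \underbrace{e_i^2}_{\text{error}},
\qquad u_i^2 = s_i^2 + e_i^2 \ \text{(the uniqueness)}.
```

FA models only \(h_i^2\) (see cards 9 and 10).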

9
Q

What variance does FA measure and use?

A

FA measures the SHARED variance, keeping error and unique variance to a minimum.

10
Q

What variance does PCA measure and use?

A

PCA tries to account for ALL the variance (1s on the diagonal of the correlation matrix).

11
Q

What are the technical differences between PCA and FA in regard to the use of variance?

A

This causes a technical difference in how the analysis is set up. FA starts an iterative process: for each factor, multiple regression is used to estimate the shared variance of the items on that factor. FA replaces the 1s on the diagonal (each variable's correlation with itself) with estimates of that shared variance, then iteratively improves the solution. PCA uses the correlation matrix as it is. You can tell which is being used from the initial communalities in the output.

12
Q

Where are the squared multiple correlations?

A

The squared multiple correlations (SMCs) are placed on the diagonal of the correlation matrix (as the initial communality estimates in FA).
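A short numpy sketch of where the SMCs come from (the identity SMC_i = 1 - 1/[R^-1]_ii is a standard result; the 3-variable matrix is hypothetical):

```python
import numpy as np

def squared_multiple_correlations(R: np.ndarray) -> np.ndarray:
    """SMC of each variable regressed on all the others.

    Uses the standard identity SMC_i = 1 - 1 / [R^-1]_ii,
    which assumes R is a full-rank correlation matrix.
    """
    return 1.0 - 1.0 / np.diag(np.linalg.inv(R))

# Example with a hypothetical 3-variable correlation matrix
R = np.array([[1.0, 0.5, 0.4],
              [0.5, 1.0, 0.3],
              [0.4, 0.3, 1.0]])
print(squared_multiple_correlations(R))
```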

13
Q

What is PCA for, generally?

A

PCA is for reducing data and extracting linear composites from it for further analysis.

14
Q

What does FA assume about the relationship between factors and variance?

A

FA assumes that the factor causes the variation in the observed items.

15
Q

What does PCA assume about the relationship between components and variance?

A

PCA is not causal; the relationship goes the other way round. A PCA component is driven by the items: it is a representation of the items, an aggregate of them in a weighted fashion, not something that causes them.

16
Q

What practical issues are there to consider?

A

1) sample size
2) missing data
3) normality
4) outliers
5) Multicollinearity
6) FACTORISABILITY
7) EXTRACTION
8) ROTATION

17
Q

What is the rule of thumb for sample size in FA/PCA?

A

There are only rules of thumb; generally, large samples are needed. Comrey and Lee (1992) suggest: 50 = very poor, 100 = poor, 200 = fair, 300 = good, 500 = very good, >1000 = excellent. Smaller N is OK if there are high-loading markers on the factors.
Some suggest a participant-to-variable ratio instead: Nunnally 10:1, Guilford 2:1; Barrett & Kline find 2:1 replicates the structure, while 3:1 is better.

18
Q

How do we deal with missing data?

A

Imputation methods tend to overfit the data: they essentially create data based on the data that is there, which will inflate relationships and push up correlations. With large samples, a small amount of missing data is probably OK, and casewise/listwise deletion (distinguished below) might be acceptable.

19
Q

What are the issues with outliers, and ways of dealing with them?

A

Bad news. A single outlying point can have a massive effect on a correlation (think of the ellipse of the scatter), so screen them out. Recall leverage, influence, etc. Ways of dealing with them (see the sketch after this list):

– check scatterplots / histograms
– Mahalanobis distance
– dummy code them out and delete
– transform the variable
– winsorize
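A numpy/scipy sketch of Mahalanobis-distance screening (the data, seed, and the p = .001 cut-off are assumptions for illustration):

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))          # hypothetical data: 200 cases, 4 variables

mean = X.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(X, rowvar=False))
diff = X - mean
# Squared Mahalanobis distance of each case from the centroid
d2 = np.einsum('ij,jk,ik->i', diff, cov_inv, diff)

# Under multivariate normality, d2 ~ chi-squared with df = n_variables,
# so a common screening cut-off is the chi-squared critical value at p = .001.
cutoff = chi2.ppf(0.999, df=X.shape[1])
outliers = np.where(d2 > cutoff)[0]
print("Flagged cases:", outliers)
```
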
20
Q

What is listwise deletion?

A

Listwise exclusion of cases means that, given the set of variables to be used in the PCA or FA, a case's data are included only if the case contributes a data point for every variable. If the case has missing data on any variable, the case is deleted. The alternative would be to estimate the correlation matrix (for FA or PCA) based on the maximum amount of data available for each variable concerned.

21
Q

What is casewise deletion?

A

The default way of deleting missing data while calculating a correlation matrix is to exclude all cases that have missing data in at least one of the selected variables; that is, by casewise deletion of missing data. Only this way will you get a “true” correlation matrix, where all correlations are obtained from the same set of observations.

22
Q

What is multicollinearity, and what are its issues in FA?

A

Any technique that requires inverting the correlation matrix (such as FA or multiple regression) becomes problematic if the correlations between sets of variables are too large. We want the variables to correlate with one another, but not too strongly: with multicollinearity the matrix becomes (near-)singular, inverting it amounts to dividing by values near zero, and the FA cannot be done.

This is a multivariate issue: if some of your variables predict another variable very well (a multiple R of about .90), there is a problem.

23
Q

What are communalities?

A

They represent the amount of overlap between the variables, estimated using squared multiple correlations (i.e. each variable regressed on all the other variables as predictors, like a multiple regression); values > .90 are problematic.
Check the 'initial' column of the output: in FA these are the SMCs, while in PCA they are all set to 1. This is another way to identify from the output which analysis was run.

24
Q

what is factorisability ? and how to test

A

Correlations in the matrix need to be > .30.

– Bartlett's test of sphericity: not that helpful, as it is sensitive to sample size; it identifies significantly large correlations, but little more.

– Anti-image correlation matrix: contains the negatives of the partial correlations between pairs of variables, with the effects of the other variables removed. Look for small values in the off-diagonal elements: there shouldn't be much correlation left after partialling out the other variables and factors.

– KMO (Kaiser-Meyer-Olkin): an aggregate measure for the whole correlation matrix, expressing the size of the anti-image correlations scaled between 0 and 1. It should be above 0.6; .8-.9 is good. (See the sketch below.)
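A minimal numpy/scipy sketch of the two tests using the textbook formulas (the example matrix and n are hypothetical; SPSS output may differ slightly):

```python
import numpy as np
from scipy.stats import chi2

def bartlett_sphericity(R: np.ndarray, n: int):
    """Bartlett's test that R is an identity matrix (i.e. no factors).

    R: correlation matrix (p x p); n: number of cases.
    Returns the chi-squared statistic and its p-value.
    """
    p = R.shape[0]
    stat = -(n - 1 - (2 * p + 5) / 6.0) * np.log(np.linalg.det(R))
    df = p * (p - 1) / 2
    return stat, chi2.sf(stat, df)

def kmo(R: np.ndarray) -> float:
    """Kaiser-Meyer-Olkin measure of sampling adequacy for the whole matrix."""
    S = np.linalg.inv(R)
    # Partial correlations of each pair, controlling for all other variables
    # (squared below, so the sign convention does not matter)
    P = -S / np.sqrt(np.outer(np.diag(S), np.diag(S)))
    off = ~np.eye(R.shape[0], dtype=bool)
    return (R[off] ** 2).sum() / ((R[off] ** 2).sum() + (P[off] ** 2).sum())

# Hypothetical usage with a small correlation matrix and n = 200 cases
R = np.array([[1.0, 0.5, 0.4],
              [0.5, 1.0, 0.3],
              [0.4, 0.3, 1.0]])
print(bartlett_sphericity(R, n=200))   # (chi-squared, p-value)
print(kmo(R))
```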

25
Q

What is EXTRACTION?

A

Factor extraction is an iterative process that attempts to maximise the variance explained.

26
Q

What are extracted communalities?

A

Extracted communalities represent the amount of variance in each variable explained by the factors (i.e. variables as DVs and factors as IVs)

27
Q

What are sums of squared loadings (SSLs)?

A

Each item has a loading on each factor. You want an item's loadings to be high on one factor and low on the others. Square all of an item's loadings and sum them: this gives the communalities after extraction.

28
Q

What are the initial eigenvalues? And what are cumulative eigenvalues?

A

Eigenvalues are the amount of variance explained by each FACTOR.
Cumulative eigenvalues then show how the factors build up the overall variance explained. You want to pick out the largest amount of variance with the smallest number of factors: how many are worth retaining? (See the sketch below.)
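A small numpy sketch (hypothetical correlation matrix) showing the eigenvalues and the cumulative percentage of variance they explain:

```python
import numpy as np

# Hypothetical 4-item correlation matrix (assumed for illustration)
R = np.array([[1.00, 0.60, 0.55, 0.10],
              [0.60, 1.00, 0.50, 0.15],
              [0.55, 0.50, 1.00, 0.05],
              [0.10, 0.15, 0.05, 1.00]])

eigvals = np.sort(np.linalg.eigvalsh(R))[::-1]   # largest first
prop = eigvals / eigvals.sum()                   # proportion of total variance
print("Eigenvalues:          ", np.round(eigvals, 3))
print("Cumulative % variance:", np.round(100 * np.cumsum(prop), 1))
```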

29
Q

What is the most common extraction method?

A

The most common is principal axis factoring.

30
Q

How do we decide how many factors to extract?

A

Use the scree plot and look at the POINT OF INFLECTION, the ELBOW of the curve. This is SUBJECTIVE: in the exam, DISCUSS IT IN RELATION TO THE THEORETICAL THRUST of the study.

More complex: parallel analysis (see the sketch after this list).
• Creates a random dataset with same number of cases and variables.
• Run FA/PCA on random data and generate averaged eigenvalues.
• Compare real eigenvalues and generated eigenvalues, and retain eigenvalues from the real dataset that are higher than those from the random dataset.
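A minimal numpy sketch of these steps (the data, the number of simulations, and the use of the mean rather than, say, the 95th percentile of the random eigenvalues are all assumptions):

```python
import numpy as np

def parallel_analysis(X: np.ndarray, n_sims: int = 100, seed: int = 0) -> int:
    """Horn's parallel analysis on correlation-matrix eigenvalues.

    Retains factors whose real eigenvalues exceed the averaged
    eigenvalues from random datasets of the same shape. (A simple
    count; stricter versions stop at the first non-exceeding value.)
    """
    rng = np.random.default_rng(seed)
    n_cases, n_vars = X.shape
    real = np.sort(np.linalg.eigvalsh(np.corrcoef(X, rowvar=False)))[::-1]

    random_eigs = np.empty((n_sims, n_vars))
    for i in range(n_sims):
        Z = rng.normal(size=(n_cases, n_vars))
        random_eigs[i] = np.sort(np.linalg.eigvalsh(np.corrcoef(Z, rowvar=False)))[::-1]

    threshold = random_eigs.mean(axis=0)
    return int((real > threshold).sum())

# Hypothetical usage: pure-noise data should suggest ~0 factors to retain
X = np.random.default_rng(1).normal(size=(300, 10))
print("Factors to retain:", parallel_analysis(X))
```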

31
Q

What is rotation?

A

Rotation rotates the axes of the factors to align them better with the variables. If there are items that load on too many factors (not a clean solution), rotation can enhance interpretability by moving the variance around between the factors, with the target of simplifying interpretation: ideally one factor per item.

32
Q

What types of rotation are there?

A

Orthogonal

Oblique

33
Q

What is orthogonal rotation?

A

Orthogonal rotation keeps the factors at 90 degrees (uncorrelated / independent factors). VARIMAX (which maximises the variance across the factor loadings) is the usual method. Arguably, though, all psychological factors overlap.

34
Q

What is oblique rotation?

A

Oblique – Factors are allowed to correlate. More realistic.

35
Q

What is the most common oblique rotation?

A

Oblimin

36
Q

What do we interpret orthogonal rotations using?

A

When an orthogonal rotation is used, a factor loading matrix is produced – use this for interpretation.

37
Q

What do we interpret oblique rotations using?

A

The pattern matrix: factor loadings, but after partialling out the overlap with the other factors. Best for interpretation.

Use the pattern of factor loadings to help label and define the factors.

38
Q

Details of a scree plot?

A

A scree plot is a method for graphically determining the number of factors to be retained in the analysis. It is produced by plotting the eigenvalues (which reflect the amount of total variance/covariance explained by each factor) in size order. The number of factors or components to be retained is indicated by the elbow in the plot (where the size of the eigenvalues changes relatively little from one factor to the next). It is accurate to within a factor or so, and there is some debate about whether to include the factor at the elbow or not.

39
Q

What is varimax rotation?

A

Varimax rotation is a form of orthogonal rotation of the solution (i.e. all factors remain uncorrelated) which is designed to maximise the variance of the factor loadings over the variables (hence the name). This simplifies the factors by pushing variables towards either high or low loadings and avoiding mid-loading variables, which makes factor interpretation/labelling easier (hence the method's popularity).
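A short sketch using scikit-learn, assuming a version recent enough to support rotation (the rotation argument of FactorAnalysis was added in scikit-learn 0.24; the data are hypothetical):

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

# Hypothetical data: 300 cases, 6 variables
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 6))

# rotation='varimax' requires scikit-learn >= 0.24
fa = FactorAnalysis(n_components=2, rotation='varimax').fit(X)
print(fa.components_.T)   # rotated loadings, one row per variable
```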

40
Q

What is KMO?

A

The KMO measure of sampling adequacy (MSA), together with Bartlett's test of sphericity, is a means of establishing the factorisability (or factorability) of a correlation matrix: in other words, of checking whether there are meaningful relationships between subsets of the variables that can cluster into factors/components. The KMO has to be > 0.6 to indicate factorisability.

41
Q

What is Bartlett’s test of sphericity?

A

Bartlett's test of sphericity, together with the KMO measure, is a means of establishing the factorisability of a correlation matrix. Bartlett's test being significant means that the hypothesis that there are no factors can be rejected, but this test is overly sensitive (particularly to sample size).

42
Q

What is the anti-image correlation matrix?

A

The anti-image correlation matrix (AICM) is another means of determining factorisability. To get the off-diagonal elements of the AICM, one calculates the partial correlation between variables X and Y, partialling out all the other variables (and then multiplies this correlation by -1). Even if X and Y are related, if other variables covary with X and Y (i.e. can form a factor with X and Y), the partial correlation between X and Y will be small. So the off-diagonal elements of the AICM should be near zero. The KMO sampling adequacy measures for each variable are placed on the diagonal of the AICM, and these values should be as close to 1 as possible.
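A minimal numpy sketch of the off-diagonal computation, via the standard inverse-matrix identity for partial correlations (the handling of the diagonal is a simplification; SPSS places the per-variable MSAs there):

```python
import numpy as np

def anti_image_correlations(R: np.ndarray) -> np.ndarray:
    """Anti-image correlation matrix from a correlation matrix R.

    Off-diagonal elements are the negatives of the partial correlations
    between each pair of variables, controlling for all the others.
    """
    S = np.linalg.inv(R)
    d = np.sqrt(np.diag(S))
    partial = -S / np.outer(d, d)   # partial correlation of each pair
    aicm = -partial                 # negate: anti-image correlations
    np.fill_diagonal(aicm, 1.0)     # simplification: SPSS puts MSAs here
    return aicm
```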

43
Q

What should be mentioned when talking about the factors retained?

A

You should comment on the decision to retain x factors. Given the expected x factors, hypothesis testing could be justified; Cattell's scree test, Kaiser's eigenvalue criterion, interpretability (the number of factors that produces a 'meaningful' solution), or other methods could also have been used. Hypothesis testing plus at least one other method should be mentioned. The consistency across these methods, with all seeming to suggest three factors, is also worth noting (although the scree plot is not shown, the eigenvalues are listed).

44
Q

What are communalities, and why are they important?

A

The communality for a variable is the variance in it accounted for by the factors.

Extracted communalities (or h2) represent the proportion of each variable's variance that can be explained by the retained factors.

Read the output and say: in this instance the communality values are xx, suggesting a xxx solution with xx factors retained.

45
Q

How do you calculate communalities?

A

Communality is the sum of squared loadings (SSL) for a variable across factors

46
Q

How do you calculate the proportion of variance in the set of variables accounted for by a factor?

A

The proportion of variance in the set of variables accounted for by a factor is the SSL for the factor divided by the number of variables (if rotation is orthogonal)

47
Q

How do you calculate the proportion of variance in the solution accounted for by a factor?

A

The proportion of variance in the solution accounted for by a factor (the proportion of covariance) is the SSL for the factor divided by the sum of the communalities (or, equivalently, the sum of the SSLs).
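A small worked sketch tying together cards 45-47, using a hypothetical orthogonal loading matrix:

```python
import numpy as np

# Hypothetical orthogonal factor loading matrix: 4 variables x 2 factors
L = np.array([[0.8, 0.1],
              [0.7, 0.2],
              [0.1, 0.6],
              [0.2, 0.7]])

communalities = (L ** 2).sum(axis=1)   # SSL per variable (card 45)
ssl_per_factor = (L ** 2).sum(axis=0)  # SSL per factor

n_vars = L.shape[0]
prop_variance = ssl_per_factor / n_vars                  # card 46
prop_covariance = ssl_per_factor / communalities.sum()   # card 47

print("Communalities:           ", communalities)
print("Proportion of variance:  ", np.round(prop_variance, 3))
print("Proportion of covariance:", np.round(prop_covariance, 3))
```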

48
Q

What factor score computation methods are there?

A

SPSS can use a variety of methods to calculate factor scores, e.g. regression, Bartlett, Anderson-Rubin, or aggregating standardised scores.

49
Q

what are factor loadings?

A

Factor loadings are the correlation of each variable with each factor. Factors are defined by high loadings.

50
Q

FACTOR ANALYSIS SCREENING PROCEDURE?

A

1) Normally distributed data (check frequencies / graphs)
2) No univariate, bivariate or multivariate outliers (frequencies / scatterplots)
3) Normality not strictly required (but helps clarity)
4) No illegal values (out of the range of the given data, e.g. <1 on a 1-7 Likert scale)
5) No restriction of range in the data (from biased questions like 'do you like your psych course?')
6) Collinearity & singularity (see the correlation matrix / high SMC values)
7) Factorisability of the correlation matrix: the rule of thumb is that bivariate correlations need to be above 0.3; also look for low pairwise partial correlations after partialling out all the other variables (can use KMO)
8) Content: FA can only produce what's put in, so question whether a wide enough set of items was used, plus the quality of the ratings and checks (possibility of halo effects, perhaps assessed with the first unrotated factor)
9) Sample size: note there is no single opinion on the matter, but mention it (e.g. Comrey & Lee = a minimum of 300 cases for a good factor analysis, or a ratio of cases to variables: Nunnally 10:1, Guilford 2:1; Barrett & Kline find 2:1 replicates the structure, while 3:1 is better). Current data look……
10) Ratio of variables to factors (e.g. Tabachnick & Fidell 5 or 6:1; Kline 3:1; Thurstone 3:1; Kim & Mueller 2:1). The data are ….. as there are not lots of factors, or items which don't correlate with other items.
11) Missing-data handling: listwise, pairwise (to be avoided) or imputation. Always listwise if numbers allow. A good answer may discuss different forms of imputation (regression, mean).