PCA Flashcards
(32 cards)
PCA
Principal Component Analysis
What does PCA allow you to do
Check whether all of the questions in a questionnaire tap into the same underlying construct
PCA - correlate
Looks at which items correlate with each other (in the R-matrix) and groups them into a component/factor
e.g. component 1 - BEING AWFUL
items = saying “I’m not racist but” and disliking dogs
e.g. component 2 - BEING SOUND
items = smiling, etc.
Kaiser’s rule
Components with Eigenvalues greater than 1 are treated as valid and kept
Eigenvalues
You use the Eigenvalues to judge whether the components are worth keeping
Proportion variance
How important each component is - the percentage of variance that the component explains
Cumulative variance
The running total of the percentages of variance explained, added up across the components
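Worked example (a minimal sketch, not from the original cards): the Eigenvalues, proportion of variance, cumulative variance and Kaiser's rule can be computed straight from the R-matrix. The data, the variable names and the use of NumPy are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical data: 200 respondents answering 6 items driven by 2 made-up constructs.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
weights = np.array([[0.8, 0.0], [0.7, 0.1], [0.9, 0.0],
                    [0.0, 0.8], [0.1, 0.7], [0.0, 0.9]])
items = latent @ weights.T + 0.5 * rng.normal(size=(200, 6))

# R-matrix: Pearson correlations between every pair of items.
r_matrix = np.corrcoef(items, rowvar=False)

# Eigenvalues of the R-matrix, sorted from largest to smallest.
eigenvalues = np.linalg.eigvalsh(r_matrix)[::-1]

# Proportion of variance explained by each component, and the running total.
proportion = eigenvalues / eigenvalues.sum()
cumulative = np.cumsum(proportion)

# Kaiser's rule: keep components whose Eigenvalue is greater than 1.
keep_kaiser = eigenvalues > 1.0

print(np.round(eigenvalues, 2))
print(np.round(proportion, 2))
print(np.round(cumulative, 2))
print(int(keep_kaiser.sum()), "component(s) kept under Kaiser's rule")
```

The Eigenvalues of a correlation matrix always sum to the number of items, which is why dividing by that sum gives a proportion of the total variance.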
Component loadings
The PCA gives us a number of components, but this doesn’t tell us which of our individual measures make up each component
Component loadings tell us this - they tell us what the association is between each item and each component
Essentially each loading is a Pearson’s correlation between the item and the factor/component
They give a component matrix
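Worked example (a sketch under assumed data, not the cards' own method): loadings can be obtained by scaling each eigenvector of the R-matrix by the square root of its Eigenvalue. The check at the end shows that a loading matches the Pearson correlation between the item and the component scores. All names and numbers are hypothetical.

```python
import numpy as np

# Hypothetical data: 300 respondents, 4 items; the last item is reverse-keyed.
rng = np.random.default_rng(1)
latent = rng.normal(size=(300, 1))
items = latent @ np.array([[0.9, 0.8, 0.7, -0.6]]) + 0.5 * rng.normal(size=(300, 4))

# Standardise the items, then build the R-matrix.
z = (items - items.mean(axis=0)) / items.std(axis=0)
r_matrix = np.corrcoef(z, rowvar=False)

# Eigen-decomposition, reordered so the largest Eigenvalue comes first.
eigenvalues, eigenvectors = np.linalg.eigh(r_matrix)
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# Component matrix: rows are items, columns are components.
loadings = eigenvectors * np.sqrt(eigenvalues)

# Component scores for each respondent.
scores = z @ eigenvectors

# Pearson correlation between item 0 and component 0, to compare with the loading.
r_check = np.corrcoef(z[:, 0], scores[:, 0])[0, 1]

# Flag loadings at or beyond the .4 cut-off (in either direction).
strong = np.abs(loadings) >= 0.4

print(np.round(loadings, 2))
print(round(r_check, 3), round(loadings[0, 0], 3))
```

Squaring the loadings and summing down each column gives back the Eigenvalues, which is what "SS loadings" refers to in some software output.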
What component loading is taken as a strong enough loading
.4
PCA - simplifying data
A PCA allows us to take many items and reduce the dimensionality of a construct
It takes all these different measures and reduces them down to the core factors/components
If 3 items are all highly associated with each other, we can combine them into a single measure
What does reducing the dimensionality of a construct reduce the likelihood of
Reduces the likelihood of false positives - because we test the one construct not all three separate measures
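Worked example (a sketch only): the false positive point can be put in numbers. The alpha level of .05 and the assumption that the three tests are independent are not from the cards; they are added here purely for illustration.

```python
# Assumed significance level per test (not stated in the cards).
alpha = 0.05

# Chance of at least one false positive when testing 3 separate measures,
# assuming the tests are independent.
three_tests = 1 - (1 - alpha) ** 3

# Chance of a false positive when testing the single combined construct.
one_test = 1 - (1 - alpha) ** 1

print(round(three_tests, 4))  # 0.1426
print(round(one_test, 4))     # 0.05
```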
Why do we need to collect a lot of participants with PCA
PCA is based on an R-matrix of Pearson’s correlations, and correlations are only stable estimates in large samples
Questionnaire items are Likert scales - non-parametric
SS Loadings
Another name for the Eigenvalues (the sum of squared loadings on each component)
Scree plot
Scree plots simply graph Eigenvalues and are another way to judge the number of factors/components
When the graph flattens there are no more relevant factors
Jolliffe’s rule
Components with Eigenvalues over .7 are treated as valid in a PCA
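Worked example (a sketch with made-up numbers): Kaiser's rule and Jolliffe's rule can disagree about how many components to keep. The Eigenvalues below are hypothetical and only illustrate the two cut-offs.

```python
import numpy as np

# Hypothetical Eigenvalues for a 6-item questionnaire (they sum to 6).
eigenvalues = np.array([2.9, 1.4, 0.8, 0.5, 0.3, 0.1])

# Kaiser's rule: keep components with Eigenvalues greater than 1.
n_kaiser = int((eigenvalues > 1.0).sum())

# Jolliffe's rule: keep components with Eigenvalues greater than .7.
n_jolliffe = int((eigenvalues > 0.7).sum())

print(n_kaiser)    # 2
print(n_jolliffe)  # 3
```

Jolliffe's cut-off is the more lenient of the two, so it never keeps fewer components than Kaiser's rule.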
What figure in the component matrix is a significant loading
Correlation coefficient of .4 and above
Can component loadings be negative
Yes - just means the items are negatively associated with the component
Can a measure load onto more than one component
Yes
Rotation
Simplifies the data and allows us to make inferences from it more easily
Changes what the component loadings are to make them clearer for us
Rotation shifts the factors in two-dimensional space to make factor loadings clearer and easier to interpret
Orthogonal rotation methods
Assume that the factors in the analysis are uncorrelated; the most commonly used is the varimax rotation
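Worked example (a minimal sketch, assuming a commonly used SVD-based iteration for varimax; statistical packages may differ in detail such as normalisation): an orthogonal rotation changes the individual loadings but leaves each item's total squared loading unchanged. The function name, its parameters and the loading matrix are hypothetical. Oblique rotations such as oblimin are not covered here.

```python
import numpy as np

def varimax(loadings, gamma=1.0, max_iter=100, tol=1e-6):
    """Rotate a loading matrix (items x components) with an orthogonal varimax rotation."""
    n_items, n_components = loadings.shape
    rotation = np.eye(n_components)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient-like target matrix for the varimax criterion.
        target = rotated ** 3 - (gamma / n_items) * rotated @ np.diag((rotated ** 2).sum(axis=0))
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt  # product of orthogonal matrices, so still orthogonal
        new_criterion = s.sum()
        # Stop once the criterion is no longer improving.
        if criterion != 0 and new_criterion / criterion < 1 + tol:
            break
        criterion = new_criterion
    return loadings @ rotation, rotation

# Hypothetical unrotated component matrix: 4 items, 2 components.
unrotated = np.array([[0.7, 0.5],
                      [0.6, 0.6],
                      [0.6, -0.5],
                      [0.7, -0.6]])

rotated, rotation = varimax(unrotated)

print(np.round(rotated, 2))
```

Because the rotation matrix is orthogonal, the rotated components stay uncorrelated, which is the assumption orthogonal methods make.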
Oblique rotation methods
Assume that the factors are correlated; the most commonly used is the oblimin rotation
e.g. a questionnaire that measures anxiety and depression
Where do the factor names come from
You just name them based on what you think the questions seem to measure
PCA assumptions
Data should NOT be nominal
Need to consider sampling adequacy
Sufficient correlations between individual variables are needed to run a PCA
Sampling adequacy
This assesses whether or not PCA is appropriate for your data. It is measured by the Kaiser-Meyer-Olkin (KMO) statistic
It assesses how much variance among all your variables might be common variance (i.e. explained by an underlying component or factor)
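Worked example (a sketch of the standard overall KMO formula, using assumed data): KMO compares the squared correlations between items with the squared partial correlations between them. Values closer to 1 suggest more of the variance is common variance. The function name `kmo` and the data are hypothetical.

```python
import numpy as np

def kmo(r_matrix):
    """Overall Kaiser-Meyer-Olkin measure for a correlation matrix."""
    inverse = np.linalg.inv(r_matrix)
    scale = np.sqrt(np.diag(inverse))
    # Partial correlations between each pair of items, controlling for the rest.
    partial = -inverse / np.outer(scale, scale)
    # Only the off-diagonal entries enter the formula.
    off_diagonal = ~np.eye(r_matrix.shape[0], dtype=bool)
    sum_r2 = (r_matrix[off_diagonal] ** 2).sum()
    sum_p2 = (partial[off_diagonal] ** 2).sum()
    return sum_r2 / (sum_r2 + sum_p2)

# Hypothetical data: 300 respondents, 5 items sharing one made-up construct.
rng = np.random.default_rng(2)
latent = rng.normal(size=(300, 1))
items = latent @ np.full((1, 5), 0.8) + 0.5 * rng.normal(size=(300, 5))

r_matrix = np.corrcoef(items, rowvar=False)
kmo_value = kmo(r_matrix)

print(round(kmo_value, 3))
```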