WK 9 Flashcards
What are data/dimension reduction and what do they do?
They are mathematical and statistical procedures that reduce a large set of variables to a smaller set
What is the goal in principal components analysis?
Goal is to explain as much of the total variance in a data set as possible
What are the steps in principal components analysis?
-starts with original data
-calculates covariances (correlations) between variables
-applies a procedure called eigendecomposition to calculate a set of linear composites of the original variables
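The steps above can be sketched for the simplest possible case: two standardized variables with correlation r, whose 2x2 correlation matrix has a closed-form eigendecomposition (eigenvalues 1 + r and 1 - r). This toy Python example is an illustration, not part of the course materials:

```python
import math

# Correlation matrix of two standardized variables with correlation r.
r = 0.6
R = [[1.0, r],
     [r, 1.0]]

# Closed-form eigendecomposition of [[1, r], [r, 1]]:
# eigenvalues 1 + r and 1 - r, eigenvectors (1, 1)/sqrt(2) and (1, -1)/sqrt(2).
eigenvalues = [1 + r, 1 - r]              # largest first (for r > 0)
s = 1 / math.sqrt(2)
eigenvectors = [[s, s], [s, -s]]          # one eigenvector per component

# Sanity check: R v = lambda v for the first component.
v = eigenvectors[0]
Rv = [R[0][0] * v[0] + R[0][1] * v[1],
      R[1][0] * v[0] + R[1][1] * v[1]]
assert all(abs(Rv[i] - eigenvalues[0] * v[i]) < 1e-12 for i in range(2))

# The eigenvalues repackage the total variance: they sum to the number
# of variables (here 2), and the first component holds the largest share.
print(eigenvalues)        # [1.6, 0.4]
print(sum(eigenvalues))   # 2.0
```

The larger the correlation, the more variance the first component soaks up; with r = 0, both eigenvalues would be 1 and no reduction would be possible.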
What does principal components analysis do?
It repackages the variance from the correlation matrix into a set of components, through the process of eigendecomposition
What is the first component?
It is the linear combination that accounts for the most possible variance
What are the second and subsequent components?
Second component accounts for second largest amount of variance after the variance accounted for by the first is removed
- third accounts for third largest etc
What does each component account for?
Each component accounts for as much remaining variance as possible
If variables are closely related, what number of correlations do they have, and how do we represent them?
If variables are closely related, they have large correlations, and we can represent them with fewer composites
If variables are not very closely related, what number of correlations do they have, and how do we represent them?
If variables are not very closely related, they have small correlations, and we will need more composites to adequately represent them.
If variables are entirely uncorrelated, how many components do we need?
We will need as many components as there were variables in the original correlation matrix
What is eigendecomposition?
It is a transformation of the correlation matrix to re-express it in terms of eigenvalues and eigenvectors
How many eigenvectors and eigenvalues do you have for each component?
There is one eigenvector and one eigenvalue for each component
What are eigenvalues?
Eigenvalues are a measure of the size of the variance packaged into a component
What do larger eigenvalues mean?
They mean that the component accounts for a large proportion of the variance
What do eigenvectors provide information on?
They provide information on the relationship of each variable to each component
What are eigenvectors?
They are sets of weights (one weight per variable in original correlation matrix)
e.g., if we had 5 variables each eigenvector would contain 5 weights
What will the sum of the eigenvalues equal?
The sum of the eigenvalues will equal the number of variables in the data set
What is the covariance of an item with itself?
For standardized variables (as in a correlation matrix), the covariance of an item with itself, i.e. its variance, is 1
When you add up these item variances, what do you get?
Adding them up gives the total variance, which equals the number of variables
What does a full eigendecomposition account for?
It will account for all of this variance, distributed across the eigenvalues, so the sum of the eigenvalues must equal the number of variables
We use eigenvectors to think about the nature of components. To do so, what do we do?
We convert eigenvectors to PCA loadings
What does a PCA loading give?
A PCA loading gives the strength of the relationship between the item and the component
What is the range of PCA loadings?
Range from -1 to 1
In a PCA loading, what does a higher absolute value indicate?
The higher the absolute value, the stronger the relationship
What will the sum of squared loadings for any variable on all components equal?
The sum of squared loadings for any variable on all components will equal 1
- that is all the variance in the item is explained by the full decomposition
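Continuing the two-variable illustration (hypothetical numbers, not from the deck): loadings are obtained by scaling each eigenvector by the square root of its eigenvalue. Squared loadings then sum to 1 across components for each variable, and to the eigenvalue across variables for each component:

```python
import math

# Two standardized variables with correlation r = 0.6 (toy example):
# eigenvalues 1 + r and 1 - r, eigenvectors (1, 1)/sqrt(2), (1, -1)/sqrt(2).
r = 0.6
eigenvalues = [1 + r, 1 - r]
s = 1 / math.sqrt(2)
eigenvectors = [[s, s], [s, -s]]          # one row per component

# PCA loadings: eigenvector weights scaled by sqrt(eigenvalue);
# loadings[c][v] is the correlation of variable v with component c.
loadings = [[w * math.sqrt(lam) for w in vec]
            for lam, vec in zip(eigenvalues, eigenvectors)]

# Across all components, the squared loadings of each variable sum to 1:
# the full decomposition explains all of that item's variance.
for v in range(2):
    print(round(sum(loadings[c][v] ** 2 for c in range(2)), 10))  # 1.0

# Across variables, the squared loadings of each component sum to its
# eigenvalue (the "SS loadings" reported in PCA output).
for c in range(2):
    print(round(sum(w ** 2 for w in loadings[c]), 10))  # 1.6, then 0.4
```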
Where does dimension reduction come from?
Comes from keeping only the largest components
What can our decisions on how many components to keep be guided by?
- set an amount of variance you wish to account for
- scree plot
- minimum average partial test (MAP)
- parallel analysis
What is the simplest method we can use to select a number of components?
Simply state a minimum variance we wish to account for
(We then keep the smallest number of components that together account for at least this amount of variance)
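This rule is easy to sketch in code (the eigenvalues below are hypothetical, for illustration only): keep components until their cumulative share of the total variance reaches the chosen threshold.

```python
def n_components_for_variance(eigenvalues, threshold=0.80):
    """Smallest number of components whose cumulative proportion
    of variance meets or exceeds `threshold`."""
    total = sum(eigenvalues)            # equals the number of variables
    cumulative = 0.0
    for k, lam in enumerate(eigenvalues, start=1):
        cumulative += lam / total
        if cumulative >= threshold:
            return k
    return len(eigenvalues)

# Hypothetical eigenvalues from a 5-variable PCA (they sum to 5):
# cumulative proportions are 0.50, 0.74, 0.88, 0.96, 1.00.
print(n_components_for_variance([2.5, 1.2, 0.7, 0.4, 0.2], 0.80))  # 3
```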
What is a scree plot based on?
Based on plotting the eigenvalues
What are you looking for in a scree plot?
Looking for a sudden change of slope
What is the scree plot assumed to show?
It is assumed to potentially reflect the point at which components become substantively unimportant
(The points should drop, as variance decreases across the components)
In a scree plot, what is inferred as the slope flattens?
As the slope flattens, each subsequent component is not explaining much additional variance
On a scree plot, what is on the x-axis?
The component number
On a scree plot, what is on the y-axis?
The eigenvalue for each component
How do we decide what components to keep using a scree plot?
Keep the components with eigenvalues above a kink in the plot
What does the minimum average partial (MAP) test do?
MAP extracts components iteratively from the correlation matrix
What quantity does the MAP test track?
After each component is extracted, MAP computes the average squared partial correlation among the variables
(the correlations remaining once the extracted components are partialled out)
What is the trend we see with MAP values?
At first this quantity goes down with each component extracted but then it starts to increase again
What components does MAP keep?
MAP keeps the components extracted up to the point at which the average squared partial correlation is at its smallest (the point just before it starts to increase)
How do we obtain results of the MAP test?
Using the vss() function from the psych package
What is parallel analysis?
Parallel analysis simulates datasets with same number of participants and variables but no correlations
What does parallel analysis compute?
It computes an eigendecomposition for each of the simulated datasets
What does parallel analysis compare?
It compares the average eigenvalue across the simulated datasets for each component
What happens if a real eigenvalue exceeds the corresponding average eigenvalue from the simulated datasets?
It is retained
How do we conduct parallel analysis in R?
Using the fa.parallel() function in the psych package
What is a limitation of scree plots?
Scree plots are subjective and may have multiple or no obvious kinks
What is a limitation of parallel analysis?
Parallel analysis sometimes suggests too many components (over-extraction)
What is a limitation of MAP?
MAP sometimes suggests too few components (under-extraction)
What should you do if your MAP and parallel analysis disagree?
If the two tests disagree, treat the parallel analysis result as a maximum and the MAP result as a minimum; the optimal number of components probably lies within that range
How are component loadings calculated and how can they be interpreted?
Component loadings are calculated from the values in the eigenvectors and they can be interpreted as the correlations between variables and components
In a component loading matrix, when looking at the output what are SS loadings?
They are the eigenvalues
What does a good PCA solution explain?
It explains the variance of the original correlation matrix in as few components as possible