PCA Flashcards
(24 cards)
What is the primary purpose of PCA?
To reduce dimensionality while preserving as much variance as possible.
What does PCA transform data into?
A new coordinate system aligned with directions of maximum variance.
What is the name of the directions found by PCA?
Principal Components.
Why do we use dimensionality reduction?
To compress data, remove redundancy, enable visualization, and reduce noise.
What shape does the covariance matrix describe?
The geometric shape of the data cloud in feature space.
What does the diagonal of a covariance matrix represent?
The variance of each individual feature.
What do off-diagonal entries in a covariance matrix represent?
The covariance between pairs of features.
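Example (a minimal NumPy sketch; the data and the 0.8 mixing factor are made up for illustration):

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 3))    # 200 samples, 3 features
    X[:, 1] += 0.8 * X[:, 0]         # make features 0 and 1 correlated

    Sigma = np.cov(X, rowvar=False)  # 3 x 3 covariance matrix
    print(np.diag(Sigma))            # diagonal: per-feature variances
    print(Sigma[0, 1])               # off-diagonal: covariance of features 0 and 1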
What does it mean for data to be ‘white noise’?
Its components are uncorrelated, with zero mean and unit variance.
What is the goal of whitening?
To transform data so its covariance matrix becomes the identity matrix.
What is the formula for the multivariate Gaussian distribution?
P(x) = (1 / sqrt((2π)^D |Σ|)) * exp(-0.5 * (x − μ)ᵀ Σ⁻¹ (x − μ))
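A numerical sanity check of the formula (a sketch assuming NumPy and SciPy are available; μ, Σ, and x are arbitrary illustrative values):

    import numpy as np
    from scipy.stats import multivariate_normal

    mu = np.array([0.0, 1.0])
    Sigma = np.array([[2.0, 0.3], [0.3, 1.0]])
    x = np.array([0.5, 0.5])

    D = len(mu)
    diff = x - mu
    norm = 1.0 / np.sqrt((2 * np.pi) ** D * np.linalg.det(Sigma))
    p = norm * np.exp(-0.5 * diff @ np.linalg.inv(Sigma) @ diff)

    print(p, multivariate_normal.pdf(x, mean=mu, cov=Sigma))  # the two should agree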
What does an eigenvector of Σ represent in PCA?
A principal direction of variance in the data.
What does the corresponding eigenvalue represent?
The amount of variance captured in that principal direction.
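Example (an illustrative NumPy sketch; note np.linalg.eigh returns eigenvalues in ascending order, so the last column is the top direction):

    import numpy as np

    rng = np.random.default_rng(1)
    X = rng.normal(size=(500, 2)) @ np.array([[3.0, 0.0], [1.0, 0.5]])  # correlated data
    Sigma = np.cov(X, rowvar=False)

    eigvals, eigvecs = np.linalg.eigh(Sigma)  # eigh because Sigma is symmetric
    print(eigvals)         # variance captured along each principal direction
    print(eigvecs[:, -1])  # principal direction with the largest variance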
What is the first step of PCA?
Center the data by subtracting the mean.
How is the covariance matrix computed after centering?
Σ = (1 / (n - 1)) * BᵀB
What is B in the PCA algorithm?
The mean-centered data matrix (X - mean).
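Putting these steps together (a minimal NumPy sketch on made-up data):

    import numpy as np

    rng = np.random.default_rng(2)
    X = rng.normal(size=(100, 4))                       # toy data: n = 100, d = 4
    B = X - X.mean(axis=0)                              # step 1: mean-center
    Sigma = (B.T @ B) / (X.shape[0] - 1)                # Σ = (1 / (n - 1)) BᵀB
    assert np.allclose(Sigma, np.cov(X, rowvar=False))  # matches NumPy's estimator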
What does projecting data onto eigenvectors achieve?
Transforms data into a decorrelated space with ranked variance.
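Example (an illustrative NumPy sketch; the mixing matrix is made up to induce correlation):

    import numpy as np

    rng = np.random.default_rng(3)
    X = rng.normal(size=(300, 2)) @ np.array([[2.0, 0.0], [1.2, 0.4]])
    B = X - X.mean(axis=0)
    eigvals, V = np.linalg.eigh(np.cov(B, rowvar=False))

    Z = B @ V                                    # rotate into the eigenbasis
    print(np.round(np.cov(Z, rowvar=False), 3))  # ≈ diag(eigvals): decorrelated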
What does whitening do in PCA?
Removes correlations and scales components to unit variance.
What matrix operation is used to whiten data?
Multiply the rotated data (projected onto the eigenvectors) by D⁻¹ᐟ², where D is the diagonal matrix of eigenvalues.
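Example (a minimal NumPy sketch; assumes all eigenvalues are strictly positive):

    import numpy as np

    rng = np.random.default_rng(4)
    X = rng.normal(size=(300, 3)) @ rng.normal(size=(3, 3))
    B = X - X.mean(axis=0)
    eigvals, V = np.linalg.eigh(np.cov(B, rowvar=False))

    W = (B @ V) / np.sqrt(eigvals)               # rotate, then scale by D^(-1/2)
    print(np.round(np.cov(W, rowvar=False), 3))  # ≈ identity matrix: whitened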
What do principal component ‘loadings’ mean?
They are eigenvectors scaled by the square root of the corresponding eigenvalue (the standard deviation along that component).
What does PCA seek to maximize when choosing projection directions?
The variance of the projected data.
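A quick numerical check (an illustrative NumPy sketch): the variance of data projected onto the top eigenvector equals the largest eigenvalue.

    import numpy as np

    rng = np.random.default_rng(5)
    X = rng.normal(size=(400, 3)) @ rng.normal(size=(3, 3))
    B = X - X.mean(axis=0)
    eigvals, V = np.linalg.eigh(np.cov(B, rowvar=False))

    v1 = V[:, -1]                               # direction with the largest eigenvalue
    print(np.var(B @ v1, ddof=1), eigvals[-1])  # projected variance = top eigenvalue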
What is the rank of the PCA-transformed dataset if we keep k components?
k (assuming the original data matrix has rank at least k).
What shape is the projection matrix if we reduce to k dimensions?
V_k ∈ ℝ^{d × k}, where d is the original dimension.
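Example (a minimal NumPy sketch with made-up d = 5 and k = 2):

    import numpy as np

    rng = np.random.default_rng(6)
    X = rng.normal(size=(100, 5))
    B = X - X.mean(axis=0)
    eigvals, V = np.linalg.eigh(np.cov(B, rowvar=False))

    k = 2
    V_k = V[:, -k:]  # d x k projection matrix: top-k eigenvectors as columns
    Z = B @ V_k      # n x k reduced representation
    print(V_k.shape, Z.shape, np.linalg.matrix_rank(Z))  # (5, 2) (100, 2) 2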
What is an advantage of using PCA before a classifier?
It can reduce noise and remove multicollinearity.
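For instance, a sketch assuming scikit-learn is available (the digits dataset, 20 components, and logistic regression are arbitrary illustrative choices):

    from sklearn.datasets import load_digits
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline

    X, y = load_digits(return_X_y=True)
    pipe = make_pipeline(PCA(n_components=20), LogisticRegression(max_iter=1000))
    print(cross_val_score(pipe, X, y, cv=3).mean())  # accuracy with 64 -> 20 features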
What happens if you remove PCs with low variance in images?
You may remove noise while preserving structure.
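A toy illustration (a sketch, not a realistic image pipeline: each row of a synthetic rank-1 "image" is treated as a sample, and only the top 4 components are kept):

    import numpy as np

    rng = np.random.default_rng(7)
    clean = np.outer(np.sin(np.linspace(0, 3, 64)), np.cos(np.linspace(0, 3, 64)))
    noisy = clean + 0.1 * rng.normal(size=clean.shape)

    mu = noisy.mean(axis=0)
    B = noisy - mu
    eigvals, V = np.linalg.eigh(np.cov(B, rowvar=False))
    V_k = V[:, -4:]                    # keep only the top 4 principal components
    denoised = (B @ V_k) @ V_k.T + mu  # project, reconstruct, un-center
    # reconstruction error vs. the clean image should drop relative to the noisy one
    print(np.linalg.norm(noisy - clean), np.linalg.norm(denoised - clean))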