Principal Component Analysis Flashcards

1
Q

General goal

A

To remove redundancy and reduce the number of features in the data, keeping only the directions (linear combinations of the original features) that account for the most variance.

2
Q

Methodology

A

The objective is to find a feature vector z, of much lower dimension than x, with maximum variance.
- This is achieved through the singular value decomposition of the covariance matrix of x (equivalently, of the mean-centered data matrix X).

3
Q

Singular value decomposition: description

A
  • It’s a linear algebra tool
  • It can be used to reorder the features of the data according to the variance of their fluctuations
Given the data matrix X ∈ R^(N×d):
X = U Γ V'
U ∈ R^(N×d) has orthonormal columns
Γ ∈ R^(d×d) is a non-negative diagonal matrix
V ∈ R^(d×d) is orthogonal

In general, Γ contains the singular values of X, in decreasing order:
Γii = γi
γ1 ≥ γ2 ≥ … ≥ γd ≥ 0

U contains the left singular vectors of X
V contains the right singular vectors of X
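
As a quick check of these properties, here is a minimal NumPy sketch. The toy matrix, its size (N = 6, d = 3) and the random seed are assumptions made for illustration; only `np.linalg.svd` is taken from the library itself.

```python
import numpy as np

rng = np.random.default_rng(0)      # fixed seed, illustrative only
X = rng.normal(size=(6, 3))         # toy data matrix with N = 6, d = 3

# Thin SVD: U is N x d, gamma holds the d singular values, Vt is V' (d x d).
U, gamma, Vt = np.linalg.svd(X, full_matrices=False)
V = Vt.T

# Rebuild X from its factors; this should match X up to rounding error.
X_rebuilt = U @ np.diag(gamma) @ Vt
```

NumPy returns the singular values as a 1-D array already sorted in decreasing order, and it returns V' rather than V, hence the transpose.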

4
Q

PCA algorithm

A

Input: data matrix X (with mean-centered columns) and an integer k with 1 ≤ k ≤ d

1) compute the SVD of X
[U, Γ, V] = svd(X)

2) Let Vk = [v1, v2, …, vk] be the first k columns of V

3) The PCA feature matrix and the reconstructed data are, respectively
Z = X Vk
X^ = X Vk Vk'
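
The three steps can be sketched in NumPy roughly as follows. The function name `pca_svd` and the toy data are hypothetical. The sketch also centers the columns of X before the SVD and adds the mean back when reconstructing, which is an assumption layered on top of the formulas on this card.

```python
import numpy as np

def pca_svd(X, k):
    """Return (Z, X_hat, Vk) using the first k right singular vectors of X."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    Xc = X - mean                    # centering step (assumption, see lead-in)
    # Step 1: SVD of the centered data; only V' is needed here.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    # Step 2: first k columns of V, shape (d, k).
    Vk = Vt[:k].T
    # Step 3: PCA features Z = X Vk and reconstruction X^ = X Vk Vk'.
    Z = Xc @ Vk
    X_hat = Z @ Vk.T + mean          # mean added back to return to original units
    return Z, X_hat, Vk

# Toy data: the third feature is almost a copy of the first (redundant).
rng = np.random.default_rng(0)
a = rng.normal(size=(100, 1))
b = rng.normal(size=(100, 1))
X = np.hstack([a, b, a + 1e-3 * rng.normal(size=(100, 1))])

Z, X_hat, Vk = pca_svd(X, k=2)
reconstruction_error = np.linalg.norm(X - X_hat)   # Frobenius norm
```

Because one of the three features is nearly redundant, two components should be enough to reconstruct this toy X with a very small error.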

5
Q

How to select k

A

Plot the singular values of X in decreasing order: beyond some index k the values are close to zero. That k is the one to select.
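
One possible way to automate this reading of the plot is to count the singular values that are not negligible compared with the largest one. The helper `choose_k` and its threshold `rel_tol` are hypothetical choices for this sketch, not part of the card, and the toy data is built so that only two directions carry real variance.

```python
import numpy as np

def choose_k(X, rel_tol=1e-2):
    """Count singular values of centered X larger than rel_tol * the largest."""
    Xc = X - X.mean(axis=0)
    gamma = np.linalg.svd(Xc, compute_uv=False)   # decreasing order
    k = int(np.sum(gamma > rel_tol * gamma[0]))
    return k, gamma

# Toy data: 2 latent directions embedded in d = 5 features, plus tiny noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = np.array([[1.0, 0.0, 1.0, 2.0, 0.0],
                   [0.0, 1.0, 1.0, 0.0, -1.0]])
X = latent @ mixing + 1e-6 * rng.normal(size=(200, 5))

k, gamma = choose_k(X)
```

Plotting `gamma` against its index gives the curve described above. The threshold is a judgment call and usually needs tuning when the drop toward zero is gradual.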
