How to perform LDA, PCA, SVD Flashcards

1
Q

WHAT IS LATENT DIRICHLET ALLOCATION? P362

A

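It’s a topic-modeling technique used for dimensionality reduction of text documents. It should not be confused with Linear Discriminant Analysis, which is what “LDA” means in the cards that follow.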

2
Q

WHAT DOES THE LDA MODEL DO TO SEPARATE SAMPLES BY THEIR CLASSES? P362

A

It tries to find a linear combination of the input variables that maximizes the separation between classes and minimizes the spread of samples within each class, i.e. low variance within each group and high variance between groups.
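
For reference, this objective is commonly written as Fisher's criterion, which is maximized over the projection direction. The symbols here (w for the projection vector, S_B and S_W for the between-class and within-class scatter matrices, mu_c and N_c for the mean and size of class c, mu for the overall mean) are notation assumed for this sketch and do not come from the card's source:

```latex
J(\mathbf{w}) = \frac{\mathbf{w}^\top S_B \, \mathbf{w}}{\mathbf{w}^\top S_W \, \mathbf{w}},
\qquad
S_B = \sum_{c} N_c \, (\mu_c - \mu)(\mu_c - \mu)^\top,
\qquad
S_W = \sum_{c} \sum_{i \in c} (x_i - \mu_c)(x_i - \mu_c)^\top
```

A large numerator means the class means are far apart after projection; a small denominator means each class stays tightly grouped.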

3
Q

WHY IS IT BETTER TO STANDARDIZE DATA BEFORE USING LDA FOR DIMENSIONALITY REDUCTION? P362

A

LDA for multiclass classification is typically implemented with tools from linear algebra (matrix decompositions), so it is good practice to standardize the data before fitting an LDA model.

4
Q

HOW CAN WE USE LDA FOR DIMENSIONALITY REDUCTION? P363

A

Use a pipeline: the first step is the LDA transform with its n_components set, and the next step is a model that is fit on the output of that transform. Code P364
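
A minimal sketch of such a pipeline, not the code from the referenced page. The synthetic dataset, the GaussianNB model, the added StandardScaler step, and the choice of n_components=5 are all assumptions made for illustration:

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (assumed for illustration): 10 classes, 20 features.
X, y = make_classification(n_samples=1000, n_features=20, n_informative=15,
                           n_redundant=5, n_classes=10, random_state=7)

# Standardize, project onto 5 discriminant directions, then fit a model.
pipeline = Pipeline(steps=[
    ("scale", StandardScaler()),
    ("lda", LinearDiscriminantAnalysis(n_components=5)),
    ("model", GaussianNB()),
])

# Evaluate the whole pipeline so the transform is re-fit inside each fold.
scores = cross_val_score(pipeline, X, y, scoring="accuracy", cv=5)
print("Mean accuracy: %.3f" % scores.mean())

# Inspect the reduced representation produced by the first two steps.
X_reduced = pipeline[:-1].fit_transform(X, y)
print(X_reduced.shape)
```

Keeping the transform inside the pipeline means it is fit only on the training portion of each cross-validation fold, which avoids leaking information from the held-out samples.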

5
Q

WHAT IS THE RANGE OF POSSIBLE VALUES FOR N_COMPONENTS PARAMETER OF LDA? P364

A

An integer from 1 up to min(n_classes - 1, n_features).
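
A small sketch of that limit in practice. The dataset is synthetic and assumed for illustration, and the ValueError on an oversized n_components is what recent scikit-learn releases do (older releases may behave differently):

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Synthetic data (assumed for illustration): 3 classes, 10 features.
X, y = make_classification(n_samples=300, n_features=10, n_informative=6,
                           n_redundant=0, n_classes=3, random_state=1)

# Upper bound on n_components: min(n_classes - 1, n_features).
max_components = min(len(set(y)) - 1, X.shape[1])

# Fitting at the bound works and yields that many output columns.
lda_ok = LinearDiscriminantAnalysis(n_components=max_components).fit(X, y)
print(lda_ok.transform(X).shape)

# One past the bound is rejected at fit time.
try:
    LinearDiscriminantAnalysis(n_components=max_components + 1).fit(X, y)
    raised = False
except ValueError as err:
    raised = True
    print("Rejected:", err)
```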

6
Q

EXTERNAL Q: DOES PCA NEED STANDARDIZATION?

A

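Usually yes. PCA comes from linear algebra and finds the directions of greatest variance, so features measured on larger scales dominate the components unless the data is standardized first.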
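
A short sketch of the scale effect, using two made-up independent features whose standard deviations differ by a factor of 1000 (the data and seed are assumptions for illustration):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Two independent features on very different scales (assumed example data).
X = np.column_stack([rng.normal(0, 1, 500), rng.normal(0, 1000, 500)])

# Without scaling, the large-scale feature absorbs almost all the variance.
raw_ratio = PCA(n_components=2).fit(X).explained_variance_ratio_

# After standardization, both features contribute on an equal footing.
X_std = StandardScaler().fit_transform(X)
std_ratio = PCA(n_components=2).fit(X_std).explained_variance_ratio_

print("raw:         ", raw_ratio)
print("standardized:", std_ratio)
```

On the raw data the first component is essentially just the second feature; after scaling, the explained variance is split roughly evenly because the features are independent.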

7
Q

Dimensionality reduction is often called ____ and the algorithms used are referred to as ____. P370

A

Feature projection, Projection methods

8
Q

WHICH TOOLS FROM LINEAR ALGEBRA ARE USED FOR PCA? P370

A

Matrix decompositions, such as eigendecomposition or singular value decomposition (SVD).
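
A minimal NumPy sketch showing that the two routes agree on centered data. The random data is assumed for illustration; this is a demonstration of the relationship, not how any particular library implements PCA:

```python
import numpy as np

rng = np.random.default_rng(42)
# Correlated synthetic data (assumed for illustration), then mean-centered.
X = rng.normal(size=(200, 4)) @ rng.normal(size=(4, 4))
Xc = X - X.mean(axis=0)
n = Xc.shape[0]

# Route 1: eigendecomposition of the sample covariance matrix.
cov = Xc.T @ Xc / (n - 1)
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]          # sort by descending variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Route 2: SVD of the centered data matrix.
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
svd_vars = S ** 2 / (n - 1)                # singular values -> variances

# Same variances; same directions up to the sign of each component.
same_variances = np.allclose(eigvals, svd_vars)
same_directions = np.allclose(np.abs(eigvecs), np.abs(Vt.T), atol=1e-6)
print(same_variances, same_directions)
```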

9
Q

DOES SVD USE TECHNIQUES FROM THE FIELD OF LINEAR ALGEBRA? P378

A

Yes.

10
Q

WHAT IS SPARSE DATA? P378

A

Sparse data refers to rows of data where many of the values are zero.

11
Q

WHAT ARE SOME EXAMPLES OF SPARSE DATA APPROPRIATE FOR APPLYING SVD FOR DIMENSIONALITY REDUCTION? P378

A
  • Recommender systems
  • User-song listen counts
  • Bag of words counts
  • TF-IDF
  • One hot encoding
  • Text classification
  • User-movie ratings
  • Customer-product purchases
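
As a quick illustration of why data like this is sparse, the sketch below builds bag-of-words counts for four made-up sentences (the documents are assumptions for illustration) and measures the fraction of zero entries:

```python
from scipy.sparse import issparse
from sklearn.feature_extraction.text import CountVectorizer

# Tiny made-up corpus (assumed for illustration).
docs = [
    "the cat sat on the mat",
    "dogs chase cats",
    "stock markets fell sharply today",
    "investors sold shares as markets fell",
]

# One row per document, one column per vocabulary word.
counts = CountVectorizer().fit_transform(docs)
n_rows, n_cols = counts.shape

# Each document uses only a few of the vocabulary words, so most cells are 0.
sparsity = 1.0 - counts.nnz / (n_rows * n_cols)
print(issparse(counts), counts.shape, "sparsity: %.2f" % sparsity)
```

With a realistic vocabulary of tens of thousands of words, the proportion of zeros is far higher still.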
12
Q

WHAT IS THE SCIKIT-LEARN CLASS FOR SVD? P378

A

TruncatedSVD
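
A minimal usage sketch on a randomly generated sparse matrix. The matrix size, density, seed, and n_components=10 are assumptions chosen for illustration:

```python
from scipy.sparse import random as sparse_random
from sklearn.decomposition import TruncatedSVD

# Random sparse matrix (assumed example): 100 rows, 50 columns, 5% non-zero.
X = sparse_random(100, 50, density=0.05, format="csr", random_state=3)

# Project onto 10 components; the sparse input is used as-is, not densified.
svd = TruncatedSVD(n_components=10, random_state=3)
X_reduced = svd.fit_transform(X)

print(X_reduced.shape)
print("Explained variance kept: %.3f" % svd.explained_variance_ratio_.sum())
```

Unlike PCA, TruncatedSVD does not center the data first, which is what lets it work directly on SciPy sparse matrices.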
