L12 - Dimensionality Reduction Flashcards

1
Q

What is meant by dimensionality reduction?

A

The reduction of feature count within a data set.

2
Q

In what 3 ways does dimensionality reduction improve the creation and running of ML models?

A

Saves time, saves money, and removes irrelevant data.

3
Q

What are the 2 methods of dimensionality reduction? Define each…

A
  1. Feature extraction -> Extract useful combinations of features from the data.
  2. Feature selection -> Analyse all features of the data to establish the relevant ones.
4
Q

What are the 3 methods for feature selection?

A
  1. Filter Methods
  2. Wrapper Methods
  3. Embedded Methods
5
Q

Explain the Filter Method…

A
  • Method of feature selection for dimensionality reduction (see the sketch below)
    1 -> Bring features to the same scale through normalisation (standardisation would force every variance to 1, defeating the threshold)
    2 -> Choose a variance threshold
    3 -> Calculate the variance of each feature, dropping those that fall below the threshold
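A minimal sketch of this filter, assuming scikit-learn and a placeholder random dataset; the 0.01 threshold is an arbitrary illustrative choice:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from sklearn.feature_selection import VarianceThreshold

# Placeholder dataset: 100 samples, 10 features (illustrative only)
X = np.random.rand(100, 10)

# 1 -> Bring features to the same scale (min-max normalisation; z-score
#      standardisation would give every feature variance 1)
X_scaled = MinMaxScaler().fit_transform(X)

# 2 -> Choose a variance threshold (0.01 is an arbitrary choice here)
# 3 -> Drop every feature whose variance falls below it
selector = VarianceThreshold(threshold=0.01)
X_reduced = selector.fit_transform(X_scaled)
print(X_reduced.shape)  # fewer columns if any feature was near-constant
```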
6
Q

Explain the Wrapper Method…

A
  1. Method of feature selection for dimensionality reduction
  2. Can be conducted via Forward Search (FS) or Recursive Feature Elimination (RFE)
  3. Both FS and RFE run a "battle royale": many candidate models are trained and compared to establish the best features
  4. Both stop once a model containing the desired number of best features remains
7
Q

Explain how Forward Search works in Wrapper Method of Feature Selection…

A
  1. Create N models with 1 feature each
  2. Find the best feature, e.g. Feature 3
  3. Create N-1 models, each with the previous best feature (F3) plus one other feature, e.g. Model1(F3, F1), Model2(F3, F2), etc.
  4. Repeat until we have models with the desired number of features and can choose the best one (see the sketch below)
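A sketch of forward search using scikit-learn's SequentialFeatureSelector; the iris dataset, logistic-regression estimator, and choice of 3 features are all illustrative assumptions:

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# Greedily add the feature whose inclusion gives the best cross-validated
# score, starting from the empty set, until 3 features are selected
selector = SequentialFeatureSelector(
    LogisticRegression(max_iter=1000),
    n_features_to_select=3,
    direction="forward",
)
selector.fit(X, y)
print(selector.get_support())  # boolean mask over the original features
```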
8
Q

Explain how Recursive Feature Elimination works in Wrapper Method of Feature Selection…

A
  1. The reverse of Forward Search
  2. Start with N models, each with N-1 features (each model drops a different feature)
  3. Repeatedly remove the single worst feature from each model
  4. Results in a final best model with M features (see the sketch below)
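A sketch using scikit-learn's RFE class; the dataset, estimator, and M = 2 surviving features are illustrative assumptions:

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# Fit on all features, repeatedly discard the one the model ranks as least
# important, and stop when only n_features_to_select remain
selector = RFE(LogisticRegression(max_iter=1000), n_features_to_select=2)
selector.fit(X, y)
print(selector.support_)  # True for the M surviving features
print(selector.ranking_)  # 1 = kept; larger ranks were eliminated earlier
```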
9
Q

Explain the Embedded Method

A
  1. Method of feature selection for dimensionality reduction
  2. Use decision trees to establish the best features
  3. Then use random forests to aggregate the results of the decision trees (see the sketch below)
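A minimal sketch using random-forest feature importances, assuming scikit-learn; the dataset and the 0.1 importance cut-off are illustrative:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)

# Each tree's splits score the features; the forest aggregates those scores
# into a single impurity-based importance per feature
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X, y)

# Keep only features above an importance cut-off (0.1 is arbitrary here)
keep = forest.feature_importances_ > 0.1
X_reduced = X[:, keep]
print(forest.feature_importances_, X_reduced.shape)
```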
10
Q

What is a Random Forest?

A

An aggregation of decision trees

11
Q

What are the 2 types of methods for Feature Extraction?

A
  1. Linear
  2. Non-Linear
12
Q

What is the main Linear method for feature extraction? Explain it…

A
  1. Principal Component Analysis (PCA)
  2. Find an orthogonal coordinate transformation such that each new coordinate captures as much of the remaining variance as possible
  3. This creates N new variables, named Principal Components
  4. Principal Components are linear combinations of the original coordinates
  5. The orthogonal coordinate with the most variation is the most informative (see the sketch below)
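A minimal sketch with scikit-learn's PCA; the iris dataset and the choice of 2 components are illustrative assumptions:

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)

pca = PCA(n_components=2)      # keep the 2 most informative components
X_new = pca.fit_transform(X)   # data re-expressed in PC coordinates

print(pca.components_)                # each PC as a linear combination of the original features
print(pca.explained_variance_ratio_)  # the first PC captures the most variation
```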
13
Q

What is the worst case scenario of PCA?

A
  1. When all variables are equally important but uncorrelated: the variance is then spread evenly across all principal components.
  2. No component can be dropped without losing information, so PCA gives us no reduction.
14
Q

What are the steps of PCA?

A
  1. Generate the covariance matrix from the (centred) dataset X
  2. Diagonalise the covariance matrix to obtain its eigenvalues and its eigenvector matrix V
  3. Multiply XV to express the data in principal-component coordinates
  4. Take the first K principal components, i.e. those with the largest eigenvalues.
  5. This gives us a K-dimensional representation of the data, having extracted the K most important features.
  6. The dimensionality reduction comes from discarding the least important principal components (see the sketch below)
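A sketch of these steps in plain NumPy, assuming a placeholder Gaussian dataset and K = 2; a production implementation would typically use an SVD instead:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))      # placeholder data: 100 samples, 5 features
X = X - X.mean(axis=0)             # centre the data first

# 1. Generate the covariance matrix
C = np.cov(X, rowvar=False)

# 2. Diagonalise it (eigh suits symmetric matrices; sort eigenvalues descending)
eigvals, V = np.linalg.eigh(C)
order = np.argsort(eigvals)[::-1]
eigvals, V = eigvals[order], V[:, order]

# 3. Multiply XV to rotate the data into principal-component coordinates
scores = X @ V

# 4./5. Take the first K principal components (largest eigenvalues)
K = 2
X_reduced = scores[:, :K]
print(X_reduced.shape)             # (100, 2): a K-dimensional representation
print(eigvals / eigvals.sum())     # fraction of variance captured by each PC
```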
15
Q

What do the Eigenvalues represent in PCA?

A

The variance captured by each principal component.

16
Q

What are the input and output of PCA?

A

Input -> High-D data
Output -> Low-D data

17
Q

What are the 2 main Non-linear methods for feature extraction?

A
  1. t-SNE
  2. UMAP
18
Q

Explain t-SNE

A

A non-linear method for feature extraction

1 - Calculate the distribution of distances across the N points and call this D

2 - Scatter N points randomly in 2 or 3 dimensions

3 - Move the N points around until their distance distribution resembles D (see the sketch below)
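A sketch using scikit-learn's t-SNE; the digits dataset and the perplexity value are illustrative assumptions:

```python
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

X, _ = load_digits(return_X_y=True)  # N points in 64 dimensions

# Embed into 2 dimensions; perplexity sets the neighbourhood size used when
# matching the low-dimensional distance distribution to the original D
X_2d = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)
print(X_2d.shape)  # (1797, 2)
```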

19
Q

What are some issues with t-SNE?

A
  • Distances between faraway points are meaningless
  • Poor scaling due to high memory usage
20
Q

Explain UMAP

A
  1. A non-linear method for feature extraction
  2. Runs faster and uses less memory than t-SNE (see the sketch below)
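A sketch assuming the third-party umap-learn package is installed; the dataset and hyperparameter values are illustrative:

```python
import umap  # third-party package: umap-learn
from sklearn.datasets import load_digits

X, _ = load_digits(return_X_y=True)

# Same input/output shape as t-SNE, but typically faster and lighter on memory
reducer = umap.UMAP(n_components=2, n_neighbors=15, random_state=0)
X_2d = reducer.fit_transform(X)
print(X_2d.shape)  # (1797, 2)
```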
21
Q

What are some issues with both t-SNE and UMAP?

A
  1. Hyperparameter dependence
  2. Cluster sizes and distances between clusters mean nothing
  3. The X and Y axes are almost impossible to interpret