Week 7 - ChatGPT Flashcards

Question 1

Q

What is the difference between feature selection and feature extraction?

Answer

A

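Feature selection keeps a subset of the existing features unchanged, while feature extraction creates new features by transforming or combining the original ones. Both aim to give the downstream model (for example, a classifier) a more useful representation.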
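
A minimal sketch of the distinction, assuming NumPy and a small made-up data matrix. The matrix `W` is an arbitrary illustrative transform, not a learned one.

```python
import numpy as np

# Toy data: 4 samples, 3 original features (values are made up).
X = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 9.0],
              [2.0, 0.0, 1.0]])

# Feature selection: keep original columns 0 and 2 exactly as they are.
keep = [0, 2]
X_selected = X[:, keep]

# Feature extraction: build new features as combinations of the originals.
# W is a hypothetical transform chosen by hand for illustration.
W = np.array([[0.5,  1.0],
              [0.5, -1.0],
              [0.0,  0.0]])
X_extracted = X @ W

print(X_selected.shape, X_extracted.shape)
```

The selected columns are still directly interpretable as original measurements; the extracted columns are new quantities (here the mean and the difference of the first two features).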

Question 2

Q

What is the goal of Principal Component Analysis (PCA)?

Answer

A

To reduce dimensionality by projecting data onto directions (principal components) that maximize variance.
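
One common way to compute this is an eigendecomposition of the covariance matrix. The sketch below assumes NumPy and synthetic 2-D data stretched along the diagonal; the data and variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: large spread along one axis, small along the other,
# then rotated by 45 degrees so the main spread lies on the diagonal.
z = rng.normal(size=(500, 2)) * np.array([3.0, 0.5])
theta = np.pi / 4
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
X = z @ R.T

# PCA: center, compute covariance, take eigenvectors sorted by eigenvalue.
Xc = X - X.mean(axis=0)
cov = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)      # eigh returns ascending order
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

pc1 = eigvecs[:, 0]                 # direction of maximal variance
X_reduced = Xc @ pc1                # 2-D data projected down to 1-D
explained = eigvals[0] / eigvals.sum()

print("first principal component:", pc1)
print("fraction of variance kept:", explained)
```

The variance of the projected data equals the largest eigenvalue, which is what "maximizing variance" means here.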

Question 3

Q

Is PCA a supervised or unsupervised method?

Answer

A

Unsupervised: it does not use class labels, only the variance structure of the data.

Question 4

Q

What does Linear Discriminant Analysis (LDA) optimize?

Answer

A

It maximizes the ratio of between-class variance to within-class variance to achieve better class separation.
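
Written out, this ratio is usually called the Fisher criterion. The notation below is the standard textbook form rather than anything specific to these cards: $\mathbf{w}$ is the projection direction, $\boldsymbol{\mu}_c$ and $N_c$ are the mean and size of class $c$, and $\boldsymbol{\mu}$ is the overall mean.

```latex
J(\mathbf{w}) = \frac{\mathbf{w}^\top S_B \, \mathbf{w}}{\mathbf{w}^\top S_W \, \mathbf{w}},
\qquad
S_B = \sum_{c} N_c \, (\boldsymbol{\mu}_c - \boldsymbol{\mu})(\boldsymbol{\mu}_c - \boldsymbol{\mu})^\top,
\qquad
S_W = \sum_{c} \sum_{i \in c} (\mathbf{x}_i - \boldsymbol{\mu}_c)(\mathbf{x}_i - \boldsymbol{\mu}_c)^\top
```

For two classes the maximizer has a closed form, $\mathbf{w} \propto S_W^{-1}(\boldsymbol{\mu}_1 - \boldsymbol{\mu}_0)$.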

Question 5

Q

How is LDA different from PCA?

Answer

A

LDA is supervised and focuses on class separation, while PCA is unsupervised and focuses on variance.
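
The difference shows up clearly when the direction of largest variance is not the direction that separates the classes. The sketch below assumes NumPy, two synthetic Gaussian classes, and the two-class closed form for the LDA direction; all names and numbers are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 300

# Two classes: wide spread along x (uninformative), separated along y.
scale = np.array([3.0, 0.3])
X0 = rng.normal(size=(n, 2)) * scale + np.array([0.0, -1.0])
X1 = rng.normal(size=(n, 2)) * scale + np.array([0.0,  1.0])
X = np.vstack([X0, X1])

# PCA ignores labels: top eigenvector of the pooled covariance.
Xc = X - X.mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
w_pca = eigvecs[:, -1]              # eigh sorts ascending, so last is largest

# Two-class LDA uses labels: w is proportional to Sw^{-1} (m1 - m0).
m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
Sw = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
w_lda = np.linalg.solve(Sw, m1 - m0)
w_lda /= np.linalg.norm(w_lda)

def fisher_ratio(w):
    """Between-class over within-class variance of the 1-D projection."""
    p0, p1 = X0 @ w, X1 @ w
    return (p1.mean() - p0.mean()) ** 2 / (p0.var() + p1.var())

print("PCA direction:", w_pca, "Fisher ratio:", fisher_ratio(w_pca))
print("LDA direction:", w_lda, "Fisher ratio:", fisher_ratio(w_lda))
```

In this setup PCA picks roughly the x-axis (most variance) while LDA picks roughly the y-axis (best separation).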

Question 6

Q

What does Independent Component Analysis (ICA) aim to find?

Answer

A

A representation of data where components are statistically independent, not just uncorrelated like PCA.

Question 7

Q

When does ICA give similar results to PCA?

Answer

A

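When the data is Gaussian. For jointly Gaussian variables, uncorrelated implies independent, so the decorrelated components PCA finds are already independent and ICA has no further non-Gaussian structure to exploit (any rotation of whitened Gaussian data is equally valid).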
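
A small numerical illustration of why Gaussian data gives ICA nothing to work with. This is not an ICA implementation; it only evaluates excess kurtosis, one common non-Gaussianity measure, along two projection directions. It assumes NumPy and synthetic unit-variance sources.

```python
import numpy as np

def excess_kurtosis(y):
    """Sample excess kurtosis; approximately 0 for Gaussian data."""
    y = (y - y.mean()) / y.std()
    return np.mean(y ** 4) - 3.0

def project(S, angle):
    """Project 2-D samples onto the unit vector at the given angle."""
    w = np.array([np.cos(angle), np.sin(angle)])
    return S @ w

rng = np.random.default_rng(0)
n = 100_000
sources = {
    # Independent, unit-variance, non-Gaussian sources.
    "uniform": rng.uniform(-np.sqrt(3), np.sqrt(3), size=(n, 2)),
    # Independent, unit-variance Gaussian sources.
    "gaussian": rng.normal(size=(n, 2)),
}

results = {}
for name, S in sources.items():
    k0 = excess_kurtosis(project(S, 0.0))          # along a true source axis
    k45 = excess_kurtosis(project(S, np.pi / 4))   # along a 50/50 mixture
    results[name] = (k0, k45)
    print(f"{name:8s}  0 deg: {k0:+.2f}   45 deg: {k45:+.2f}")
```

For the uniform sources the measure changes with direction, so an ICA algorithm can find the source axes. For the Gaussian sources it stays near zero in every direction, so no rotation is preferred over the decorrelated axes PCA already provides.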

Question 8

Q

What is the idea behind random projections for feature extraction?

Answer

A

To map data into a higher-dimensional space using random weights and a nonlinearity, where it may become linearly separable.
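
A minimal sketch on the XOR problem, assuming NumPy, a `tanh` nonlinearity, 20 random hidden features, and a least-squares readout as a simple stand-in for a linear classifier. These choices are illustrative, not prescribed by the card.

```python
import numpy as np

# XOR: not linearly separable in the original 2-D space.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([-1.0, 1.0, 1.0, -1.0])

def linear_readout(F, y):
    """Fit a linear model (with bias) by least squares; return predictions."""
    F1 = np.hstack([F, np.ones((F.shape[0], 1))])
    w, *_ = np.linalg.lstsq(F1, y, rcond=None)
    return F1 @ w

# Linear model on the raw inputs.
pred_raw = linear_readout(X, y)

# Random nonlinear features: fixed random weights, never trained.
rng = np.random.default_rng(0)
n_hidden = 20
W = rng.normal(size=(2, n_hidden))
b = rng.normal(size=n_hidden)
H = np.tanh(X @ W + b)              # 2-D inputs mapped to 20-D features
pred_rand = linear_readout(H, y)

acc_raw = np.mean(np.sign(pred_raw) == y)
acc_rand = np.mean(np.sign(pred_rand) == y)
print("accuracy on raw inputs:     ", acc_raw)
print("accuracy on random features:", acc_rand)
```

Only the final linear readout is fitted; the random weights `W` and `b` stay fixed.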

Question 9

Q

What is the main idea of sparse coding?

Answer

A

To represent input data using a small number of active features from a larger dictionary, encouraging interpretability and efficiency.
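
One way to compute such a code for a fixed dictionary is to minimize reconstruction error plus an L1 penalty. The sketch below assumes NumPy, a random dictionary with unit-norm atoms, and ISTA (iterative soft-thresholding) as the solver; the dictionary, penalty weight `lam`, and iteration count are all illustrative choices, and learning the dictionary itself is not shown.

```python
import numpy as np

def soft_threshold(v, t):
    """Shrink each entry toward zero by t; entries within t become exactly 0."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(D, x, lam, n_iter=2000):
    """Minimize 0.5 * ||x - D a||^2 + lam * ||a||_1 over a by ISTA."""
    L = np.linalg.norm(D, 2) ** 2       # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)
        a = soft_threshold(a - grad / L, lam / L)
    return a

rng = np.random.default_rng(0)
n_dims, n_atoms = 16, 32                # overcomplete: more atoms than dims

# Hypothetical dictionary: random atoms normalized to unit length.
D = rng.normal(size=(n_dims, n_atoms))
D /= np.linalg.norm(D, axis=0)

# Build a signal that truly uses only two atoms.
a_true = np.zeros(n_atoms)
a_true[[3, 17]] = [1.5, -2.0]
x = D @ a_true

a = ista(D, x, lam=0.05)
rel_err = np.linalg.norm(x - D @ a) / np.linalg.norm(x)
n_active = np.count_nonzero(a)
print("active atoms:", n_active, "of", n_atoms)
print("relative reconstruction error:", rel_err)
```

The L1 penalty is what drives most coefficients to exactly zero; a larger `lam` gives a sparser code at the cost of a larger reconstruction error.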

Question 10

Q

How is sparse coding different from PCA or ICA?

Answer

A

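Sparse coding encourages sparsity in the feature vector (only a few nonzero coefficients, often over a dictionary with more atoms than input dimensions), while PCA looks for orthogonal directions of maximal variance and ICA looks for statistically independent components.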

Question 11

Q

Why might you use feature extraction instead of manual feature engineering?

Answer

A

To automatically discover transformations that improve model performance without requiring domain-specific knowledge.

(11 cards)