Classification Flashcards

week 7

1
Q

classification probabilities

A

P(class | features): we are trying to estimate the probability that an observation belongs to a particular class, given the values of its features.

2
Q

Approaches to Classification

A

Generative classifiers:
Generative classifiers model how the data are generated, describing the features and the classes together.
They obtain P(class|predictors) indirectly, by first estimating other distributions (the class priors and the class-conditional feature distributions).
They rely on statistical theory, in particular Bayes' theorem.

Discriminative classifiers:
Discriminative classifiers focus on predicting the class directly based on the observed features, without necessarily understanding the underlying data generation process.
Estimate P(class|predictors) directly.
Also referred to as conditional classifiers.
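To make the contrast concrete, here is a minimal sketch (the tiny one-feature dataset is made up for illustration) of how a generative classifier builds P(class | x) from pieces it estimates first, where a discriminative classifier would instead fit P(class | x) directly:

```python
from collections import Counter

# Made-up training data: one categorical feature per observation.
features = ["red", "red", "blue", "blue", "blue", "red"]
labels   = ["A",   "A",   "B",    "B",    "A",    "B"]

n = len(labels)

# Generative route: estimate the pieces, then combine via Bayes' theorem.
prior = {c: count / n for c, count in Counter(labels).items()}   # P(class)
likelihood = {}                                                  # P(x | class)
for c in set(labels):
    xs_in_c = [x for x, y in zip(features, labels) if y == c]
    counts = Counter(xs_in_c)
    likelihood[c] = {x: counts[x] / len(xs_in_c) for x in set(features)}

def posterior(x):
    # P(class | x) is proportional to P(x | class) * P(class).
    unnorm = {c: likelihood[c][x] * prior[c] for c in prior}
    total = sum(unnorm.values())
    return {c: p / total for c, p in unnorm.items()}

print(posterior("red"))
# A discriminative classifier (e.g. logistic regression) would skip the
# prior/likelihood step and estimate P(class | x) directly from the data.
```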

3
Q

prior probability for a class

A

The “prior probability for a class” refers to the probability of a particular class occurring before considering any evidence or features. It represents our initial belief or assumption about the likelihood of each class before we observe any data.
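In practice the prior is often estimated from the class frequencies in the training data. A minimal sketch (the label list is made up for illustration):

```python
from collections import Counter

labels = ["spam", "ham", "ham", "spam", "ham"]  # hypothetical training labels

# Prior for each class = its relative frequency in the training labels.
counts = Counter(labels)
priors = {cls: c / len(labels) for cls, c in counts.items()}
print(priors)  # {'spam': 0.4, 'ham': 0.6}
```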

4
Q

Bayes’ Theorem

A

Bayes' Theorem is used to determine the probability of a class given a set of features (a feature vector).
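In the classification setting, writing pi_j for the prior of class j and f_j(x) for the class-conditional distribution of the feature vector x (the notation used in the LDA card below), the theorem takes the form:

```latex
P(j \mid \mathbf{x}) = \frac{\pi_j \, f_j(\mathbf{x})}{\sum_{k} \pi_k \, f_k(\mathbf{x})}
```

The denominator sums over all classes k, so the posteriors across classes add up to one.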

5
Q

Posterior Probability for a Class:

A

P(j|x) represents the probability of class j given a feature vector x. This is the quantity we want to find: the updated probability of each class after we have observed the features.
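A small numeric sketch of the update (the prior and likelihood values are made up): given priors and the class-conditional likelihoods f_j(x) evaluated at the observed x, the posterior is their product, renormalised over the classes.

```python
# Hypothetical priors and likelihoods f_j(x) for one observed feature vector x.
priors = {"class_1": 0.7, "class_2": 0.3}
likelihoods = {"class_1": 0.2, "class_2": 0.6}  # f_j(x) evaluated at x

unnorm = {j: priors[j] * likelihoods[j] for j in priors}
total = sum(unnorm.values())
posteriors = {j: p / total for j, p in unnorm.items()}
print(posteriors)  # class_1: 0.4375, class_2: 0.5625
```

Note that the large likelihood under class_2 outweighs its smaller prior, flipping the most probable class after observing x.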

6
Q

misclassification rate

A

The performance of a classifier is usually measured by its misclassification rate. The misclassification rate is the proportion of observations assigned to the wrong class.
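A minimal sketch of computing the misclassification rate from true and predicted labels (the label lists are made up):

```python
true_labels = ["A", "B", "A", "A", "B", "B"]
predicted   = ["A", "B", "B", "A", "A", "B"]

# Proportion of observations assigned to the wrong class.
errors = sum(t != p for t, p in zip(true_labels, predicted))
misclassification_rate = errors / len(true_labels)
print(misclassification_rate)  # 2 errors out of 6 observations
```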

7
Q

Linear Discriminant Analysis

A

Linear Discriminant Analysis
- Linear Discriminant Analysis is often abbreviated to LDA.
- LDA is applicable when all the features are quantitative.
- We assume that fj, the distribution of the features within class j, is a (joint) normal probability distribution.
- In addition, we assume that the covariance matrix is the same from class to class.
- The classes are differentiated by the locations of their means.
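Under these assumptions (shared covariance, classes differing only in their means) the discriminant score for class j is linear in x: delta_j(x) = x^T S^{-1} mu_j - (1/2) mu_j^T S^{-1} mu_j + log(pi_j), and we assign x to the class with the largest score. A minimal numpy sketch on made-up 2-D data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up training data: two classes with the same covariance, different means.
X0 = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(50, 2))
X1 = rng.normal(loc=[3.0, 3.0], scale=1.0, size=(50, 2))
X = np.vstack([X0, X1])
y = np.array([0] * 50 + [1] * 50)

# Estimate class means, the pooled (shared) covariance, and the priors.
means = {j: X[y == j].mean(axis=0) for j in (0, 1)}
pooled = sum((X[y == j] - means[j]).T @ (X[y == j] - means[j]) for j in (0, 1))
cov_inv = np.linalg.inv(pooled / (len(X) - 2))
priors = {j: np.mean(y == j) for j in (0, 1)}

def discriminant(x, j):
    # Linear score: x' S^-1 mu_j - 0.5 mu_j' S^-1 mu_j + log pi_j
    mu = means[j]
    return x @ cov_inv @ mu - 0.5 * mu @ cov_inv @ mu + np.log(priors[j])

def predict(x):
    return max((0, 1), key=lambda j: discriminant(x, j))

print(predict(np.array([0.5, 0.5])), predict(np.array([2.5, 3.0])))
```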

8
Q

Kernel Discriminant Analysis (KDA):

A

KDA extends LDA by allowing for more complex, nonlinear decision boundaries between classes. It achieves this by mapping the data into a higher-dimensional space using a kernel function. KDA essentially “lifts” the data into a higher-dimensional space where the classes might be more easily separable and then applies LDA in this new space.
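KDA works with kernels implicitly, but the "lifting" idea can be illustrated with an explicit feature map on a made-up 1-D example: the two classes below cannot be separated by any single threshold on the original axis, yet after mapping x to (x, x^2) a single threshold on the second coordinate separates them.

```python
# Made-up 1-D data: class "inner" sits near 0, class "outer" far from 0.
inner = [-0.5, -0.2, 0.1, 0.4]   # no threshold on x separates these
outer = [-2.0, -1.5, 1.6, 2.2]   # from these: "outer" lies on both sides

def lift(x):
    # Explicit nonlinear feature map; kernel methods do this implicitly.
    return (x, x ** 2)

# In the lifted space the second coordinate alone separates the classes:
# every "inner" point has x^2 < 1, every "outer" point has x^2 > 1.
print([lift(x)[1] < 1.0 for x in inner])  # all True
print([lift(x)[1] < 1.0 for x in outer])  # all False
```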
