Evaluate Classifiers Flashcards

(26 cards)

1
Q

Why would you need to evaluate classifiers?

A

To help choose the optimal method and parameters

2
Q

What are 3 causes of overfitting?

A
  1. Too many variables
  2. Excessive model complexity
  3. Data leakage
3
Q

What is the consequence of launching an overfit model?

A

The deployed model will not generalize

4
Q

What is the formula for accuracy rate?

A

Accuracy rate = (# of correct classifications) / (# of records in dataset)
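A minimal sketch of this calculation in Python, using made-up labels (y_true and y_pred are assumptions, not from the cards):

  import numpy as np

  # Made-up true and predicted labels for 10 records
  y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0, 1, 0])
  y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0, 1, 0])

  # Accuracy rate = (# of correct classifications) / (# of records in dataset)
  accuracy = np.mean(y_true == y_pred)   # 8 correct out of 10 -> 0.8
  print(accuracy)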

5
Q

When is accuracy alone not a sufficient metric?

A

For imbalanced classification problems
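As a hedged illustration with made-up numbers: on a heavily imbalanced dataset, a model that always predicts the majority class still scores high accuracy while catching no positives.

  import numpy as np

  # Assumed imbalanced dataset: 990 negatives, 10 positives
  y_true = np.array([0] * 990 + [1] * 10)
  y_pred = np.zeros_like(y_true)              # naive model: always predict negative

  accuracy = np.mean(y_true == y_pred)        # 0.99 despite missing every positive
  recall = np.mean(y_pred[y_true == 1] == 1)  # 0.0, so accuracy alone misleads here
  print(accuracy, recall)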

6
Q

In terms of the confusion matrix, what is the formula for accuracy?

A

Accuracy = (TP + TN) / (P + N), where P + N is the total number of records

7
Q

What is the 2-step process for using a cutoff value in classification?

A
  1. Compute the probability of belonging to the positive class
  2. Compare to the cutoff value and classify (see the sketch below)
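A minimal sketch of the two steps, using scikit-learn's LogisticRegression on toy data (the data and the 0.5 cutoff are assumptions for illustration):

  import numpy as np
  from sklearn.linear_model import LogisticRegression

  # Toy training data purely for illustration
  X = np.array([[0.1], [0.35], [0.4], [0.6], [0.8], [0.9]])
  y = np.array([0, 0, 0, 1, 1, 1])
  clf = LogisticRegression().fit(X, y)

  # Step 1: compute the probability of belonging to the positive class
  proba_positive = clf.predict_proba(X)[:, 1]

  # Step 2: compare to the cutoff value and classify
  cutoff = 0.5                                    # threshold can be tuned for the task
  y_pred = (proba_positive >= cutoff).astype(int)
  print(y_pred)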
8
Q

In terms of the confusion matrix, what is the formula for precision?

A

Precision = (TP) / (TP + FP)

9
Q

In terms of the confusion matrix, what is the formula for the False Discovery Rate?

A

FDR = 1 - Precision = (FP) / (TP + FP)

10
Q

In terms of the confusion matrix, what is the formula for the False Omission Rate?

A

FOR = (FN) / (TN + FN)

11
Q

In terms of the confusion matrix, what is the formula for Recall?

A

Recall = (TP) / (TP + FN)

12
Q

In terms of the confusion matrix, what is the formula for the False Negative Rate?

A

False Negative Rate = 1 - Recall = (FN) / (TP + FN)

13
Q

In terms of the confusion matrix, what is the formula for the False Positive Rate?

A

FPR = (FP) / (FP + TN)
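A minimal sketch tying together the confusion-matrix formulas from the cards above, using made-up counts (TP, FP, FN, TN are assumptions):

  # Made-up confusion-matrix counts for illustration
  TP, FP, FN, TN = 40, 10, 20, 30

  accuracy  = (TP + TN) / (TP + TN + FP + FN)   # (TP + TN) / (P + N)
  precision = TP / (TP + FP)
  fdr       = 1 - precision                     # False Discovery Rate
  FOR       = FN / (TN + FN)                    # False Omission Rate
  recall    = TP / (TP + FN)
  fnr       = 1 - recall                        # False Negative Rate
  fpr       = FP / (FP + TN)                    # False Positive Rate
  print(accuracy, precision, fdr, FOR, recall, fnr, fpr)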

14
Q

What does accuracy measure overall?

A

Correctness

15
Q

What type of dataset is accuracy good to use with?

A

Balanced datasets

16
Q

When is accuracy misleading?

A

When working with imbalanced classes

17
Q

What does precision focus on?

A

Positive predictions

18
Q

When false positives are costly, what is a good metric to use?

A

Precision

19
Q

What does the ROC Curve & AUC evaluate?

A

Model performance at different thresholds

20
Q

What does a higher AUC mean?

A

Better discrimination between classes

21
Q

What do the ROC curve and AUC help find?

A

The optimal balance between the true positive and false positive rates
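A minimal sketch, with assumed labels and scores, of sweeping thresholds with scikit-learn's roc_curve and summarizing discrimination with roc_auc_score; picking the threshold that maximizes TPR - FPR (Youden's J) is one common way to balance true and false positives:

  import numpy as np
  from sklearn.metrics import roc_curve, roc_auc_score

  # Made-up true labels and predicted probabilities
  y_true  = np.array([0, 0, 1, 1, 0, 1, 0, 1])
  y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.6, 0.9])

  # FPR and TPR at every threshold; AUC summarizes discrimination in one number
  fpr, tpr, thresholds = roc_curve(y_true, y_score)
  auc = roc_auc_score(y_true, y_score)

  # One balanced operating point: the threshold maximizing TPR - FPR
  best = np.argmax(tpr - fpr)
  print(auc, thresholds[best])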

22
Q

What do the Lift & Gain charts help assess?

A

How well the model ranks and prioritizes high-value cases

23
Q

What does the lift chart show?

A

Improvement over random selection

24
Q

What does the gains chart help visualize?

A

How well the model captures true positives early
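A minimal sketch, with assumed labels and scores, of the cumulative gains and lift calculation: rank cases from highest to lowest predicted probability, then track what fraction of the true positives has been captured at each depth and how that compares to random selection:

  import numpy as np

  # Made-up true labels and model scores for illustration
  y_true  = np.array([1, 0, 1, 1, 0, 0, 1, 0, 0, 0])
  y_score = np.array([0.9, 0.8, 0.75, 0.6, 0.55, 0.5, 0.4, 0.3, 0.2, 0.1])

  # Rank records from highest to lowest score, then accumulate positives
  order = np.argsort(-y_score)
  cum_positives = np.cumsum(y_true[order])

  depth = np.arange(1, len(y_true) + 1) / len(y_true)  # fraction of cases targeted
  gain  = cum_positives / y_true.sum()                  # fraction of positives captured
  lift  = gain / depth                                  # improvement over random selection
  print(np.round(gain, 2))
  print(np.round(lift, 2))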

25
Q

Which evaluation metric is good for fraud detection and spam filtering?

A

Precision

26
Q

Which evaluation metric is good for medical diagnosis and security alerts?

A

Recall