Evaluation Flashcards

1
Q

What is the formal definition of overfitting?

A

A predictor F is overfit if we can find another predictor F’ where:

  • Etrain(F’) > Etrain(F)
  • Egen(F’) < Egen(F)
2
Q

What is the formal definition of underfitting?

A

A predictor F is underfit if we can find another predictor F’ with both smaller Etrain and smaller Egen

3
Q

How is Etrain (training error) computed?

A
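
One standard way to write it, offered as a sketch under assumptions rather than as this deck's official answer: average a per-example error over the n training examples (x_i, y_i). The symbol err is a hypothetical name for that per-example error, for instance 0/1 loss for classification or squared error for regression.

```latex
E_{\mathrm{train}}(F) = \frac{1}{n} \sum_{i=1}^{n} \mathrm{err}\bigl(F(x_i),\, y_i\bigr)
```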
4
Q

How is Egen (generalization error) calculated?

A
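
A common formulation, given here as an assumption rather than the deck's own wording: Egen is the expected per-example error over the distribution P that future examples (x, y) are drawn from. The symbol err is a hypothetical name for the per-example error (0/1 loss, squared error, and so on). Because P is unknown, Egen cannot be computed exactly and can only be estimated.

```latex
E_{\mathrm{gen}}(F) = \mathbb{E}_{(x, y) \sim P}\bigl[\mathrm{err}\bigl(F(x),\, y\bigr)\bigr]
```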
5
Q

How can we estimate Egen?

A

Set aside a test set that is not used for training, and compute Etest on it (the same way as Etrain)

lim Etest = Egen as the size of the test set -> infinity

6
Q

How can you compute the confidence interval for Egen from Etest?

A
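
One widely used approach, sketched here under stated assumptions and not necessarily the answer this card intended, is the normal approximation to the binomial. It assumes Etest is a 0/1 classification error rate measured on n independent test examples, and that n is large enough for the approximation to hold. The function name and the default z = 1.96 (a 95% interval) are illustrative choices.

```python
import math


def error_confidence_interval(e_test, n, z=1.96):
    """Normal-approximation interval for E_gen, given error rate e_test on n test examples."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 <= e_test <= 1.0:
        raise ValueError("e_test must be an error rate in [0, 1]")
    # Standard error of a proportion: sqrt(p * (1 - p) / n)
    half_width = z * math.sqrt(e_test * (1.0 - e_test) / n)
    # Clip to [0, 1] because an error rate cannot leave that range
    return max(0.0, e_test - half_width), min(1.0, e_test + half_width)


# Hypothetical numbers: 10% test error measured on 1000 test examples
low, high = error_confidence_interval(0.10, 1000)
print(f"{low:.4f} .. {high:.4f}")
```

The interval shrinks in proportion to 1 / sqrt(n), which matches the idea that Etest approaches Egen as the test set grows.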
7
Q

What do we use training/validation/testing sets for?

A
  • Training set: construct classifier
  • Validation set: pick algorithm + tune hyperparameters
  • Testing set: estimate future error rate
8
Q

How does cross-validation work?

A
  • Randomly split the data into k sets (folds)
  • For each fold: test on that fold, train on the other k-1
  • Average the error over all k folds
  • Final classifier is trained on all the data
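
The steps above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: `k_fold_error`, `train_majority`, and `zero_one_error` are hypothetical names, the toy "classifier" simply predicts the most frequent training label, and the data set is made up.

```python
import random


def k_fold_error(data, k, train_fn, error_fn, seed=0):
    """Average test error over k folds; data is a list of (x, y) pairs."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)          # random split
    folds = [shuffled[i::k] for i in range(k)]     # k roughly equal parts
    errors = []
    for i in range(k):
        test = folds[i]
        train = [ex for j, fold in enumerate(folds) if j != i for ex in fold]
        model = train_fn(train)                    # train on the k-1 other folds
        errors.append(error_fn(model, test))       # test on the held-out fold
    return sum(errors) / k


def train_majority(train):
    """Toy learner: the 'model' is just the most frequent training label."""
    labels = [y for _, y in train]
    return max(set(labels), key=labels.count)


def zero_one_error(model, test):
    """Fraction of test examples whose label differs from the predicted label."""
    return sum(1 for _, y in test if y != model) / len(test)


# Made-up data: 8 examples of class A, 2 of class B
data = [(i, "A") for i in range(8)] + [(i, "B") for i in range(2)]
cv_error = k_fold_error(data, k=5, train_fn=train_majority, error_fn=zero_one_error)
final_model = train_majority(data)  # the final classifier is trained on all the data
```

The cross-validated error is only an estimate of how the final model will perform; the final model itself is never evaluated inside the loop.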
9
Q

What is leave-one-out?

A

Cross-validation where k = # of training instances, so each fold holds out a single instance

10
Q

What is the problem with leave-one-out validation?

A

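The classes are not balanced between the testing and training sets

Example with n instances split evenly between classes A and B, holding out one A. Testing: { 1 of A, 0 of B } vs training: { n/2 of B, n/2 - 1 of A }

A majority-class predictor would always predict B (the most frequent class in training), but it would always be wrong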
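
This failure case is easy to reproduce. The sketch below assumes a majority-class predictor and a perfectly balanced, made-up label list; `majority_label` and `leave_one_out_error` are hypothetical helper names.

```python
def majority_label(labels):
    """Return the most frequent label in the list."""
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return max(counts, key=counts.get)


def leave_one_out_error(labels):
    """Leave-one-out error of a predictor that always outputs the training majority."""
    wrong = 0
    for i, held_out in enumerate(labels):
        train = labels[:i] + labels[i + 1:]   # everything except instance i
        if majority_label(train) != held_out:
            wrong += 1
    return wrong / len(labels)


# Balanced data: removing one instance makes the *other* class the majority
labels = ["A"] * 5 + ["B"] * 5
loo_error = leave_one_out_error(labels)
print(loo_error)
```

On the full data set a majority-class predictor would be right half the time, so the leave-one-out estimate is badly pessimistic here.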

11
Q

What does stratification do?

A

Keeps class labels balanced across training/testing sets

12
Q

How do you do stratification?

A
  • Split instances by class
  • Split each class into K parts
  • Assemble the ith fold by combining one part from each class
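
A minimal sketch of those three steps, assuming examples are (x, label) pairs; `stratified_folds` is a hypothetical helper name and the data is made up.

```python
import random
from collections import defaultdict


def stratified_folds(examples, k, seed=0):
    """Return k folds whose class proportions roughly match the full data set."""
    rng = random.Random(seed)
    # Step 1: split instances by class
    by_class = defaultdict(list)
    for ex in examples:
        by_class[ex[1]].append(ex)
    folds = [[] for _ in range(k)]
    for members in by_class.values():
        rng.shuffle(members)
        for i in range(k):
            # Step 2: members[i::k] is the ith of k parts of this class
            # Step 3: the ith fold collects the ith part of every class
            folds[i].extend(members[i::k])
    return folds


# Made-up data: 6 examples of class A, 3 of class B, split into 3 folds
examples = [(i, "A") for i in range(6)] + [(i, "B") for i in range(3)]
folds = stratified_folds(examples, k=3)
```

When a class has fewer than k instances, some folds will contain none of it, so the balance is only approximate.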
13
Q

What is a true positive?

A

Classifier predicts positive, and it is positive

14
Q

What is a true negative?

A

Classifier predicts negative, and it is negative

15
Q

What is a false positive?

A

Classifier predicts positive, but it is actually negative

16
Q

What is a false negative?

A

Classifier predicts negative, but it is actually positive

17
Q

What is the definition of classification error?

A
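
The usual definition, stated here as an assumption about what this card intended: the fraction of all predictions that are wrong, written with the TP / TN / FP / FN counts used elsewhere in this deck.

```latex
\text{classification error} = \frac{FP + FN}{TP + TN + FP + FN}
```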
18
Q

What is the definition of accuracy?

A
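
The usual definition, stated here as an assumption about what this card intended: the fraction of all predictions that are correct, which is the complement of the classification error.

```latex
\text{accuracy} = \frac{TP + TN}{TP + TN + FP + FN} = 1 - \text{classification error}
```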
19
Q

What is the problem with classification error / accuracy?

A

Misleading when classes are unbalanced

  • Predict earthquake: unlikely, so always predict no
  • Decide if webpage is relevant: 99.999% are not, so retrieve nothing
20
Q

What is the definition of False Alarm Rate?

A

FP / (FP + TN)

21
Q

What is the definition of Miss rate?

A

FN / (TP + FN)

22
Q

What is the definition of Recall?

A

TP / (TP + FN)

23
Q

What is the definition of Precision?

A

TP / (TP + FP)
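
The four rates defined on the last few cards can be computed together from one confusion matrix. This is a small sketch with made-up counts; `rates` is a hypothetical helper name, and it will raise ZeroDivisionError if a denominator is zero (for example, when there are no actual positives).

```python
def rates(tp, fp, tn, fn):
    """False alarm rate, miss rate, recall and precision from confusion-matrix counts."""
    return {
        "false_alarm_rate": fp / (fp + tn),  # share of actual negatives flagged positive
        "miss_rate": fn / (tp + fn),         # share of actual positives that were missed
        "recall": tp / (tp + fn),            # share of actual positives that were found
        "precision": tp / (tp + fp),         # share of positive predictions that are right
    }


# Hypothetical counts: 10 actual positives, 90 actual negatives
r = rates(tp=8, fp=2, tn=88, fn=2)
for name, value in r.items():
    print(f"{name}: {value:.3f}")
```

Recall and miss rate share the same denominator, so they always sum to 1.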

24
Q

What is the problem with False alarm rate / miss rate / recall / precision?

A

Each one is trivial to push to 100% or 0% on its own (e.g. always predicting positive gives 100% recall), so they must be reported in pairs

25
Q

What evaluation measure would we use for event detection?

A

Cost = C_FP * FP + C_FN * FN, where C_FP and C_FN are the costs of a single false positive and a single false negative

e.g. cost of evacuating with no earthquake vs cost of staying with earthquake
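
A small sketch of how the cost formula can rank two detectors. All counts and per-error costs below are made-up numbers, and `detection_cost` is a hypothetical helper name.

```python
def detection_cost(fp, fn, cost_fp, cost_fn):
    """Total cost = cost per false positive * FP + cost per false negative * FN."""
    return cost_fp * fp + cost_fn * fn


# Assumption for illustration: a missed event is 100 times worse than a false alarm
cautious = detection_cost(fp=20, fn=1, cost_fp=1.0, cost_fn=100.0)  # many false alarms
relaxed = detection_cost(fp=2, fn=5, cost_fp=1.0, cost_fn=100.0)    # few false alarms
print(cautious, relaxed)
```

With these costs the detector that raises more false alarms is still the cheaper one, because misses dominate the total.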

26
Q

What is the definition of F-measure?

A

2 / (1 / Recall + 1 / Precision)

Similar to accuracy but without TN
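
Expanding the harmonic mean in terms of counts gives 2TP / (2TP + FP + FN), which makes the absence of TN explicit. The sketch below checks that the two forms agree on made-up counts; both function names are hypothetical, and both assume TP > 0 (otherwise recall or precision is zero and the harmonic mean divides by zero).

```python
def f_measure(tp, fp, fn):
    """Harmonic mean of recall and precision."""
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    return 2 / (1 / recall + 1 / precision)


def f_measure_from_counts(tp, fp, fn):
    """Algebraically equivalent form: TN never appears."""
    return 2 * tp / (2 * tp + fp + fn)


# Hypothetical counts
a = f_measure(tp=6, fp=2, fn=4)
b = f_measure_from_counts(tp=6, fp=2, fn=4)
print(a, b)
```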

27
Q

What is a ROC curve?

A

Plot of TP rate (recall) vs FP rate (false alarm rate) as the decision threshold varies
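
A minimal sketch of how the points on the curve can be generated, assuming the classifier outputs a numeric score and predicts positive when the score is at or above the threshold. `roc_points` is a hypothetical helper name, the scores and labels are made up, and the data must contain at least one positive and one negative.

```python
def roc_points(scores, labels):
    """Return (fp_rate, tp_rate) pairs, one per threshold, from strictest to loosest."""
    positives = sum(labels)
    negatives = len(labels) - positives
    # Start above every score (nothing predicted positive), then lower the threshold
    thresholds = [float("inf")] + sorted(set(scores), reverse=True)
    points = []
    for t in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


# Made-up scores; label 1 = positive, 0 = negative
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]
points = roc_points(scores, labels)
for fp_rate, tp_rate in points:
    print(f"FP rate {fp_rate:.2f}  TP rate {tp_rate:.2f}")
```

The curve always starts at (0, 0), where nothing is predicted positive, and ends at (1, 1), where everything is.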

28
Q

What do a perfect and a random classifier look like on an ROC curve?

A
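
The standard picture, stated here as general knowledge rather than as this card's original answer: a perfect classifier reaches the top-left corner, finding every positive with no false alarms, so its curve runs up the left edge and along the top. A random classifier lies, in expectation, on the diagonal, where every gain in TP rate costs an equal gain in FP rate.

```latex
\text{perfect: } (\mathrm{FPR},\, \mathrm{TPR}) = (0,\, 1)
\qquad
\text{random: } \mathrm{TPR} = \mathrm{FPR}
```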
29
Q

What are some problems with mean squared error?

A
  • Very sensitive to outliers (because of the squaring)
  • Sensitive to mean / scale (always predicting the mean value might give a lower MSE than a model that captures the pattern but whose mean is off)
30
Q

What is the mean absolute error (MAE)?

A
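
The usual definition, written in the same notation as the MAD card and stated as an assumption about what this card intended: the average absolute difference between the prediction f(x_i) and the target y_i over n examples.

```latex
\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \bigl| f(x_i) - y_i \bigr|
```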
31
Q

What is the median absolute deviation (MAD)?

A

med { |f(xi) - yi| }

32
Q

What are the pros/cons of median absolute deviation (MAD)?

A

Completely ignores outliers, but we can't take its derivative (so it is hard to optimize directly)
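
The outlier behaviour can be seen by comparing MAD with mean squared error and mean absolute error on the same predictions. This is an illustrative sketch with made-up numbers; `mse`, `mae`, and `mad` are hypothetical helper names.

```python
import statistics


def mse(preds, targets):
    """Mean squared error."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)


def mae(preds, targets):
    """Mean absolute error."""
    return sum(abs(p - t) for p, t in zip(preds, targets)) / len(targets)


def mad(preds, targets):
    """Median absolute deviation: med { |f(x_i) - y_i| }."""
    return statistics.median(abs(p - t) for p, t in zip(preds, targets))


targets = [1.0, 2.0, 3.0, 4.0, 5.0]
clean = [1.1, 2.1, 3.1, 4.1, 5.1]           # every prediction is off by about 0.1
with_outlier = [1.1, 2.1, 3.1, 4.1, 105.0]  # same, except one wildly wrong prediction

for name, preds in [("clean", clean), ("with outlier", with_outlier)]:
    print(name, mse(preds, targets), mae(preds, targets), mad(preds, targets))
```

A single bad prediction inflates MSE by several orders of magnitude and MAE noticeably, while MAD stays at about 0.1. The flip side is that MAD would not change even if that prediction were a real, important failure.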

33
Q

What is the definition of the correlation coefficient?

A
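
Assuming the card refers to the Pearson correlation coefficient between predictions f(x_i) and targets y_i (an assumption, since other correlation measures exist), with \bar{f} and \bar{y} denoting the means of the predictions and the targets:

```latex
r = \frac{\sum_{i=1}^{n} \bigl(f(x_i) - \bar{f}\bigr)\bigl(y_i - \bar{y}\bigr)}
         {\sqrt{\sum_{i=1}^{n} \bigl(f(x_i) - \bar{f}\bigr)^{2}}\;
          \sqrt{\sum_{i=1}^{n} \bigl(y_i - \bar{y}\bigr)^{2}}}
```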
34
Q

What does the correlation coefficient capture?

A

Relative ordering - useful for ranking tasks
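
A small sketch of why this matters, assuming the Pearson coefficient: predictions with the wrong offset and scale still correlate perfectly with the targets as long as they rise and fall together, even though their squared error is large. `pearson` is a hypothetical helper name and the numbers are made up.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


targets = [1.0, 2.0, 3.0, 4.0, 5.0]
# Predictions with the wrong offset and scale, but the same ordering as the targets
shifted = [10.0 + 2.0 * t for t in targets]

r = pearson(shifted, targets)
squared_error = sum((p - t) ** 2 for p, t in zip(shifted, targets)) / len(targets)
print(r, squared_error)
```

Strictly, Pearson measures linear association; a rank-based coefficient such as Spearman's would be the choice when only the ordering matters.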