Data Mining - Chapter 5 (Performance Measures) Flashcards

1
Q

Why do we need to evaluate our models?

A
  • Allows you to convince others that your work is meaningful
  • Without strong evaluation, your idea is likely to be rejected, or your code will not be deployed
  • Empirical evaluation helps guide meaningful research and development directions
2
Q

What is a benefit of having a large training data set?

A

In general, the larger the training data set, the better the classifier.

3
Q

What is a benefit of having a large test data set?

A

The larger the test data, the more accurate the error estimate.

4
Q

What do errors based on the training set tell us?

A

They give us information about the fit of the model

5
Q

What do errors based on the validation/testing set tell us?

A

They measure the model’s ability to predict new data

6
Q

What three types of outcome exist in prediction through supervised learning?

A
  1. Predicted numerical value
  2. Predicted class membership
  3. Propensity - probability of class membership
7
Q

What do we focus on when we are evaluating predictive performance (with numerical variables)?

A

We measure accuracy by using the prediction errors on the validation/test set.

All the measures are based on the prediction error. For a single record this is computed by subtracting the predicted outcome value from the actual outcome value:

e_i = Y_i - Ŷ_i

8
Q

Which five accuracy measures are there for models that predict numerical values?

A
  1. Mean absolute error (MAE)
  2. Mean error (ME)
  3. Mean percentage error (MPE)
  4. Mean absolute percentage error (MAPE)
  5. Root mean squared error (RMSE)

-> Check the slides for the formulas.
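
A minimal sketch of how these measures could be computed in Python (the arrays y_true and y_pred below are made-up validation-set values, not from the slides):

import numpy as np

# Made-up actual and predicted values for the validation set
y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.5, 5.0, 3.0, 8.0])

errors = y_true - y_pred                       # e_i = Y_i - Y_hat_i

mae  = np.mean(np.abs(errors))                 # Mean absolute error
me   = np.mean(errors)                         # Mean error
mpe  = np.mean(errors / y_true) * 100          # Mean percentage error (%)
mape = np.mean(np.abs(errors / y_true)) * 100  # Mean absolute percentage error (%)
rmse = np.sqrt(np.mean(errors ** 2))           # Root mean squared error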

9
Q

What is the benefit of the mean percentage error (MPE)?

A

It takes into account the direction of the error

10
Q

What do you need to take into account when using any of these measures that are based on the mean?

A

The measures are affected by outliers.

11
Q

What is a Lift chart?

A

A graphical way to assess predictive performance. You use it when your goal is to find the subset of records that gives the highest cumulative predicted values (ranking).

-> The predictive performance is compared against a baseline model without predictors (the average).

12
Q

What is meant by the ‘lift’?

A

The ratio of model gains to naive benchmark gains.
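
A rough sketch of how cumulative gains and lift could be computed in Python (the arrays y_true and y_score are made-up values, not from the slides):

import numpy as np

y_true  = np.array([10, 0, 5, 0, 20, 0, 15, 0])  # actual outcome values
y_score = np.array([8, 1, 6, 2, 18, 0, 12, 3])   # predicted values from the model

order = np.argsort(-y_score)                 # rank records from highest to lowest prediction
cum_gains = np.cumsum(y_true[order])         # cumulative actual value captured by the model

n = len(y_true)
baseline = np.cumsum(np.full(n, y_true.mean()))  # naive benchmark: the average value per record

lift = cum_gains / baseline                  # ratio of model gains to naive benchmark gains
print(lift[:3])                              # lift among the top-ranked records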

13
Q

What do we do when we are evaluating the performance of predicted class membership (classifiers)?

A

We look at how well our model is doing, or compare multiple models, based on their accuracy in classifying records into the correct classes.

We calculate the accuracy by subtracting the misclassification error from 1.

This is mainly done by using a confusion/classification matrix.

14
Q

What is the confusion/classification matrix?

A

It is a matrix in which the predicted classes are compared to the actual classes. The actual classes are portrayed on the y-axis and the predicted classes on the x-axis.

The matrix will contain numbers for:

  • True positive
  • True negative
  • False positive
  • False negative
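
A minimal sketch with scikit-learn (the class labels below are made up for illustration):

from sklearn.metrics import confusion_matrix

y_actual    = [1, 0, 1, 1, 0, 0, 1, 0]   # actual classes
y_predicted = [1, 0, 0, 1, 0, 1, 1, 0]   # predicted classes

cm = confusion_matrix(y_actual, y_predicted)
# With labels 0/1, scikit-learn returns rows = actual, columns = predicted:
# [[TN, FP],
#  [FN, TP]]
tn, fp, fn, tp = cm.ravel()
print(cm)
print(tn, fp, fn, tp)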
15
Q

What is a Type I error?

A

A false positive.

16
Q

What is a Type II error?

A

A false negative.

17
Q

How do you compute the accuracy for classifiers?

A

There are two ways:

(TP + TN) / N

or

1 - ((FP + FN) / N)
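
A small worked example with made-up counts, showing that both formulas give the same result:

TP, TN, FP, FN = 40, 50, 6, 4
N = TP + TN + FP + FN        # 100 records in total

acc1 = (TP + TN) / N         # 0.90
acc2 = 1 - (FP + FN) / N     # 0.90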

18
Q

On which dataset do we base our confusion/classification matrix?

A

On the testing (in the book: validation) data set.

-> Otherwise we do not get an honest estimate of the misclassification rate for new data.

19
Q

What is the ROC-curve?

A

It is a graph that plots the true positive rate on the y-axis and the false positive rate on the x-axis.

  • -> You can compare the curve it forms to a random classifier (a straight line from (0,0) to (1,1)) and see where the model performs better than random.
  • -> If you plot two models, you can compare them and see which one performs better in which scenario.
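
A minimal sketch with scikit-learn (the actual classes and predicted propensities below are made-up values):

import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, roc_auc_score

y_test  = [0, 0, 1, 1, 0, 1]               # actual classes
y_proba = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]  # predicted propensities for class 1

fpr, tpr, thresholds = roc_curve(y_test, y_proba)
auc = roc_auc_score(y_test, y_proba)       # area under the ROC curve

plt.plot(fpr, tpr, label=f"model (AUC = {auc:.2f})")
plt.plot([0, 1], [0, 1], linestyle="--", label="random")  # the random baseline
plt.xlabel("False positive rate")
plt.ylabel("True positive rate")
plt.legend()
plt.show()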
20
Q

What are the three extreme situations in the ROC-curve?

A

(TP rate, FP rate)

(0,0) - Every record is classified as negative, so there are only true negatives and false negatives.

(1,1) - Every record is classified as positive, so there are only true positives and false positives.

(1,0) - The ideal situation: all positives are found and there are no false positives.

21
Q

What is the limitation of accuracy?

A

Suppose a dataset has two classes, where class I has 9,990 records and class II has only 10 records.

If the model predicts everything as class I, it still has an accuracy of 99.9%, but it never classifies anything into class II.

22
Q

What is the cost matrix and how can you compute the cost of a model based on that?

A

A matrix that assigns a cost to each cell (TP, FP, TN, FN).

-> Total cost = (cost of a cell x number of records in that cell), summed over all cells.
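
A small sketch of the total-cost computation in Python (the counts and costs below are made up):

import numpy as np

# Rows = actual, columns = predicted, same layout as the confusion matrix
counts = np.array([[50,  6],    # [TN, FP]
                   [ 4, 40]])   # [FN, TP]
costs  = np.array([[ 0, 10],    # cost of a TN, cost of a FP
                   [25,  0]])   # cost of a FN, cost of a TP

total_cost = np.sum(counts * costs)   # (cost x count) summed over all cells
print(total_cost)                     # 6*10 + 4*25 = 160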

23
Q

Outside of accuracy, which three measures can you compute for a model that focuses on classification?

A
  1. Precision
    The ability of the model to correctly detect class items
  2. Recall
    The ability of the model to find all of the items of the class
  3. F-measure (F1-measure)
    Taking precision and recall into account
24
Q

What is the formula of precision?

A

(True positives) / (True positives + false positives)

25
Q

What is the formula of recall?

A

True positives / (true positives + false negatives)

26
Q

What is the formula of F-measure?

A

2 / ((1/R)+(1/P))
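
A minimal sketch of these formulas in Python, together with the corresponding scikit-learn functions (the class labels and counts below are made up):

from sklearn.metrics import precision_score, recall_score, f1_score

y_actual    = [1, 0, 1, 1, 0, 0, 1, 0]
y_predicted = [1, 0, 0, 1, 0, 1, 1, 0]

# Manual computation from the counts (TP = 3, FP = 1, FN = 1 for these lists)
TP, FP, FN = 3, 1, 1
precision = TP / (TP + FP)                        # 0.75
recall    = TP / (TP + FN)                        # 0.75
f1        = 2 / ((1 / recall) + (1 / precision))  # 0.75

# The same results via scikit-learn
print(precision_score(y_actual, y_predicted))
print(recall_score(y_actual, y_predicted))
print(f1_score(y_actual, y_predicted))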

27
Q

Why is determining recall sometimes difficult?

A

The total number of items/records that belong to a particular class is sometimes not available.

28
Q

How can we use those measures in Python?

A

First obtain the predictions: Y_pred = model.predict(X_test)

Then call the metric functions (confusion matrix, accuracy, precision, recall, F-measure) with Y_test as the first argument and Y_pred as the second argument.
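
A rough end-to-end sketch with scikit-learn; the synthetic data set and the LogisticRegression model are only placeholders so the example runs on its own, they are not part of the slides:

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import (confusion_matrix, accuracy_score,
                             precision_score, recall_score, f1_score)

# Placeholder data and model so the sketch is self-contained
X, Y = make_classification(n_samples=200, random_state=0)
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, random_state=0)
model = LogisticRegression().fit(X_train, Y_train)

Y_pred = model.predict(X_test)                # as on the card

print(confusion_matrix(Y_test, Y_pred))       # Y_test first, Y_pred second
print(accuracy_score(Y_test, Y_pred))
print(precision_score(Y_test, Y_pred))
print(recall_score(Y_test, Y_pred))
print(f1_score(Y_test, Y_pred))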