Data Mining - Chapter 5 (Performance Measures) Flashcards

1
Q

Why do we need to evaluate our models?

A
  • Allows you to convince others that your work is meaningful
  • Without strong evaluation, your idea is likely to be rejected, or your code will not be deployed
  • Empirical evaluation helps guide meaningful research and development directions
2
Q

What is a benefit of having a large training data set?

A

In general, the larger the training data set, the better the classifier.

3
Q

What is a benefit of having a large test data set?

A

The larger the test data, the more accurate the error estimate.

4
Q

What do errors based on the training set tell us?

A

They give us information about the fit of the model

5
Q

What do errors based on the validation/testing set tell us?

A

They measure the model’s ability to predict new data

6
Q

What three types of outcome exist in prediction through supervised learning?

A
  1. Predicted numerical value
  2. Predicted class membership
  3. Propensity - probability of class membership
7
Q

What do we focus on when we are evaluating predictive performance (with numerical variables)?

A

We measure accuracy by using the prediction errors on the validation/test set.

All the measures are based on the prediction error. For a single record this is computed by subtracting the predicted outcome value from the actual outcome value:

e_i = Y_i - Ŷ_i

8
Q

Which five accuracy measures are there for models that predict numerical values?

A
  1. Mean absolute error (MAE)
  2. Mean error (ME)
  3. Mean percentage error (MPE)
  4. Mean absolute percentage error (MAPE)
  5. Root mean squared error (RMSE)

-> Check the slides for the formulas.
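
A minimal sketch of how these measures could be computed in Python (the arrays y_true and y_pred below are made-up validation-set values, not from the slides):

import numpy as np

# Made-up actual and predicted values for the validation set
y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.5, 5.0, 3.0, 8.0])

errors = y_true - y_pred                       # e_i = Y_i - Y_hat_i

mae  = np.mean(np.abs(errors))                 # Mean absolute error
me   = np.mean(errors)                         # Mean error
mpe  = np.mean(errors / y_true) * 100          # Mean percentage error (%)
mape = np.mean(np.abs(errors / y_true)) * 100  # Mean absolute percentage error (%)
rmse = np.sqrt(np.mean(errors ** 2))           # Root mean squared error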

9
Q

What is the benefit of the mean percentage error (MPE)?

A

It takes into account the direction of the error

10
Q

What do you need to take into account when using any of these measures that are based on the mean?

A

The measures are affected by outliers.

11
Q

What is a Lift chart?

A

A graphical way to assess predictive performance. You use it when your goal is to find the subset of records that gives the highest cumulative predicted values (ranking).

-> The predictive performance is compared against a baseline model without predictors (the average).

12
Q

What is meant by the ‘lift’?

A

The ratio of model gains to naive benchmark gains.
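
A rough sketch of how cumulative gains and lift could be computed in Python (the arrays y_true and y_score are made-up values, not from the slides):

import numpy as np

y_true  = np.array([10, 0, 5, 0, 20, 0, 15, 0])  # actual outcome values
y_score = np.array([8, 1, 6, 2, 18, 0, 12, 3])   # predicted values from the model

order = np.argsort(-y_score)                 # rank records from highest to lowest prediction
cum_gains = np.cumsum(y_true[order])         # cumulative actual value captured by the model

n = len(y_true)
baseline = np.cumsum(np.full(n, y_true.mean()))  # naive benchmark: the average value per record

lift = cum_gains / baseline                  # ratio of model gains to naive benchmark gains
print(lift[:3])                              # lift among the top-ranked records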

13
Q

What do we do when we are evaluating the performance of predicted class membership (classifiers)?

A

We look at how well our model is doing, or compare multiple models, based on their accuracy in classifying records into the correct classes.

We calculate the accuracy by subtracting the misclassification error from 1.

This is mainly done by using a confusion/classification matrix.

14
Q

What is the confusion/classification matrix?

A

It is a matrix in which the predicted classes are compared to the actual classes. The actual classes are portrayed on the y-axis and the predicted classes on the x-axis.

The matrix will contain numbers for:

  • True positive
  • True negative
  • False positive
  • False negative
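
A minimal sketch with scikit-learn (the class labels below are made up for illustration):

from sklearn.metrics import confusion_matrix

y_actual    = [1, 0, 1, 1, 0, 0, 1, 0]   # actual classes
y_predicted = [1, 0, 0, 1, 0, 1, 1, 0]   # predicted classes

cm = confusion_matrix(y_actual, y_predicted)
# With labels 0/1, scikit-learn returns rows = actual, columns = predicted:
# [[TN, FP],
#  [FN, TP]]
tn, fp, fn, tp = cm.ravel()
print(cm)
print(tn, fp, fn, tp)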
15
Q

What is a Type I error?

A

A false positive.

16
Q

What is a Type II error?

A

A false negative.

17
Q

How do you compute the accuracy for classifiers?

A

There are two ways:

(TP + TN) / N

or

1 - ((FP + FN) / N)
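
A small worked example with made-up counts, showing that both formulas give the same result:

TP, TN, FP, FN = 40, 50, 6, 4
N = TP + TN + FP + FN        # 100 records in total

acc1 = (TP + TN) / N         # 0.90
acc2 = 1 - (FP + FN) / N     # 0.90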

18
Q

On which dataset do we base our confusion/classification matrix?

A

On the testing (in the book: validation) data set.

-> Otherwise we do not get an honest estimate of the misclassification rate for new data.

19
Q

What is the ROC-curve?

A

It is a graph that plots the true positive rate on the y-axis and the false positive rate on the x-axis.

  • -> You can compare the curve it forms to a random classifier (a straight line from (0,0) to (1,1)) and see where the model performs better than random.
  • -> If you plot two models, you can compare them and see which one performs better in which scenario.
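
A minimal sketch with scikit-learn (the actual classes and predicted propensities below are made-up values):

import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, roc_auc_score

y_test  = [0, 0, 1, 1, 0, 1]               # actual classes
y_proba = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]  # predicted propensities for class 1

fpr, tpr, thresholds = roc_curve(y_test, y_proba)
auc = roc_auc_score(y_test, y_proba)       # area under the ROC curve

plt.plot(fpr, tpr, label=f"model (AUC = {auc:.2f})")
plt.plot([0, 1], [0, 1], linestyle="--", label="random")  # the random baseline
plt.xlabel("False positive rate")
plt.ylabel("True positive rate")
plt.legend()
plt.show()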
20
Q

What are the three extreme situations in the ROC-curve?

A

(TP rate, FP rate)

(0,0) - Every record is classified as negative, so there are only true negatives and false negatives.

(1,1) - Every record is classified as positive, so there are only true positives and false positives.

(1,0) - The ideal situation: all positives are found and there are no false positives.

21
Q

What is the limitation of accuracy?

A

Suppose a dataset has two classes, where class I has 9,990 records and class II has only 10 records.

If the model predicts everything as class I, it still has an accuracy of 99.9%, but it never classifies anything into class II.

22
Q

What is the cost matrix and how can you compute the cost of a model based on that?

A

A matrix that assigns a cost to each cell (TP, FP, TN, FN).

-> Total cost = (cost of a cell x number of records in that cell), summed over all cells.
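
A small sketch of the total-cost computation in Python (the counts and costs below are made up):

import numpy as np

# Rows = actual, columns = predicted, same layout as the confusion matrix
counts = np.array([[50,  6],    # [TN, FP]
                   [ 4, 40]])   # [FN, TP]
costs  = np.array([[ 0, 10],    # cost of a TN, cost of a FP
                   [25,  0]])   # cost of a FN, cost of a TP

total_cost = np.sum(counts * costs)   # (cost x count) summed over all cells
print(total_cost)                     # 6*10 + 4*25 = 160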

23
Q

Outside of accuracy, which three measures can you compute for a model that focuses on classification?

A
  1. Precision
    The ability of the model to correctly detect class items
  2. Recall
    The ability of the model to find all of the items of the class
  3. F-measure (F1-measure)
    Taking precision and recall into account
24
Q

What is the formula of precision?

A

(True positives) / (True positives + false positives)

25
Q

What is the formula of recall?

A

True positives / (true positives + false negatives)

26
Q

What is the formula of F-measure?

A

2 / ((1/R)+(1/P))
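
A minimal sketch of these formulas in Python, together with the corresponding scikit-learn functions (the class labels and counts below are made up):

from sklearn.metrics import precision_score, recall_score, f1_score

y_actual    = [1, 0, 1, 1, 0, 0, 1, 0]
y_predicted = [1, 0, 0, 1, 0, 1, 1, 0]

# Manual computation from the counts (TP = 3, FP = 1, FN = 1 for these lists)
TP, FP, FN = 3, 1, 1
precision = TP / (TP + FP)                        # 0.75
recall    = TP / (TP + FN)                        # 0.75
f1        = 2 / ((1 / recall) + (1 / precision))  # 0.75

# The same results via scikit-learn
print(precision_score(y_actual, y_predicted))
print(recall_score(y_actual, y_predicted))
print(f1_score(y_actual, y_predicted))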

27
Q

Why is determining recall sometimes difficult?

A

The total number of items/records that belong to a particular class is sometimes not available.

28
Q

How can we use those measures in Python?

A

First obtain the predictions: Y_pred = model.predict(X_test)

Then call the metric functions (confusion matrix, accuracy, precision, recall, F-measure) with Y_test as the first argument and Y_pred as the second argument.
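
A rough end-to-end sketch with scikit-learn; the synthetic data set and the LogisticRegression model are only placeholders so the example runs on its own, they are not part of the slides:

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import (confusion_matrix, accuracy_score,
                             precision_score, recall_score, f1_score)

# Placeholder data and model so the sketch is self-contained
X, Y = make_classification(n_samples=200, random_state=0)
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, random_state=0)
model = LogisticRegression().fit(X_train, Y_train)

Y_pred = model.predict(X_test)                # as on the card

print(confusion_matrix(Y_test, Y_pred))       # Y_test first, Y_pred second
print(accuracy_score(Y_test, Y_pred))
print(precision_score(Y_test, Y_pred))
print(recall_score(Y_test, Y_pred))
print(f1_score(Y_test, Y_pred))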