# Chapter 3 - Classification Flashcards

What performance measures can we use for a classifier?

1) Accuracy measured with cross-validation

2) Confusion matrix

3) Precision and recall

4) The ROC curve
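
As a minimal sketch of the precision and recall item above: precision is TP / (TP + FP) and recall is TP / (TP + FN), where TP, FP, and FN are the true positives, false positives, and false negatives. The helper name `precision_recall` and the labels are made up for illustration.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    # Fall back to 0.0 when a denominator is zero (a convention chosen for this sketch).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Made-up labels for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]
precision, recall = precision_recall(y_true, y_pred)
print(precision, recall)  # 0.75 0.75
```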

What do we mean by an imbalanced classification problem?

We have two classes to identify (for example, terrorists and non-terrorists), with one class representing the overwhelming majority of the data points.
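
A small sketch of why imbalance matters for evaluation: on a hypothetical data set with 1% positives, a "classifier" that always predicts the majority class reaches 99% accuracy while detecting no positives at all. The data below is invented purely for illustration.

```python
# Hypothetical data set: 1,000 samples, only 10 of them positive.
y_true = [1] * 10 + [0] * 990
# A "classifier" that always predicts the majority (negative) class.
y_pred = [0] * len(y_true)

# Fraction of samples where the prediction matches the true label.
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
# Fraction of actual positives that were detected.
true_positives = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
recall = true_positives / sum(y_true)

print(accuracy)  # 0.99
print(recall)    # 0.0
```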

What is the difference between a binary classifier and a multiclass classifier?

A binary model classifies objects into 2 categories. A multiclass model classifies objects into more than 2 categories. Note that in both kinds of model, each object belongs to one and only one class.

When talking about a multiclass classifier, what is the difference between a one-versus-the-rest (OvR) strategy and a one-versus-one (OvO) strategy?

For example, say we have 10 classes to predict. With OvR, we train 10 binary classifiers, one per class, each predicting whether an object belongs to that class or not, and we pick the class whose classifier outputs the highest score. With OvO, we instead train one binary classifier for every pair of classes (1 versus 2, 1 versus 3, …, 9 versus 10), which gives 10 × 9 / 2 = 45 classifiers, and we pick the class that wins the most pairwise duels.
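
For the 10-class example above, the following sketch enumerates the binary tasks each strategy needs and shows one common way to combine OvO results, namely majority voting over the pairwise duels. No real classifiers are trained here; the function `ovo_predict` and the duel outcomes are hypothetical.

```python
from collections import Counter
from itertools import combinations

classes = list(range(10))  # e.g. the ten digits 0-9

# OvR: one binary task per class ("this class" vs. "everything else").
ovr_tasks = [(c, "rest") for c in classes]

# OvO: one binary task per unordered pair of classes.
ovo_tasks = list(combinations(classes, 2))

print(len(ovr_tasks))  # 10
print(len(ovo_tasks))  # 45, i.e. 10 * 9 / 2

def ovo_predict(pairwise_winners):
    """Pick the class that wins the most pairwise duels.

    pairwise_winners maps each (a, b) pair to the class its binary
    classifier chose for one sample (hypothetical input).
    """
    votes = Counter(pairwise_winners.values())
    return votes.most_common(1)[0][0]

# Hypothetical duel outcomes: class 3 wins every duel it takes part in,
# and the lower-numbered class wins every other duel.
winners = {(a, b): 3 if 3 in (a, b) else a for a, b in ovo_tasks}
print(ovo_predict(winners))  # 3
```

OvO needs more classifiers, but each one is trained only on the samples of the two classes it separates.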

What do the rows and columns represent in a confusion matrix?

The rows represent the actual classes, while the columns represent the predicted classes.
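
A minimal sketch of building such a matrix by hand, following the row and column convention stated above. The helper `build_confusion_matrix` and the labels are made up for illustration.

```python
def build_confusion_matrix(y_true, y_pred, labels):
    """Entry [i][j] counts samples of actual class labels[i]
    that were predicted as class labels[j]."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for actual, predicted in zip(y_true, y_pred):
        matrix[index[actual]][index[predicted]] += 1
    return matrix

# Made-up labels for illustration only.
y_true = ["cat", "cat", "dog", "dog", "dog", "bird"]
y_pred = ["cat", "dog", "dog", "dog", "cat", "bird"]
matrix = build_confusion_matrix(y_true, y_pred, labels=["bird", "cat", "dog"])
for row in matrix:
    print(row)
# [1, 0, 0]   actual bird
# [0, 1, 1]   actual cat
# [0, 1, 2]   actual dog
```

Correct predictions sit on the main diagonal; every off-diagonal entry is a confusion between two classes.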

What do we mean by multilabel classification?

In multilabel classification, an object can belong to multiple classes at the same time. For example, say a classifier is trained to recognize three faces: Alice, Bob, and Charlie. When the classifier is shown a picture of Alice and Charlie, it should output [1, 0, 1], meaning Alice yes, Bob no, and Charlie yes.
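
The output format in the face example can be sketched as a binary indicator vector with one position per person. The helper names `encode_labels` and `decode_labels` are hypothetical and only illustrate the encoding, not a classifier.

```python
PEOPLE = ["Alice", "Bob", "Charlie"]  # fixed order of the output positions

def encode_labels(present, people=PEOPLE):
    """Turn a set of names into a 0/1 vector, one entry per known person."""
    return [1 if name in present else 0 for name in people]

def decode_labels(vector, people=PEOPLE):
    """Turn a 0/1 vector back into the list of names it marks as present."""
    return [name for name, flag in zip(people, vector) if flag]

print(encode_labels({"Alice", "Charlie"}))  # [1, 0, 1]
print(decode_labels([1, 0, 1]))             # ['Alice', 'Charlie']
```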