Evaluating Models Flashcards

1
Q

You are working as a data scientist for a pharmaceutical company. You are collaborating with teammates to create a machine learning model that classifies certain types of diseases from image exams. The company wants to prioritize the correctness of positive predictions, even if that means wrongly returning some false negatives. Which metric would you use to optimize the underlying model?

A

Precision - the company prefers a higher probability of being right on positive outcomes, at the cost of wrongly classifying some positive cases as negative. Technically, they prefer to increase precision at the cost of reducing recall.

2
Q

You are working as a data scientist for a pharmaceutical company. You are collaborating with teammates to create a machine learning model that classifies certain types of diseases from image exams. The company wants to prioritize capturing positive cases, even if that means wrongly returning some false positives. Which metric would you use to optimize the underlying model?

A

Recall - the company prefers to find most of the positive cases at the cost of wrongly classifying some negative cases as positive. Technically, they prefer to increase recall at the cost of reducing precision.
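
A minimal sketch, assuming scikit-learn, that computes both metrics on made-up labels to illustrate the trade-off in the two cards above:

```python
# Made-up labels: 4 actual positives, 3 positive predictions
# (2 correct, 1 false positive, 2 positives missed).
from sklearn.metrics import precision_score, recall_score

y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

print("precision:", precision_score(y_true, y_pred))  # 2/3 ~ 0.67
print("recall:   ", recall_score(y_true, y_pred))     # 2/4 = 0.50
```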

3
Q

You are working on a fraud identification system, where one of the components is a classification model. You want to check the model's performance. Which of the following metrics could be used, and why?

A

The F1 score. Since fraud datasets are naturally imbalanced, accuracy can be misleading; the F1 score balances precision and recall in a single number, so it reflects how well the model handles the rare positive class.
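
A minimal sketch, assuming scikit-learn, with made-up imbalanced labels showing why accuracy looks fine here while F1 exposes the problem:

```python
from sklearn.metrics import accuracy_score, f1_score

# 1 = fraud (rare), 0 = legitimate; the model catches only 1 of 4 frauds.
y_true = [1, 1, 1, 1] + [0] * 96
y_pred = [1, 0, 0, 0] + [0] * 96

print("accuracy:", accuracy_score(y_true, y_pred))  # 0.97 (looks great)
print("F1:      ", f1_score(y_true, y_pred))        # 0.40 (reveals the problem)
```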

4
Q

You are building a machine learning model to predict house prices and have approached the problem as a regression problem. Which metrics are not applicable to regression models?

A

Recall and Precision - both are classification metrics that require discrete class labels, so they cannot be computed for the continuous targets of a regression model.

5
Q

Which metric helps us penalize bigger errors in regression models?

A

RMSE computes the squared error of each prediction and then takes the square root of the mean squared error (MSE). Because errors are squared before averaging, RMSE penalizes bigger errors more heavily than smaller ones.
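
A minimal sketch, assuming scikit-learn and NumPy: two prediction sets with the same mean absolute error, where RMSE is higher for the one containing a single large error:

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

y_true = np.array([100.0, 100.0, 100.0, 100.0])
y_small = np.array([105.0, 95.0, 105.0, 95.0])   # four errors of 5
y_big = np.array([120.0, 100.0, 100.0, 100.0])   # one error of 20

for name, y_pred in [("small errors", y_small), ("one big error", y_big)]:
    mae = mean_absolute_error(y_true, y_pred)
    rmse = np.sqrt(mean_squared_error(y_true, y_pred))
    print(f"{name}: MAE={mae:.2f}, RMSE={rmse:.2f}")
# small errors:  MAE=5.00, RMSE=5.00
# one big error: MAE=5.00, RMSE=10.00
```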

6
Q

You are working as a data scientist for a financial services company and you have created a regression model to predict credit utilization. If you decide to include more features in the model, what will happen to R-squared and Adjusted R-squared?

A

R-squared will increase (it never decreases when a feature is added), since the extra information helps the model capture more of the variance in the data. Adjusted R-squared, however, can either increase or decrease, depending on whether the gain from the extra variable outweighs the penalty for the additional parameter.
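
A minimal sketch, assuming scikit-learn and NumPy; adjusted R-squared is not built into scikit-learn, so it is computed here from the standard formula 1 - (1 - R2)(n - 1)/(n - p - 1):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

rng = np.random.default_rng(42)
X = rng.normal(size=(200, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.5, size=200)

# Append a pure-noise feature: R-squared cannot go down, but
# adjusted R-squared can, because it penalizes the extra parameter.
X_noise = np.hstack([X, rng.normal(size=(200, 1))])

for name, features in [("3 features", X), ("3 features + noise", X_noise)]:
    model = LinearRegression().fit(features, y)
    r2 = r2_score(y, model.predict(features))
    n, p = features.shape
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)
    print(f"{name}: R2={r2:.4f}, adjusted R2={adj_r2:.4f}")
```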

7
Q

Which metric computes errors as percentages instead of absolute values?

A

MAPE (mean absolute percentage error) is applicable to regression models; it expresses each error as a percentage of the actual value and averages them.
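
A minimal sketch, assuming scikit-learn >= 0.24 (which added mean_absolute_percentage_error); note that it returns a fraction, so 0.10 means 10%:

```python
import numpy as np
from sklearn.metrics import mean_absolute_percentage_error

y_true = np.array([100.0, 200.0, 400.0])
y_pred = np.array([110.0, 180.0, 440.0])  # each prediction is 10% off

print(f"MAPE: {mean_absolute_percentage_error(y_true, y_pred):.2%}")  # 10.00%
```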

8
Q

You are the lead data scientist of the company. Your team wants to optimize a model that is no longer performing well in production. The team has decided to use grid search to retune the hyperparameters; however, the process is taking a long time and does not complete. Which approach could you take to speed up the tuning process and still maximize your chances of finding a better model?

A

Use Bayesian optimization instead of grid search. Bayesian optimization focuses on the most promising regions of the search space, potentially reducing processing time and increasing your chances of finding the best model.
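
A minimal sketch, assuming the Optuna library, whose default TPE sampler is a form of Bayesian optimization: each trial is guided by the results of earlier trials instead of exhaustively walking a grid. The model and search ranges are made up for illustration:

```python
import optuna
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, random_state=42)

def objective(trial):
    # Sample hyperparameters; the sampler concentrates on promising regions.
    model = RandomForestClassifier(
        n_estimators=trial.suggest_int("n_estimators", 50, 300),
        max_depth=trial.suggest_int("max_depth", 2, 16),
        random_state=42,
    )
    return cross_val_score(model, X, y, cv=3).mean()

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=30)  # far fewer fits than a full grid
print(study.best_params, study.best_value)
```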

9
Q

You are using grid search to tune a machine learning model. During the tuning process, you obtain good performance metrics. However, when you execute the model in production, the model performance is not acceptable. You have to troubleshoot the problem. What are valid reasons for this issue?

A

You are tuning and evaluating the model on the training data, which causes overfitting.

The production data does not have the same distribution as the training data (a distribution shift).
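
A minimal sketch, assuming scikit-learn, of the fix for the first cause: tune with cross-validation on a training split and report the score from a held-out test split the search never saw:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

search = GridSearchCV(SVC(), {"C": [0.1, 1, 10]}, cv=5)
search.fit(X_train, y_train)

print("cross-validation score:", search.best_score_)
print("held-out test score:   ", search.score(X_test, y_test))
```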

10
Q

You are working for a global financial company. Your team has created a binary classification model to identify fraudulent transactions. The model has been put into production and automatically flags fraudulent transactions, sending them for further screening. The operations team is complaining that the model blocks too many transactions, and they would prefer to flag a smaller number. Given this scenario, what is the expectation of the operations team?

A

Increase precision at the expense of recall.

We always have to match model usage with business goals and capacity. In this scenario, the model is flagging a lot of potentially fraudulent transactions, but there isn't a big enough human workforce to evaluate all of those blocked transactions. What makes more sense is "calibrating" the model to the real business scenario, so that it flags fewer (but more likely) fraudulent cases for further screening.
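
A minimal sketch, assuming scikit-learn, of this kind of calibration: raising the decision threshold above the default 0.5 flags fewer transactions, trading recall for precision (the data and model are made up):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split

# Imbalanced toy data: roughly 5% positive ("fraud") cases.
X, y = make_classification(n_samples=2000, weights=[0.95], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
proba = model.predict_proba(X_test)[:, 1]

for threshold in (0.5, 0.8):  # higher threshold -> fewer flagged cases
    flagged = (proba >= threshold).astype(int)
    print(
        f"threshold={threshold}: flagged={flagged.sum()}, "
        f"precision={precision_score(y_test, flagged):.2f}, "
        f"recall={recall_score(y_test, flagged):.2f}"
    )
```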

11
Q

Accuracy

A

(true positives + true negatives) / total cases

12
Q

Recall (sensitivity)

A

true positives / (true positives + false negatives) (what percentage of actual positives did I correctly identify?)

13
Q

Precision

A

true positives / (true positives + false positives) (what percentage of my positive predictions were correct?)
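
A minimal sketch, assuming scikit-learn, that derives all three formulas from cards 11-13 out of a confusion matrix on made-up labels:

```python
from sklearn.metrics import confusion_matrix

y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

# ravel() flattens the 2x2 matrix into (tn, fp, fn, tp) for binary labels.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print("accuracy: ", (tp + tn) / (tp + tn + fp + fn))  # 0.70
print("recall:   ", tp / (tp + fn))                   # 0.50
print("precision:", tp / (tp + fp))                   # ~0.67
```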
