Path3.Mod1.h - Automated Machine Learning - Chart Analysis Flashcards

1
Q

Good vs Bad Confusion Matrix

A

A good model produces a Confusion Matrix with most samples along the diagonal from top left to bottom right. The more samples that fall off the diagonal and the more they are distributed throughout the matrix, the worse the model is.
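As a minimal sketch (with made-up labels for a hypothetical 3-class problem), scikit-learn's `confusion_matrix` makes the diagonal concentration easy to check numerically:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical true labels and predictions for a 3-class problem.
y_true = np.array([0, 0, 1, 1, 2, 2, 2, 1])
y_pred = np.array([0, 0, 1, 2, 2, 2, 0, 1])

cm = confusion_matrix(y_true, y_pred)
# A good model concentrates counts on the diagonal; off-diagonal
# entries are misclassifications.
diagonal_fraction = np.trace(cm) / cm.sum()
print(cm)
print(f"fraction on diagonal: {diagonal_fraction:.2f}")
```

Here 6 of 8 samples land on the diagonal, so the fraction is 0.75; a perfect model would score 1.0.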

2
Q

Good vs Bad ROC Curve

A

A good model's curve approaches the top left corner as closely as possible, reaching 100% TPR (True Positive Rate) at 0% FPR (False Positive Rate). Compare this to a random model, whose curve follows the diagonal y = x.
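A short sketch with made-up scores shows the extreme case: when positives all score higher than negatives, the ROC curve hits the top left corner and the area under it (AUC) is 1, versus 0.5 for a random model on the diagonal:

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

# Hypothetical scores: a good model gives positives higher scores.
y_true  = np.array([0, 0, 0, 1, 1, 1])
y_score = np.array([0.1, 0.2, 0.4, 0.6, 0.8, 0.9])

fpr, tpr, thresholds = roc_curve(y_true, y_score)
# Perfect separation: the curve reaches TPR = 1 at FPR = 0, so AUC = 1;
# a random model follows y = x with AUC = 0.5.
auc = roc_auc_score(y_true, y_score)
print(f"AUC = {auc:.2f}")
```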

3
Q

Good vs Bad Precision-Recall Curve (what the Model should maintain)

A

A good model maintains precision near 1 (100%) as recall grows toward 100%, with the curve approaching the top right corner as closely as possible.
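With made-up scores that separate the classes perfectly, scikit-learn's `precision_recall_curve` produces a point at the ideal top right corner (precision = 1 at recall = 1):

```python
import numpy as np
from sklearn.metrics import precision_recall_curve

# Hypothetical scores: positives (0.7-0.9) outrank all negatives.
y_true  = np.array([0, 0, 1, 1, 1, 0])
y_score = np.array([0.2, 0.3, 0.7, 0.8, 0.9, 0.1])

precision, recall, _ = precision_recall_curve(y_true, y_score)
# A good model keeps precision near 1 while recall grows toward 1,
# hugging the top right corner of the plot.
perfect_point = any(p == 1.0 and r == 1.0 for p, r in zip(precision, recall))
print(perfect_point)
```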

4
Q

Good vs Bad Cumulative Gains Curve (What the chart indicates w.r.t. Responses)

A

Indicates the fraction of positive responses captured (y-axis) versus the fraction of samples considered (x-axis). A good model ranks positive samples above negative ones, producing two straight segments: one from the origin to (x, 1) with slope 1/x, where x is the fraction of samples that are positive, and a second, flat segment from (x, 1) to (1, 1). If every sample is ranked correctly, cumulative gain reaches 100% within the first x% of samples considered.

Put another way, this tells you how quickly your model finds the positive samples.
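There is no single canonical scikit-learn function for this chart, but the y-axis values can be sketched in a few lines of NumPy (with made-up labels and a hypothetical perfect ranking):

```python
import numpy as np

def cumulative_gains(y_true, y_score):
    """Share of all positives captured within the top-k ranked samples."""
    order = np.argsort(-y_score)      # rank samples by score, descending
    hits = np.cumsum(y_true[order])   # positives found so far
    return hits / y_true.sum()        # y-axis: fraction of positives captured

# Hypothetical scores: the model ranks both positives first.
y_true  = np.array([1, 1, 0, 0, 0])
y_score = np.array([0.9, 0.8, 0.3, 0.2, 0.1])

gains = cumulative_gains(y_true, y_score)
# With perfect ranking, gains reach 100% after the first 2 of 5 samples
# (40% of samples are positive), then stay flat at 1.0.
print(gains)
```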

5
Q

Good vs Bad Lift Curve

A

The Lift Curve shows how many times better a model performs than a random model, whose lift is constant at 1. A good model's curve sits higher on the chart, farther above the x-axis.
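One way to compute lift (a sketch with made-up labels, reusing the ranking idea from the cumulative gains card): divide the model's precision among the top-k ranked samples by the overall positive rate, which is what a random model would achieve at any depth:

```python
import numpy as np

def lift_curve(y_true, y_score):
    """Lift at each depth k: positive rate in the top k ranked samples
    divided by the overall positive rate (a random model's lift is 1)."""
    order = np.argsort(-y_score)
    hits = np.cumsum(y_true[order])
    k = np.arange(1, len(y_true) + 1)
    precision_at_k = hits / k          # positives among the top k
    base_rate = y_true.mean()          # random model's precision
    return precision_at_k / base_rate

# Hypothetical scores with a perfect ranking.
y_true  = np.array([1, 1, 0, 0, 0])
y_score = np.array([0.9, 0.8, 0.3, 0.2, 0.1])

lift = lift_curve(y_true, y_score)
print(lift)  # starts at 1/0.4 = 2.5, decays to 1.0 at full depth
```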

6
Q

Good vs Bad Calibration Curve

A

Depicts the model's ability to assign confidence to predictions, not to make predictions. A perfectly calibrated model follows y = x, predicting exactly the probability that samples belong to each class. Overconfident models form a backward "S" across that line, while underconfident ones form an "S" across it.
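scikit-learn's `calibration_curve` computes the points of this chart by binning predicted probabilities and comparing each bin's mean prediction to the observed fraction of positives (labels and probabilities below are made up):

```python
import numpy as np
from sklearn.calibration import calibration_curve

# Hypothetical predicted probabilities vs. actual outcomes.
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1])
y_prob = np.array([0.1, 0.2, 0.15, 0.25, 0.8, 0.9, 0.85, 0.7, 0.75, 0.95])

# frac_pos: observed fraction of positives per bin (y-axis);
# mean_pred: mean predicted probability per bin (x-axis).
frac_pos, mean_pred = calibration_curve(y_true, y_prob, n_bins=2)
# A calibrated model would have mean_pred ~= frac_pos in every bin,
# i.e. all points on the y = x line.
print(mean_pred, frac_pos)
```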

7
Q

Explain Dataset Cohorts

A

You can slice your data into dataset cohorts to analyze your model's performance and explanations across subgroups. Doing so helps you understand why errors happen in one group versus another.
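The idea can be sketched with pandas (the feature, labels, and predictions below are made up): group predictions by a cohort-defining feature and compare per-group accuracy:

```python
import pandas as pd

# Hypothetical predictions with a feature to slice cohorts on.
df = pd.DataFrame({
    "age_group": ["<40", "<40", "<40", "40+", "40+", "40+"],
    "y_true":    [1, 0, 1, 1, 0, 0],
    "y_pred":    [1, 0, 1, 0, 1, 1],
})

# Compare performance across subgroups to see where errors concentrate.
df["correct"] = df["y_true"] == df["y_pred"]
accuracy_by_cohort = df.groupby("age_group")["correct"].mean()
print(accuracy_by_cohort)
```

In this toy data the model is perfect on the `<40` cohort and wrong on every `40+` sample, exactly the kind of disparity cohort analysis is meant to surface.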

8
Q

ES CC DGT

Explain Dataset Explorer

A

Provides functionality to extract insights into your data versus your training results:
- Explore statistics by selecting different filters along the X, Y, and color axes to display data along different dimensions
- Create cohorts to analyze statistics with filters such as predicted outcome, dataset features, and error groups
- View different graph types to better visualize results (use the gear icon in the upper right-hand corner)

9
Q

Explain Aggregate Feature Importance

A

Explore the top-k important features that impact model predictions (also known as a Global Explanation).
- Use the slider to show feature importance values in descending order
- Select up to three cohorts to see their feature importance values side by side
- Select any feature bar in the graph to see how values of the selected feature impact model prediction
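One common way to compute such a global explanation (a sketch, not the tool's exact method) is permutation importance: shuffle each feature and measure how much the model's score drops. Here synthetic data is built so only the first feature matters:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
# Synthetic data: only feature 0 determines the label.
X = rng.normal(size=(200, 3))
y = (X[:, 0] > 0).astype(int)

model = LogisticRegression().fit(X, y)
# Shuffling an important feature hurts the score; shuffling an
# irrelevant one barely changes it.
result = permutation_importance(model, X, y, n_repeats=5, random_state=0)
ranking = np.argsort(-result.importances_mean)  # descending importance
print(ranking)
```

As expected, feature 0 tops the ranking, mirroring the tallest bar in an aggregate feature importance chart.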


10
Q

Explain Individual Feature Importance, What-If and Individual Conditional Expectation (ICE)

A

Drill into individual data points and their individual feature importances. You can load the individual feature importance plot for any data point by clicking on it in the main scatter plot or selecting it in the panel wizard on the right.

  • IFI: Shows the top-k important features for an individual prediction, illustrating the local behavior of the underlying model on a specific data point.
  • What-If: Lets you change the feature values of a selected data point and observe the resulting change in prediction by generating a hypothetical data point with the new values.
  • ICE: Shows how the model's prediction for a selected real data point changes as a single feature's value is varied across a range.
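The ICE and What-If ideas above can be sketched together (with a synthetic model and data, not the tool's own implementation): hold one data point fixed, sweep a single feature over a grid, and record the predicted probability at each hypothetical value:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
# Synthetic data: feature 0 dominates the label.
X = rng.normal(size=(100, 2))
y = (X[:, 0] + 0.1 * X[:, 1] > 0).astype(int)
model = LogisticRegression().fit(X, y)

# ICE sketch: keep one real data point fixed and sweep feature 0
# across a grid, recording the predicted positive-class probability.
point = X[0].copy()
grid = np.linspace(-3, 3, 7)
ice = []
for v in grid:
    hypothetical = point.copy()
    hypothetical[0] = v  # What-If: change one feature's value
    ice.append(model.predict_proba([hypothetical])[0, 1])

print([round(p, 3) for p in ice])
```

The resulting curve shows the local behavior of the model at that point: the predicted probability rises as feature 0 increases.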

Individual Predictions
