Path3.Mod1.h - Automated Machine Learning - Chart Analysis Flashcards

1
Q

Good vs Bad Confusion Matrix

A

A good model produces a Confusion Matrix with most samples along the diagonal from top left to bottom right. The more samples that fall off the diagonal and the more they are distributed throughout the matrix, the worse the model is.
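As a minimal sketch (with made-up labels for a hypothetical 3-class problem), scikit-learn's `confusion_matrix` makes the diagonal concentration easy to check numerically:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical true labels and predictions for a 3-class problem.
y_true = np.array([0, 0, 1, 1, 2, 2, 2, 1])
y_pred = np.array([0, 0, 1, 2, 2, 2, 0, 1])

cm = confusion_matrix(y_true, y_pred)
# A good model concentrates counts on the diagonal; off-diagonal
# entries are misclassifications.
diagonal_fraction = np.trace(cm) / cm.sum()
print(cm)
print(f"fraction on diagonal: {diagonal_fraction:.2f}")
```

Here 6 of 8 samples land on the diagonal, so the fraction is 0.75; a perfect model would score 1.0.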

2
Q

Good vs Bad ROC Curve

A

A good model's curve approaches the top left corner as closely as possible, reaching 100% TPR (True Positive Rate) at 0% FPR (False Positive Rate). Compare this to a random model, whose curve follows the diagonal y = x.
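A short sketch with made-up scores shows the extreme case: when positives all score higher than negatives, the ROC curve hits the top left corner and the area under it (AUC) is 1, versus 0.5 for a random model on the diagonal:

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

# Hypothetical scores: a good model gives positives higher scores.
y_true  = np.array([0, 0, 0, 1, 1, 1])
y_score = np.array([0.1, 0.2, 0.4, 0.6, 0.8, 0.9])

fpr, tpr, thresholds = roc_curve(y_true, y_score)
# Perfect separation: the curve reaches TPR = 1 at FPR = 0, so AUC = 1;
# a random model follows y = x with AUC = 0.5.
auc = roc_auc_score(y_true, y_score)
print(f"AUC = {auc:.2f}")
```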

3
Q

Good vs Bad Precision-Recall Curve (what the Model should maintain)

A

A good model maintains precision near 1 (100%) as recall grows toward 100%, with the curve approaching the top right corner as closely as possible.
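With made-up scores that separate the classes perfectly, scikit-learn's `precision_recall_curve` produces a point at the ideal top right corner (precision = 1 at recall = 1):

```python
import numpy as np
from sklearn.metrics import precision_recall_curve

# Hypothetical scores: positives (0.7-0.9) outrank all negatives.
y_true  = np.array([0, 0, 1, 1, 1, 0])
y_score = np.array([0.2, 0.3, 0.7, 0.8, 0.9, 0.1])

precision, recall, _ = precision_recall_curve(y_true, y_score)
# A good model keeps precision near 1 while recall grows toward 1,
# hugging the top right corner of the plot.
perfect_point = any(p == 1.0 and r == 1.0 for p, r in zip(precision, recall))
print(perfect_point)
```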

4
Q

Good vs Bad Cumulative Gains Curve (What the chart indicates w.r.t. Responses)

A

Indicates the fraction of positive responses captured (y-axis) versus the fraction of samples considered (x-axis). A good model ranks positive samples above negative ones, producing two straight segments: one from the origin to (x, 1) with slope 1/x, where x is the fraction of samples that are positive, and a second, flat segment from (x, 1) to (1, 1). If every sample is ranked correctly, cumulative gain reaches 100% within the first x% of samples considered.

Put another way, this tells you how quickly your model finds the positive samples.
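There is no single canonical scikit-learn function for this chart, but the y-axis values can be sketched in a few lines of NumPy (with made-up labels and a hypothetical perfect ranking):

```python
import numpy as np

def cumulative_gains(y_true, y_score):
    """Share of all positives captured within the top-k ranked samples."""
    order = np.argsort(-y_score)      # rank samples by score, descending
    hits = np.cumsum(y_true[order])   # positives found so far
    return hits / y_true.sum()        # y-axis: fraction of positives captured

# Hypothetical scores: the model ranks both positives first.
y_true  = np.array([1, 1, 0, 0, 0])
y_score = np.array([0.9, 0.8, 0.3, 0.2, 0.1])

gains = cumulative_gains(y_true, y_score)
# With perfect ranking, gains reach 100% after the first 2 of 5 samples
# (40% of samples are positive), then stay flat at 1.0.
print(gains)
```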

5
Q

Good vs Bad Lift Curve

A

The Lift Curve shows how many times better a model performs than a random model, whose lift is constant at 1. A good model's curve sits higher on the chart, farther above the x-axis.
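One way to compute lift (a sketch with made-up labels, reusing the ranking idea from the cumulative gains card): divide the model's precision among the top-k ranked samples by the overall positive rate, which is what a random model would achieve at any depth:

```python
import numpy as np

def lift_curve(y_true, y_score):
    """Lift at each depth k: positive rate in the top k ranked samples
    divided by the overall positive rate (a random model's lift is 1)."""
    order = np.argsort(-y_score)
    hits = np.cumsum(y_true[order])
    k = np.arange(1, len(y_true) + 1)
    precision_at_k = hits / k          # positives among the top k
    base_rate = y_true.mean()          # random model's precision
    return precision_at_k / base_rate

# Hypothetical scores with a perfect ranking.
y_true  = np.array([1, 1, 0, 0, 0])
y_score = np.array([0.9, 0.8, 0.3, 0.2, 0.1])

lift = lift_curve(y_true, y_score)
print(lift)  # starts at 1/0.4 = 2.5, decays to 1.0 at full depth
```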

6
Q

Good vs Bad Calibration Curve

A

Depicts the model's ability to assign confidence to predictions, not to make predictions. A perfectly calibrated model follows y = x, predicting exactly the probability that samples belong to each class. Overconfident models form a backward "S" across that line, while underconfident ones form an "S" across it.
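scikit-learn's `calibration_curve` computes the points of this chart by binning predicted probabilities and comparing each bin's mean prediction to the observed fraction of positives (labels and probabilities below are made up):

```python
import numpy as np
from sklearn.calibration import calibration_curve

# Hypothetical predicted probabilities vs. actual outcomes.
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1])
y_prob = np.array([0.1, 0.2, 0.15, 0.25, 0.8, 0.9, 0.85, 0.7, 0.75, 0.95])

# frac_pos: observed fraction of positives per bin (y-axis);
# mean_pred: mean predicted probability per bin (x-axis).
frac_pos, mean_pred = calibration_curve(y_true, y_prob, n_bins=2)
# A calibrated model would have mean_pred ~= frac_pos in every bin,
# i.e. all points on the y = x line.
print(mean_pred, frac_pos)
```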

7
Q

Explain Dataset Cohorts

A

You can slice your data into dataset cohorts to analyze your model's performance and explanations across subgroups. Doing so helps you understand why errors happen in one group versus another.
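The idea can be sketched with pandas (the feature, labels, and predictions below are made up): group predictions by a cohort-defining feature and compare per-group accuracy:

```python
import pandas as pd

# Hypothetical predictions with a feature to slice cohorts on.
df = pd.DataFrame({
    "age_group": ["<40", "<40", "<40", "40+", "40+", "40+"],
    "y_true":    [1, 0, 1, 1, 0, 0],
    "y_pred":    [1, 0, 1, 0, 1, 1],
})

# Compare performance across subgroups to see where errors concentrate.
df["correct"] = df["y_true"] == df["y_pred"]
accuracy_by_cohort = df.groupby("age_group")["correct"].mean()
print(accuracy_by_cohort)
```

In this toy data the model is perfect on the `<40` cohort and wrong on every `40+` sample, exactly the kind of disparity cohort analysis is meant to surface.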

8
Q

ES CC DGT

Explain Dataset Explorer

A

Provides functionality to extract insights into your data versus your training results:
- Explore statistics by selecting different filters along the X, Y, and color axes to display data along different dimensions
- Create cohorts to analyze statistics with filters such as predicted outcome, dataset features, and error groups
- View different graph types to better visualize results (use the gear icon in the upper right-hand corner)

9
Q

Explain Aggregate Feature Importance

A

Explore the top-k important features that impact model predictions (also known as a Global Explanation).
- Use the slider to show feature importance values in descending order
- Select up to three cohorts to see their feature importance values side by side
- Select any feature bar in the graph to see how values of the selected feature impact model prediction
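One common way to compute such a global explanation (a sketch, not the tool's exact method) is permutation importance: shuffle each feature and measure how much the model's score drops. Here synthetic data is built so only the first feature matters:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
# Synthetic data: only feature 0 determines the label.
X = rng.normal(size=(200, 3))
y = (X[:, 0] > 0).astype(int)

model = LogisticRegression().fit(X, y)
# Shuffling an important feature hurts the score; shuffling an
# irrelevant one barely changes it.
result = permutation_importance(model, X, y, n_repeats=5, random_state=0)
ranking = np.argsort(-result.importances_mean)  # descending importance
print(ranking)
```

As expected, feature 0 tops the ranking, mirroring the tallest bar in an aggregate feature importance chart.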


10
Q

Explain Individual Feature Importance, What-If and Individual Conditional Expectation (ICE)

A

Drill into individual data points and their individual feature importances. You can load the individual feature importance plot for any data point by clicking on it in the main scatter plot or selecting it in the panel wizard on the right.

  • IFI: Shows the top-k important features for an individual prediction, illustrating the local behavior of the underlying model on a specific data point.
  • What-If: Lets you change the feature values of a selected data point and observe the resulting change in prediction by generating a hypothetical data point with the new values.
  • ICE: Shows how the model's prediction for a selected real data point changes as a single feature's value is varied across a range.
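The ICE and What-If ideas above can be sketched together (with a synthetic model and data, not the tool's own implementation): hold one data point fixed, sweep a single feature over a grid, and record the predicted probability at each hypothetical value:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
# Synthetic data: feature 0 dominates the label.
X = rng.normal(size=(100, 2))
y = (X[:, 0] + 0.1 * X[:, 1] > 0).astype(int)
model = LogisticRegression().fit(X, y)

# ICE sketch: keep one real data point fixed and sweep feature 0
# across a grid, recording the predicted positive-class probability.
point = X[0].copy()
grid = np.linspace(-3, 3, 7)
ice = []
for v in grid:
    hypothetical = point.copy()
    hypothetical[0] = v  # What-If: change one feature's value
    ice.append(model.predict_proba([hypothetical])[0, 1])

print([round(p, 3) for p in ice])
```

The resulting curve shows the local behavior of the model at that point: the predicted probability rises as feature 0 increases.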

Individual Predictions
