Machine Learning with Viya® 3.4® Lesson 6: Model Assessment and Deployment Flashcards by Nicole Fox

How would you add a challenger model to a pipeline comparison in Model Studio?

Select the model in its pipeline.

Which assessment measure should be used to determine the champion model for predicting an interval target?

Average Square Error
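
As a rough illustration of how this measure ranks models, the sketch below computes average squared error for two sets of predictions and picks the lower one. The function name, the data, and the model names are hypothetical; Model Studio computes this statistic itself.

```python
def average_squared_error(actual, predicted):
    """Mean of the squared differences between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical interval target and predictions from two candidate models.
actual = [3.0, 5.0, 2.0, 8.0]
model_a = [2.5, 5.5, 2.0, 7.0]
model_b = [3.0, 4.0, 4.0, 8.0]

ase_a = average_squared_error(actual, model_a)
ase_b = average_squared_error(actual, model_b)

# Lower average squared error is better.
champion = "model_a" if ase_a < ase_b else "model_b"
print(ase_a, ase_b, champion)
```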

How would you identify which model has the best classification accuracy using C-statistic values?

The model with the highest C-statistic value has the best performance.
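
One common way to think about the C-statistic for a binary target is as the share of (event, non-event) pairs in which the event receives the higher score, with ties counted as half. The sketch below follows that definition; the function name and the data are made up for illustration.

```python
def c_statistic(labels, scores):
    """Share of (event, non-event) pairs ranked correctly; ties count as 0.5."""
    events = [s for y, s in zip(labels, scores) if y == 1]
    nonevents = [s for y, s in zip(labels, scores) if y == 0]
    if not events or not nonevents:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for e in events:
        for n in nonevents:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(events) * len(nonevents))


# Hypothetical target values and scores from two candidate models.
labels = [1, 0, 1, 0, 1, 0]
scores_a = [0.9, 0.2, 0.8, 0.4, 0.3, 0.1]
scores_b = [0.6, 0.5, 0.4, 0.7, 0.3, 0.2]

c_a = c_statistic(labels, scores_a)
c_b = c_statistic(labels, scores_b)
best = "model_a" if c_a > c_b else "model_b"
print(c_a, c_b, best)
```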

Which dataset partition will be used to select the champion model when using the default settings in Model Studio?

Validation

What type of data is used during champion-challenger testing to compare the performance of the currently deployed model and a challenger model during the model deployment phase?

Champion-challenger testing compares the models' performance on historical data during model deployment.

What are the primary considerations for choosing an appropriate model selection statistic?

business needs
the prediction type
the measurement level

The confusion matrix is the foundation of which assessment plot?

the ROC chart

A confusion matrix helps you classify which type of target?

Binary

Which validation method would you recommend for a small dataset?

Cross-validation
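
A minimal sketch of the idea behind k-fold cross-validation: every row is used for validation exactly once and for training in the other folds, so no data is permanently held out. The helper name k_fold_indices is hypothetical, and rows are assigned to folds in interleaved order without shuffling, which real tools usually add.

```python
def k_fold_indices(n_rows, k):
    """Return k (training, validation) index splits over range(n_rows)."""
    if not 2 <= k <= n_rows:
        raise ValueError("k must be between 2 and n_rows")
    # Interleaved assignment: row i goes to fold i % k.
    folds = [list(range(i, n_rows, k)) for i in range(k)]
    splits = []
    for i, validation in enumerate(folds):
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((training, validation))
    return splits


splits = k_fold_indices(10, 5)
for training, validation in splits:
    print("train:", training, "validate:", validation)
```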

A cumulative lift chart shows that a machine learning model has a lift of 2.6 at a depth of 10%. What does this mean?

For the top 10% of cases, the machine learning model captures 2.6 times as many primary outcome cases as a random model.

What model fit statistics are recommended for a decision prediction?

accuracy or misclassification

What are two commonly used performance statistics for estimate predictions?

Schwarz’s Bayesian Criterion (SBC)

Average Squared Error

What assessment measure would you use to assess the probability of a customer responding to a targeted ad campaign?

the Gains chart

What is another term for gains chart?

Cumulative Percentile Hits chart

What is another term for a Cumulative Lift chart?

a Gains chart

What is a cumulative lift chart?

A lift chart shows how well the model performs compared with using no model. Lift is the ratio of the result obtained with the model to the result obtained with no model.

Cumulative lift is plotted as a percent on the vertical axis.

What’s the calculation for the lift for a given percentile when evaluating a Cumulative Captured Response (Gains) Chart?

Divide the Model Response Rate by the Random Response Rate:

Lift(P, M) = CPH(P, M) / P

where P is a given percentile
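
The calculation can be sketched as follows. This assumes that CPH(P, M) is read as the fraction of all primary outcome cases captured in the top P of cases when they are sorted by the model's score, and that P is expressed as a fraction rather than a percent; the function name and the data are hypothetical.

```python
def cumulative_lift(labels, scores, depth):
    """Lift at a given depth: captured share of events divided by the depth."""
    if not 0 < depth <= 1:
        raise ValueError("depth must be in (0, 1]")
    total_events = sum(labels)
    if total_events == 0:
        raise ValueError("need at least one primary outcome case")
    # Sort cases from highest to lowest score and keep the top fraction.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, round(depth * len(ranked)))
    captured = sum(y for _, y in ranked[:n_top])
    cph = captured / total_events      # share of all events captured so far
    actual_depth = n_top / len(ranked)  # share of cases selected so far
    return cph / actual_depth


# Hypothetical data: 20 scored cases, 5 of them primary outcome cases.
scores = [(20 - i) / 20 for i in range(20)]
labels = [1, 1, 0, 1, 0, 0, 0, 1, 0, 0,
          0, 0, 1, 0, 0, 0, 0, 0, 0, 0]

lift_at_20 = cumulative_lift(labels, scores, 0.20)
lift_at_100 = cumulative_lift(labels, scores, 1.0)
print(lift_at_20, lift_at_100)
```

In this made-up data the top 20% of cases holds 3 of the 5 events, so the lift is about 3, and at full depth the lift falls back to 1 because the model can do no better than selecting everyone.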

What are two things you can investigate in an ICE plot?

Subgroups and interactions among model variables.

What do level differences in an ICE plot suggest?

group effects

What does an intersecting slope in an ICE plot indicate?

Interactions between the plot variable and one or more additional model variables

What are two things to look for in an ICE plot?

intersecting slopes
level differences
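
To make these two patterns concrete, here is a small sketch of how ICE curves are usually built: for each observation, the plot variable is varied over a grid while the observation's other inputs stay fixed, and the model's prediction is recorded at each grid value. The scoring function toy_model is hypothetical and deliberately includes an interaction between x and a segment variable.

```python
def toy_model(x, segment):
    """Hypothetical scoring function with an x-by-segment interaction."""
    return 1.0 + 2.0 * x if segment == 0 else 3.0 - 1.0 * x


def ice_curves(model, rows, grid):
    """One curve per row: vary x over the grid, hold the row's segment fixed."""
    return [[model(x, segment) for x in grid] for (_, segment) in rows]


# Hypothetical observations as (x, segment) pairs.
rows = [(0.2, 0), (0.7, 0), (0.4, 1)]
grid = [0.0, 0.5, 1.0]

curves = ice_curves(toy_model, rows, grid)

# Overall slope of each curve across the grid.
slopes = [(c[-1] - c[0]) / (grid[-1] - grid[0]) for c in curves]
# Starting level of each curve.
levels = [c[0] for c in curves]
print(curves, slopes, levels)
```

Here the segment 1 curve starts at a different level and has a slope of opposite sign, so it crosses the segment 0 curves, which is the kind of pattern that points to an interaction.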

Which machine learning model is the easiest to interpret?

Decision trees are highly interpretable because they are based on English rules, which are rules that use Boolean logic.

Which dataset partition assists in comparing possible models?

Validation

Which dataset partition generates the possible models?

Training data

Which dataset partition should be used to assess how the final model generalizes to new data?

Test data

What does a PD plot show you?

How model inputs affect the model's predictions

What does publishing a model do?

Publishing a model makes its score code available in CAS, Teradata, or Hadoop.

You want to predict the rankings of a target variable and have built multiple models. Which selection statistic should be used to compare your models?

the ROC Index or the Gini Coefficient

Which model fit statistics are commonly used for ranking predictions?

ROC Index
Gini Coefficient
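
For a binary target the two statistics are directly related: the Gini coefficient is commonly defined as twice the ROC index minus one, so both rank a set of models in the same order. A minimal sketch, with a hypothetical function name and made-up ROC index values:

```python
def gini_from_roc_index(roc_index):
    """Gini coefficient for a binary target, given the ROC index (area under the curve)."""
    if not 0.0 <= roc_index <= 1.0:
        raise ValueError("ROC index must be between 0 and 1")
    return 2.0 * roc_index - 1.0


# Hypothetical ROC index values for three candidate models.
roc_indexes = {"tree": 0.75, "forest": 0.875, "random_guess": 0.5}
ginis = {name: gini_from_roc_index(value) for name, value in roc_indexes.items()}

best_by_roc = max(roc_indexes, key=roc_indexes.get)
best_by_gini = max(ginis, key=ginis.get)
print(ginis, best_by_roc, best_by_gini)
```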

How do you register a model in Model Studio?

Select Register Models from the Project Pipeline menu on the Pipeline Comparison tab.

What does registering a model do?

Registering the model in Model Studio makes the model available in Model Manager.

Which of the following statements is true about an ROC chart?
a. The selection value of each point is displayed on the chart.
b. Each point on the chart corresponds to a specific fraction of the sorted data.
c. True positives are on the x axis.
d. For a perfect model, the ROC curve is a straight line from the bottom left corner to the top right corner of the plot.

In an ROC chart, each point corresponds to a specific fraction of the sorted data.
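
The sketch below shows one way to see this: sort the cases by score from highest to lowest, then step through the sorted list and record one point per depth. It uses the conventional axes, with the false positive rate (1 minus specificity) on the x axis and sensitivity on the y axis. The function name and data are hypothetical, and tied scores are not handled.

```python
def roc_points(labels, scores):
    """One (false positive rate, true positive rate) point per depth of the sorted data."""
    ranked = [y for _, y in sorted(zip(scores, labels),
                                   key=lambda pair: pair[0], reverse=True)]
    positives = sum(ranked)
    negatives = len(ranked) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one event and one non-event")
    points = [(0.0, 0.0)]
    true_pos = false_pos = 0
    for y in ranked:
        # Selecting one more case adds either a true or a false positive.
        if y == 1:
            true_pos += 1
        else:
            false_pos += 1
        points.append((false_pos / negatives, true_pos / positives))
    return points


# Hypothetical binary target and model scores.
labels = [1, 0, 1, 0]
scores = [0.9, 0.7, 0.6, 0.2]
points = roc_points(labels, scores)
print(points)
```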

What kind of model can you import using a Score Code Import node?

A Score Code Import node can import only an ASTORE model or a model contained in a single DATA step file.

What must you do before you can score and manage a model in Model Studio?

In model comparison, the best model has the highest value of which measure?

Sensitivity

What is the true positive rate referred to as?

Sensitivity

What is sensitivity?

The true positive rate

Calculate Sensitivity:

Divide the true positive decisions by the total number of known primary cases:
Sensitivity = TruePositive / (TruePositive + FalseNegative)

What is the true negative rate referred to as?

Specificity

Calculate Specificity:

Divide the true negative decisions by the total number of known secondary cases:
Specificity = TrueNegative / (TrueNegative + FalsePositive)
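
As a worked illustration, the sketch below tallies a confusion matrix from actual outcomes and model decisions, then applies the standard definitions of sensitivity (true positives over all known primary cases) and specificity (true negatives over all known secondary cases). The function names and data are hypothetical.

```python
def confusion_counts(labels, decisions):
    """Return (true_pos, false_pos, true_neg, false_neg) for binary 0/1 inputs."""
    true_pos = sum(1 for y, d in zip(labels, decisions) if y == 1 and d == 1)
    false_pos = sum(1 for y, d in zip(labels, decisions) if y == 0 and d == 1)
    true_neg = sum(1 for y, d in zip(labels, decisions) if y == 0 and d == 0)
    false_neg = sum(1 for y, d in zip(labels, decisions) if y == 1 and d == 0)
    return true_pos, false_pos, true_neg, false_neg


def sensitivity(true_pos, false_neg):
    """True positive rate: share of known primary cases decided correctly."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """True negative rate: share of known secondary cases decided correctly."""
    return true_neg / (true_neg + false_pos)


# Hypothetical outcomes (1 = primary, 0 = secondary) and model decisions.
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
decisions = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

tp, fp, tn, fn = confusion_counts(labels, decisions)
sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
print(tp, fp, tn, fn, sens, spec)
```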

(41 cards)