test3 Flashcards

https://www.examgo.com/exams/microsoft/dp-100/ (69 cards)

1
Q

You plan to use a Deep Learning Virtual Machine (DLVM) to train deep learning models using Compute Unified Device Architecture (CUDA) computations.

You need to configure the DLVM to support CUDA.

What should you implement?

Intel Software Guard Extensions (Intel SGX) technology
Solid State Drives (SSD)
Graphics Processing Unit (GPU)
Central Processing Unit (CPU) speed increase by using overclocking
High Random Access Memory (RAM) configuration

A
2
Q

HOTSPOT

You have a dataset that contains 2,000 rows. You are building a machine learning classification model by using Azure Machine Learning Studio. You add a Partition and Sample module to the experiment.

You need to configure the module.

You must meet the following requirements:

✑ Divide the data into subsets

✑ Assign the rows into folds using a round-robin method

✑ Allow rows in the dataset to be reused

How should you configure the module? To answer, select the appropriate options in the dialog box in the answer area. NOTE: Each correct selection is worth one point.
Partition and Sample
Partition or sample mode: ▼
  Assign to Folds
  Pick Fold
  Sampling
  Head
☐ Use replacement in the partitioning
☐ Randomized split

A
3
Q

HOTSPOT

You create an Azure Machine Learning workspace and set up a development environment. You plan to train a deep neural network (DNN) by using the TensorFlow framework and by using estimators to submit training scripts.

You must optimize computation speed for training runs.

You need to choose the appropriate estimator to use as well as the appropriate training compute target configuration.

Which values should you use? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.
Answer Area

Parameter Value

Estimator:
Estimator
SKLearn
PyTorch
TensorFlow
Chainer

Training compute:
12 vCPU, 48 GB memory, 96 GB SSD
12 vCPU, 112 GB memory, 680 GB SSD, 2 GPU, 24 GB GPU memory
16 vCPU, 128 GB memory, 160 GB HDD, 80 GB NVME disk (4000 MBps)
44 vCPU, 352 GB memory, 3.4 GHz CPU frequency all cores
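For context, a minimal sketch of a GPU-accelerated TensorFlow estimator (the TensorFlow estimator class lives in azureml.train.dnn; gpu_cluster, the folder, and the script names here are hypothetical):

from azureml.train.dnn import TensorFlow

estimator = TensorFlow(source_directory='scripts',
                       entry_script='train.py',
                       compute_target=gpu_cluster,  # GPU-enabled compute speeds up DNN training
                       use_gpu=True)
run = experiment.submit(estimator)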

A
4
Q

You create a Python script that runs a training experiment in Azure Machine Learning. The script uses the Azure Machine Learning SDK for Python.

You must add a statement that retrieves the names of the logs and outputs generated by the script.

You need to reference a Python class object from the SDK for the statement.

Which class object should you use?

Run
ScriptRunConfig
Workspace
Experiment
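For context, a minimal sketch of listing a run's generated logs and outputs with the Run class (assumes the statement executes inside an experiment script):

from azureml.core import Run

run = Run.get_context()      # reference to the current experiment run
print(run.get_file_names())  # names of the logs and outputs generated so far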

A
5
Q

You register a file dataset named csv_folder that references a folder. The folder includes multiple comma-separated values (CSV) files in an Azure storage blob container. You plan to use the following code to run a script that loads data from the file dataset.

You create and instantiate the following variables:
Variable Description

remote_cluster References the Azure Machine Learning compute cluster
ws References the Azure Machine Learning workspace

You have the following code:
from azureml.train.estimator import Estimator

file_dataset = ws.datasets.get('csv_folder')
estimator = Estimator(source_directory=script_folder,
                      # code comment
                      compute_target=remote_cluster,
                      entry_script='script.py')
run = experiment.submit(config=estimator)

You need to pass the dataset to ensure that the script can read the files it references.

Which code segment should you insert to replace the code comment?

inputs=[file_dataset.as_named_input('training_files').to_pandas_dataframe()],
inputs=[file_dataset.as_named_input('training_files').as_mount()],
script_params={'--training_files': file_dataset},
inputs=[file_dataset.as_named_input('training_files')],
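For context, a file dataset is typically handed to a remote training script as a mount, so the script reads the referenced files from a mounted path instead of loading them into memory; a minimal sketch using the variables defined above:

estimator = Estimator(source_directory=script_folder,
                      inputs=[file_dataset.as_named_input('training_files').as_mount()],
                      compute_target=remote_cluster,
                      entry_script='script.py')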

A
6
Q

HOTSPOT

You are using an Azure Machine Learning workspace. You set up an environment for model testing and an environment for production.

The compute target for testing must minimize cost and deployment efforts. The compute target for production must provide fast response time, autoscaling of the deployed service, and support real-time inferencing.

You need to configure compute targets for model testing and production.

Which compute targets should you use? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.
Environment Compute target

Testing:
Local web service
Azure Kubernetes Service (AKS)
Azure Container Instances
Azure Machine Learning compute clusters

Production:
Local web service
Azure Kubernetes Service (AKS)
Azure Container Instances
Azure Machine Learning compute clusters

A
7
Q

You have a dataset that includes confidential data. You use the dataset to train a model.

You must use a differential privacy parameter to keep the data of individuals safe and private.

You need to reduce the effect of user data on aggregated results.

What should you do?

Decrease the value of the epsilon parameter to reduce the amount of noise added to the data
Increase the value of the epsilon parameter to decrease privacy and increase accuracy
Decrease the value of the epsilon parameter to increase privacy and reduce accuracy
Set the value of the epsilon parameter to 1 to ensure maximum privacy
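For context, the Laplace mechanism adds noise scaled by sensitivity/epsilon, so decreasing epsilon adds more noise (stronger privacy, lower accuracy); a minimal NumPy illustration:

import numpy as np

def laplace_mechanism(true_value, sensitivity, epsilon):
    # Noise scale grows as epsilon shrinks: more privacy, less accuracy
    return true_value + np.random.laplace(loc=0.0, scale=sensitivity / epsilon)

print(laplace_mechanism(100.0, sensitivity=1.0, epsilon=0.1))   # very noisy result
print(laplace_mechanism(100.0, sensitivity=1.0, epsilon=10.0))  # close to the true value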

A
8
Q

You create a pipeline in designer to train a model that predicts automobile prices.

Because of non-linear relationships in the data, the pipeline calculates the natural log (Ln) of the prices in the training data, trains a model to predict this natural log of price value, and then calculates the exponential of the scored label to get the predicted price.

The training pipeline is shown in the exhibit. (Click the Training pipeline tab.)

Training pipeline (module flow):

Automobile data
  → Apply Math Operation: replace price with Ln(price)
  → Split Data: 70% train / 30% validate
  → Train Model (Linear Regression) on the 70% split, predicting Ln(price)
  → Score Model on the 30% split, producing the Ln(price) prediction
  → Apply Math Operation: replace Scored Labels with Exp(Scored Labels)
  → Apply SQL Transformation: SELECT [Scored Labels] AS predicted_price
You create a real-time inference pipeline from the training pipeline, as shown in the exhibit. (Click the Real-time pipeline tab.)

Real-time pipeline (module flow):

Web Service Input and Automobile data (both feed the first module)
  → Apply Math Operation: replace price with Ln(price)
  → MD-Automobile_Price_Regress… (trained model)
  → Score Model, producing the Ln(price) prediction
      → Web Service Output (connected directly to Score Model)
      → Apply Math Operation: replace Scored Labels with Exp(Scored Labels)
          → Apply SQL Transformation: SELECT [Scored Labels] AS predicted_price

You need to modify the inference pipeline to ensure that the web service returns the exponential of the scored label as the predicted automobile price and that client applications are not required to include a price value in the input values.

Which three modifications must you make to the inference pipeline? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point.

Connect the output of the Apply SQL Transformation to the Web Service Output module.
Replace the Web Service Input module with a data input that does not include the price column.
Add a Select Columns module before the Score Model module to select all columns other than price.
Replace the training dataset module with a data input that does not include the price column.
Remove the Apply Math Operation module that replaces price with its natural log from the data flow.
Remove the Apply SQL Transformation module from the data flow.

A
9
Q

You plan to use automated machine learning to train a regression model. The training data includes features with missing values and categorical features with a small number of distinct values.

You need to configure automated machine learning to automatically impute missing values and encode categorical features as part of the training task.

Which parameter and value pair should you use in the AutoMLConfig class?

featurization = 'auto'
enable_voting_ensemble = True
task = 'classification'
exclude_nan_labels = True
enable_tf = True
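For context, a minimal AutoMLConfig sketch for a regression task (train_ds and the 'price' label column are hypothetical names):

from azureml.train.automl import AutoMLConfig

automl_config = AutoMLConfig(task='regression',
                             training_data=train_ds,
                             label_column_name='price',
                             primary_metric='normalized_root_mean_squared_error',
                             featurization='auto')  # auto-imputes missing values and encodes categoricals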

A
10
Q

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You are creating a new experiment in Azure Machine Learning Studio.

One class has a much smaller number of observations than the other classes in the training set.

You need to select an appropriate data sampling strategy to compensate for the class imbalance.

Solution: You use the Stratified split for the sampling mode.

Does the solution meet the goal?

Yes
No

A
11
Q

HOTSPOT

You use Azure Machine Learning to train and register a model.

You must deploy the model into production as a real-time web service to an inference cluster named service-compute that the IT department has created in the Azure Machine Learning workspace.

Client applications consuming the deployed web service must be authenticated based on their Azure Active Directory service principal.

You need to write a script that uses the Azure Machine Learning SDK to deploy the model.

The necessary modules have been imported.

How should you complete the code? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.

Assume the necessary modules have been imported
deploy_target = __________▼(ws, "service-compute")
    AksCompute
    AmlCompute
    RemoteCompute
    BatchCompute

deployment_config = __________▼.deploy_configuration(cpu_cores=1, memory_gb=1,
    AksWebservice
    AciWebservice
    LocalWebservice
                                                     _____________▼)
    token_auth_enabled=True
    token_auth_enabled=False
    auth_enabled=True
    auth_enabled=False

service = Model.deploy(ws, "ml-service",
                       [model], inference_config, deployment_config, deploy_target)
service.wait_for_deployment(show_output=True)

A
12
Q

HOTSPOT

You are hired as a data scientist at a winery. The previous data scientist used Azure Machine Learning.

You need to review the models and explain how each model makes decisions.

Which explainer modules should you use? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.

Answer Area
Model type Explainer

A random forest model for predicting the alcohol
content in wine given a set of covariates:
Tabular
HAN
Text
Image

A natural language processing model for
analyzing field reports:
Tree
HAN
Text
Image

An image classifier that determines the quality of
the grape based upon its physical characteristics:
Kernel
HAN
Text
Image

A
13
Q

You use the Azure Machine Learning Python SDK to define a pipeline to train a model.

The data used to train the model is read from a folder in a datastore.

You need to ensure the pipeline runs automatically whenever the data in the folder changes.

What should you do?

Set the regenerate_outputs property of the pipeline to True
Create a ScheduleRecurrence object with a Frequency of auto. Use the object to create a Schedule for the pipeline
Create a PipelineParameter with a default value that references the location where the training data is stored
Create a Schedule for the pipeline. Specify the datastore in the datastore property, and the folder containing the training data in the path_on_datastore property
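For context, a minimal sketch of a change-based schedule that polls a datastore path (published_pipeline, datastore, and the folder name are hypothetical):

from azureml.pipeline.core import Schedule

schedule = Schedule.create(ws, name='retrain-on-new-data',
                           pipeline_id=published_pipeline.id,
                           experiment_name='retrain',
                           datastore=datastore,
                           path_on_datastore='training-data')  # runs when files in this folder change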

A
14
Q

HOTSPOT

You need to configure the Edit Metadata module so that the structure of the datasets matches.

Which configuration options should you select? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.
Answer Area

Properties
Project
◿ Edit Metadata

Column
  Selected columns:
    Column names: Median Value
  Launch column selector

Data type: ▼
  Floating point
  DateTime
  TimeSpan
  Integer

Categorical: ▼
  Unchanged
  Make Categorical
  Make Uncategorical

Fields
5

A
15
Q

You run a script as an experiment in Azure Machine Learning.

You have a Run object named run that references the experiment run. You must review the log files that were generated during the experiment run.

You need to download the log files to a local folder for review.

Which two code segments can you run to achieve this goal? Each correct answer presents a complete solution. NOTE: Each correct selection is worth one point.

run.get_details()
run.get_file_names()
run.get_metrics()
run.download_files(output_directory='./runfiles')
run.get_all_logs(destination='./runlogs')

A
16
Q

You are conducting feature engineering to prepare data for further analysis.

The data includes seasonal patterns in inventory requirements.

You need to select the appropriate method to conduct feature engineering on the data.

Which method should you use?

Exponential Smoothing (ETS) function.
One Class Support Vector Machine module
Time Series Anomaly Detection module
Finite Impulse Response (FIR) Filter module.
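For context, a minimal exponential smoothing sketch with statsmodels (the monthly inventory series below is fabricated for illustration):

import pandas as pd
from statsmodels.tsa.holtwinters import ExponentialSmoothing

# Hypothetical monthly inventory demand with a repeating seasonal pattern
inventory = pd.Series([120, 130, 150, 170, 160, 140] * 4,
                      index=pd.date_range('2020-01-01', periods=24, freq='MS'))
model = ExponentialSmoothing(inventory, trend='add', seasonal='add',
                             seasonal_periods=6).fit()
print(model.forecast(6))  # next six months, seasonality included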

A
17
Q

You register the following versions of a model.

Model name        Model version  Tags                              Properties
healthcare_model  3              'Training context':'CPU Compute'  value: 87.43
healthcare_model  2              'Training context':'CPU Compute'  value: 54.98
healthcare_model  1              'Training context':'CPU Compute'  value: 23.56
You use the Azure ML Python SDK to run a training experiment. You use a variable named run to reference the experiment run.

After the run has been submitted and completed, you run the following code:
run.register_model(model_path='outputs/model.pkl',
                   model_name='healthcare_model',
                   tags={'Training context': 'CPU Compute'})

For each of the following statements, select Yes if the statement is true. Otherwise, select No. NOTE: Each correct selection is worth one point.
Answer Area
The code will cause a previous version of the saved model to be overwritten.
The version number will now be 4.
The latest version of the stored model will have a property of value: 87.43.

A
18
Q

You develop and train a machine learning model to predict fraudulent transactions for a hotel booking website.

Traffic to the site varies considerably. The site experiences heavy traffic on Monday and Friday and much lower traffic on other days. Holidays are also high web traffic days.

You need to deploy the model as an Azure Machine Learning real-time web service endpoint on compute that can dynamically scale up and down to support demand.

Which deployment compute option should you use?

attached Azure Databricks cluster
Azure Container Instance (ACI)
Azure Kubernetes Service (AKS) inference cluster
Azure Machine Learning Compute Instance
attached virtual machine in a different region
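For context, autoscaling for a real-time service is configured on the AKS deployment itself; a minimal sketch (the replica bounds are illustrative):

from azureml.core.webservice import AksWebservice

deployment_config = AksWebservice.deploy_configuration(autoscale_enabled=True,
                                                       autoscale_min_replicas=1,
                                                       autoscale_max_replicas=10)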

A
19
Q

You plan to create a speech recognition deep learning model.

The model must support the latest version of Python.

You need to recommend a deep learning framework for speech recognition to include in the Data Science Virtual Machine (DSVM).

What should you recommend?

Apache Drill
TensorFlow
Rattle
Weka

A
20
Q

HOTSPOT

You are analyzing the asymmetry in a statistical distribution.

The following image contains two density curves that show the probability distribution of two datasets.

The image consists of two side-by-side graphs, each showing a solid curve (the dataset's density) and a dashed curve (a model's predicted distribution).

Graph 1, solid line: starts from the bottom left, rises steadily to a peak toward the middle-right, then gently declines to the right (unimodal; the long gradual rise gives it an extended left-hand tail). Dashed line: a smaller, narrower peak, slightly right-shifted relative to the solid line, indicating the model underestimates the earlier values and shifts the peak.

Graph 2, solid line: rises sharply near the beginning (left side), peaks early, then gradually declines toward the right (right-skewed, with a longer right tail). Dashed line: a smaller, more compact peak, also positioned toward the left but not closely matching the solid curve, indicating the model underestimates the longer tail and over-focuses on the peak.

Use the drop-down menus to select the answer choice that answers each question based on the information presented in the graphic. NOTE: Each correct selection is worth one point.
Answer Area

Question Answer choice

Which type of distribution is shown for the
dataset density curve of Graph 1?
Negative skew
Positive skew
Normal distribution
Bimodal distribution

Which type of distribution is shown for the
dataset density curve of Graph 2?
Negative skew
Positive skew
Normal distribution
Bimodal distribution
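For context, skewness is signed: a long right tail gives positive skew and a long left tail gives negative skew; a minimal illustration with SciPy:

import numpy as np
from scipy.stats import skew

right_skewed = np.random.exponential(scale=1.0, size=10_000)  # long right tail
left_skewed = -right_skewed                                   # mirrored: long left tail
print(skew(right_skewed))  # > 0: positive skew
print(skew(left_skewed))   # < 0: negative skew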

A
21
Q

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You are creating a model to predict the price of a student's artwork depending on the following variables: the student's length of education, degree type, and art form.

You start by creating a linear regression model.

You need to evaluate the linear regression model.

Solution: Use the following metrics: Relative Squared Error, Coefficient of Determination, Accuracy, Precision, Recall, F1 score, and AUC.
Does the solution meet the goal?

A. Yes
B. No

A
22
Q

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You are creating a model to predict the price of a student's artwork depending on the following variables: the student's length of education, degree type, and art form.

You start by creating a linear regression model.

You need to evaluate the linear regression model.

Solution: Use the following metrics: Accuracy, Precision, Recall, F1 score, and AUC.
Does the solution meet the goal?

A. Yes
B. No
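For context, linear regression models are evaluated with error-based metrics rather than classification metrics; a minimal scikit-learn sketch with fabricated values:

from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.8, 5.4, 2.9, 6.4]
print(mean_absolute_error(y_true, y_pred))  # Mean Absolute Error
print(mean_squared_error(y_true, y_pred))   # Mean Squared Error
print(r2_score(y_true, y_pred))             # Coefficient of Determination (R^2)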

A
23
Q

DRAG DROP

You are producing a multiple linear regression model in Azure Machine Learning Studio.

Several independent variables are highly correlated.

You need to select appropriate methods for conducting effective feature engineering on all the data.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.
Select and Place:
Action
Evaluate the probability function
Remove duplicate rows
Use the Filter Based Feature Selection module
Test the hypothesis using t-Test
Compute linear correlation
Build a counting transform

Answer area

A
24
Q

HOTSPOT

You are performing feature scaling by using the scikit-learn Python library for the x1, x2, and x3 features.

Original and scaled data is shown in the following image.

Use the drop-down menus to select the answer choice that answers each question based on the information presented in the graphic. NOTE: Each correct selection is worth one point.

Hot Area:
Answer Area
Question

Which scaler is used in graph A?
  Standard Scaler
  Min Max Scaler
  Normalizer

Which scaler is used in graph B?
  Standard Scaler
  Min Max Scaler
  Normalizer

Which scaler is used in graph C?
  Standard Scaler
  Min Max Scaler
  Normalizer


A

Which scaler is used in graph A?
Standard Scaler

Which scaler is used in graph B?
Min Max Scaler

Which scaler is used in graph C?
Normalizer
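For context, the three scalers behave quite differently; a minimal scikit-learn sketch with fabricated data:

import numpy as np
from sklearn.preprocessing import StandardScaler, MinMaxScaler, Normalizer

X = np.array([[1.0, 200.0], [2.0, 300.0], [3.0, 400.0]])
print(StandardScaler().fit_transform(X))  # zero mean, unit variance per feature (column)
print(MinMaxScaler().fit_transform(X))    # rescales each feature to the [0, 1] range
print(Normalizer().fit_transform(X))      # rescales each sample (row) to unit norm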

25
Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You train a classification model by using a logistic regression algorithm.

You must be able to explain the model's predictions by calculating the importance of each feature, both as an overall global relative importance value and as a measure of local importance for a specific set of predictions.

You need to create an explainer that you can use to retrieve the required global and local feature importance values.

Solution: Create a PFIExplainer.

Does the solution meet the goal?

A. Yes
B. No
26
Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You train a classification model by using a logistic regression algorithm.

You must be able to explain the model's predictions by calculating the importance of each feature, both as an overall global relative importance value and as a measure of local importance for a specific set of predictions.

You need to create an explainer that you can use to retrieve the required global and local feature importance values.

Solution: Create a TabularExplainer.

Does the solution meet the goal?

A. Yes
B. No
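For context, a minimal sketch of a global-and-local explainer using the azureml-interpret package (model, X_train, and X_test are hypothetical names for a trained model and its data):

from interpret.ext.blackbox import TabularExplainer

# TabularExplainer supports both global and local feature importance
explainer = TabularExplainer(model, X_train)
global_explanation = explainer.explain_global(X_test)    # overall importance
local_explanation = explainer.explain_local(X_test[0:5]) # per-prediction importance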
27
Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You train a classification model by using a logistic regression algorithm.

You must be able to explain the model's predictions by calculating the importance of each feature, both as an overall global relative importance value and as a measure of local importance for a specific set of predictions.

You need to create an explainer that you can use to retrieve the required global and local feature importance values.

Solution: Create a MimicExplainer.

Does the solution meet the goal?

A. Yes
B. No
28
HOTSPOT

You write code to retrieve an experiment run from your Azure Machine Learning workspace. The run used the model interpretation support in Azure Machine Learning to generate and upload a model explanation.

Business managers in your organization want to see the importance of the features in the model.

You need to print out the model features and their relative importance in an output that looks similar to the following:

Feature  Importance
0        1.5627435610083558
2        0.6077689312583112
4        0.5574002432900718
3        0.42858759955671777
1        0.3501361539771977

How should you complete the code? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.

Answer Area

Assume required modules are imported

ws = Workspace.from_config()
client = ExplanationClient.____________________________(
    workspace=ws,
    experiment_name='train_and_explain',
    run_id='train_and_explain_12345')
explanation = client.____________________________()
feature_importances = explanation.____________________________()
for key, value in feature_importances.items():
    print(key, "\t", value)

First blank: from_run / list_model_explanations / from_run_id / download_model_explanation
Second blank: upload_model_explanation / list_model_explanations / run / download_model_explanation
Third blank: explanation / explanation_client / get_feature_importance_dict / download_model_explanation
29
You use the Azure Machine Learning designer to create and run a training pipeline. The pipeline must be run every night to generate predictions from a large volume of files. The folder where the files will be stored is defined as a dataset.

You need to publish the pipeline as a REST service that can be used for the nightly inferencing run.

What should you do?

A. Create a batch inference pipeline
B. Set the compute target for the pipeline to an inference cluster
C. Create a real-time inference pipeline
D. Clone the pipeline
30
DRAG DROP

You create an Azure Machine Learning workspace and a new Azure DevOps organization. You register a model in the workspace and deploy the model to the target environment.

All new versions of the model registered in the workspace must automatically be deployed to the target environment.

You need to configure Azure Pipelines to deploy the model.

Which four actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Select and Place:

Actions
Create a service connection
Create a release pipeline
Create a build pipeline
Create an Azure DevOps project
Install the Machine Learning extension for Azure Pipelines

Answer Area
31
DRAG DROP

You train and register a model by using the Azure Machine Learning SDK on a local workstation. Python 3.6 and Visual Studio Code are installed on the workstation.

When you try to deploy the model into production as an Azure Kubernetes Service (AKS)-based web service, you experience an error in the scoring script that causes deployment to fail.

You need to debug the service on the local workstation before deploying the service to production.

Which four actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Select and Place:

Actions
Create an AksWebservice deployment configuration for the service and deploy the model to it
Install Docker on the workstation
Create a LocalWebservice deployment configuration for the service and deploy the model to it
Debug and modify the scoring script as necessary. Use the reload() method of the service after each modification
Create an AciWebservice deployment configuration for the service and deploy the model to it

Answer Area
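For context, a minimal sketch of local debugging with a LocalWebservice (assumes Docker is installed and that ws, model, and inference_config already exist; the service name and port are hypothetical):

from azureml.core.model import Model
from azureml.core.webservice import LocalWebservice

# Deploy the model to a local Docker container for debugging
deployment_config = LocalWebservice.deploy_configuration(port=8890)
service = Model.deploy(ws, 'local-test', [model], inference_config, deployment_config)
service.wait_for_deployment(show_output=True)

# After editing the scoring script, pick up the changes without redeploying
service.reload()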
32
You train and register a machine learning model. You create a batch inference pipeline that uses the model to generate predictions from multiple data files.

You must publish the batch inference pipeline as a service that can be scheduled to run every night.

You need to select an appropriate compute target for the inference service.

Which compute target should you use?

A. Azure Machine Learning compute instance
B. Azure Machine Learning compute cluster
C. Azure Kubernetes Service (AKS)-based inference cluster
D. Azure Container Instance (ACI) compute target
33
You are planning to register a trained model in an Azure Machine Learning workspace.

You must store additional metadata about the model in a key-value format. You must be able to add new metadata and modify or delete metadata after creation.

You need to register the model.

Which parameter should you use?

A. description
B. model_framework
C. tags
D. properties
34
You create an Azure Machine Learning workspace named ML-workspace. You also create an Azure Databricks workspace named DB-workspace. DB-workspace contains a cluster named DB-cluster.

You must use DB-cluster to run experiments from notebooks that you import into DB-workspace.

You need to use ML-workspace to track MLflow metrics and artifacts generated by experiments running on DB-cluster. The solution must minimize the need for custom code.

What should you do?

A. From DB-cluster, configure the Advanced Logging option.
B. From DB-workspace, configure the Link Azure ML workspace option.
C. From ML-workspace, create an attached compute.
D. From ML-workspace, create a compute cluster.
35
You use the Azure Machine Learning designer to create and run a training pipeline. You then create a real-time inference pipeline.

You must deploy the real-time inference pipeline as a web service.

What must you do before you deploy the real-time inference pipeline?

A. Run the real-time inference pipeline.
B. Create a batch inference pipeline.
C. Clone the training pipeline.
D. Create an Azure Machine Learning compute cluster.
36
You use the designer to create a training pipeline for a classification model. The pipeline uses a dataset that includes the features and labels required for model training.

You create a real-time inference pipeline from the training pipeline. You observe that the schema for the generated web service input is based on the dataset and includes the label column that the model predicts. Client applications that use the service must not be required to submit this value.

You need to modify the inference pipeline to meet the requirement.

What should you do?

A. Add a Select Columns in Dataset module to the inference pipeline after the dataset and use it to select all columns other than the label.
B. Delete the dataset from the training pipeline and recreate the real-time inference pipeline.
C. Delete the Web Service Input module from the inference pipeline.
D. Replace the dataset in the inference pipeline with an Enter Data Manually module that includes data for the feature columns but not the label column.
37
DRAG DROP

You use Azure Machine Learning to deploy a model as a real-time web service.

You need to create an entry script for the service that ensures that the model is loaded when the service starts and is used to score new data as it is received.

Which functions should you include in the script? To answer, drag the appropriate functions to the correct actions. Each function may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content. NOTE: Each correct selection is worth one point.

Select and Place:

Functions
main()
score()
run()
init()
predict()

Actions
Load the model when the service starts. Function: ____
Use the model to score new data. Function: ____
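For context, a minimal entry script sketch showing the two lifecycle functions a real-time service expects (the registered model name 'my-model' is hypothetical):

import json
import joblib
import numpy as np
from azureml.core.model import Model

def init():
    # Called once when the service starts: load the registered model
    global model
    model_path = Model.get_model_path('my-model')  # hypothetical model name
    model = joblib.load(model_path)

def run(raw_data):
    # Called for each scoring request: return predictions for the input data
    data = np.array(json.loads(raw_data)['data'])
    predictions = model.predict(data)
    return json.dumps({'result': predictions.tolist()})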
38
You are a data scientist working for a hotel booking website company. You use the Azure Machine Learning service to train a model that identifies fraudulent transactions.

You must deploy the model as an Azure Machine Learning real-time web service using the Model.deploy method in the Azure Machine Learning SDK. The deployed web service must return real-time predictions of fraud based on transaction data input.

You need to create the script that is specified as the entry_script parameter for the InferenceConfig class used to deploy the model.

What should the entry script do?

A. Register the model with appropriate tags and properties.
B. Create a Conda environment for the web service compute and install the necessary Python packages.
C. Load the model and use it to predict labels from input data.
D. Start a node on the inference cluster where the web service is deployed.
E. Specify the number of cores and the amount of memory required for the inference compute.
39
You use the Azure Machine Learning SDK to run a training experiment that trains a classification model and calculates its accuracy metric. The model will be retrained each month as new data is available.

You must register the model for use in a batch inference pipeline.

You need to register the model and ensure that the models created by subsequent retraining experiments are registered only if their accuracy is higher than the currently registered model.

What are two possible ways to achieve this goal? Each correct answer presents a complete solution. NOTE: Each correct selection is worth one point.

A. Specify a different name for the model each time you register it.
B. Register the model with the same name each time regardless of accuracy, and always use the latest version of the model in the batch inferencing pipeline.
C. Specify the model framework version when registering the model, and only register subsequent models if this value is higher.
D. Specify a property named accuracy with the accuracy metric as a value when registering the model, and only register subsequent models if their accuracy is higher than the accuracy property value of the currently registered model.
E. Specify a tag named accuracy with the accuracy metric as a value when registering the model, and only register subsequent models if their accuracy is higher than the accuracy tag value of the currently registered model.
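For context, a minimal sketch of gating registration on an accuracy property (the model name and metric key are hypothetical):

from azureml.core import Model

new_accuracy = run.get_metrics()['accuracy']
try:
    current = Model(ws, name='classification-model')  # latest registered version
    best_so_far = float(current.properties.get('accuracy', 0))
except Exception:
    best_so_far = 0.0  # no model registered yet

if new_accuracy > best_so_far:
    run.register_model(model_name='classification-model',
                       model_path='outputs/model.pkl',
                       properties={'accuracy': str(new_accuracy)})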
40
You use Azure Machine Learning designer to create a training pipeline for a regression model.

You need to prepare the pipeline for deployment as an endpoint that generates predictions asynchronously for a dataset of input data values.

What should you do?

A. Clone the training pipeline.
B. Create a batch inference pipeline from the training pipeline.
C. Create a real-time inference pipeline from the training pipeline.
D. Replace the dataset in the training pipeline with an Enter Data Manually module.
41
You train a model and register it in your Azure Machine Learning workspace. You are ready to deploy the model as a real-time web service.

You deploy the model to an Azure Kubernetes Service (AKS) inference cluster, but the deployment fails because an error occurs when the service runs the entry script that is associated with the model deployment.

You need to debug the error by iteratively modifying the code and reloading the service, without requiring a re-deployment of the service for each code update.

What should you do?

A. Modify the AKS service deployment configuration to enable application insights and re-deploy to AKS.
B. Create an Azure Container Instances (ACI) web service deployment configuration and deploy the model on ACI.
C. Add a breakpoint to the first line of the entry script and redeploy the service to AKS.
D. Create a local web service deployment configuration and deploy the model to a local Docker container.
E. Register a new version of the model and update the entry script to load the new version of the model from its registered path.
42
You use the Azure Machine Learning Python SDK to define a pipeline that consists of multiple steps.

When you run the pipeline, you observe that some steps do not run. The cached output from a previous run is used instead.

You need to ensure that every step in the pipeline is run, even if the parameters and contents of the source directory have not changed since the previous run.

What are two possible ways to achieve this goal? Each correct answer presents a complete solution. NOTE: Each correct selection is worth one point.

A. Use a PipelineData object that references a datastore other than the default datastore.
B. Set the regenerate_outputs property of the pipeline to True.
C. Set the allow_reuse property of each step in the pipeline to False.
D. Restart the compute cluster where the pipeline experiment is configured to run.
E. Set the outputs property of each step in the pipeline to True.
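For context, a minimal sketch of the two caching controls (the step names, script, and compute target are hypothetical):

from azureml.pipeline.steps import PythonScriptStep

# Per-step: disable output caching so the step always runs
step = PythonScriptStep(name='prep', script_name='prep.py',
                        source_directory='scripts',
                        compute_target=cluster,  # hypothetical compute target
                        allow_reuse=False)

# Per-run: force regeneration of all step outputs for this submission
run = experiment.submit(pipeline, regenerate_outputs=True)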
43
You use the following Python code in a notebook to deploy a model as a web service:

from azureml.core.webservice import AciWebservice
from azureml.core.model import InferenceConfig

inference_config = InferenceConfig(runtime='python',
                                   source_directory='model_files',
                                   entry_script='score.py',
                                   conda_file='env.yml')
deployment_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)
service = Model.deploy(ws, 'my-service', [model], inference_config, deployment_config)
service.wait_for_deployment(True)

The deployment fails. You need to use the Python SDK in the notebook to determine the events that occurred during service deployment and initialization.

Which code segment should you use?

A. service.state
B. service.get_logs()
C. service.serialize()
D. service.environment
44
Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You train and register a machine learning model. You plan to deploy the model as a real-time web service. Applications must use key-based authentication to use the model.

You need to deploy the web service.

Solution: Create an AciWebservice instance. Set the value of the auth_enabled property to False. Set the value of the token_auth_enabled property to True. Deploy the model to the service.

Does the solution meet the goal?

A. Yes
B. No
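For context, a minimal sketch of an ACI deployment configuration with key-based authentication (key auth on ACI is controlled by auth_enabled; token-based auth is an AKS feature):

from azureml.core.webservice import AciWebservice

deployment_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1,
                                                       auth_enabled=True)  # key-based auth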
44
An organization creates and deploys a multi-class image classification deep learning model that uses a set of labeled photographs.

The software engineering team reports there is a heavy inferencing load for the prediction web services during the summer. The production web service for the model fails to meet demand despite having a fully-utilized compute cluster where the web service is deployed.

You need to improve performance of the image classification web service with minimal downtime and minimal administrative effort.

What should you advise the IT Operations team to do?

A. Create a new compute cluster by using larger VM sizes for the nodes, redeploy the web service to that cluster, and update the DNS registration for the service endpoint to point to the new cluster.
B. Increase the node count of the compute cluster where the web service is deployed.
C. Increase the minimum node count of the compute cluster where the web service is deployed.
D. Increase the VM size of nodes in the compute cluster where the web service is deployed.
45
You deploy a real-time inference service for a trained model.

The deployed model supports a business-critical application, and it is important to be able to monitor the data submitted to the web service and the predictions the data generates.

You need to implement a monitoring solution for the deployed model using minimal administrative effort.

What should you do?

A. View the explanations for the registered model in Azure ML studio.
B. Enable Azure Application Insights for the service endpoint and view logged data in the Azure portal.
C. View the log files generated by the experiment used to train the model.
D. Create an ML Flow tracking URI that references the endpoint, and view the data logged by ML Flow.
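For context, Application Insights can be switched on for an already-deployed service without redeploying it; a minimal sketch (assumes a Webservice object named service):

# Turn on Application Insights telemetry for the deployed endpoint
service.update(enable_app_insights=True)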
46
You train a machine learning model.

You must deploy the model as a real-time inference service for testing. The service requires low CPU utilization and less than 48 MB of RAM. The compute target for the deployed service must initialize automatically while minimizing cost and administrative overhead.

Which compute target should you use?

A. Azure Container Instance (ACI)
B. attached Azure Databricks cluster
C. Azure Kubernetes Service (AKS) inference cluster
D. Azure Machine Learning compute cluster
47
You create a multi-class image classification deep learning model. You train the model by using PyTorch version 1.2.

You need to ensure that the correct version of PyTorch can be identified for the inferencing environment when the model is deployed.

What should you do?

A. Save the model locally as a .pt file, and deploy the model as a local web service.
B. Deploy the model on a computer that is configured to use the default Azure Machine Learning conda environment.
C. Register the model with a .pt file extension and the default version property.
D. Register the model, specifying the model_framework and model_framework_version properties.
48
You create a batch inference pipeline by using the Azure ML SDK. You configure the pipeline parameters by executing the following code:

from azureml.contrib.pipeline.steps import ParallelRunConfig

parallel_run_config = ParallelRunConfig(
    source_directory=scripts_folder,
    entry_script='batch_pipeline.py',
    mini_batch_size='5',
    error_threshold=10,
    output_action='append_row',
    environment=batch_env,
    compute_target=compute_target,
    logging_level='DEBUG',
    node_count=4)

You need to obtain the output from the pipeline execution. Where will you find the output?

A. the digit_identification.py script
B. the debug log
C. the Activity Log in the Azure portal for the Machine Learning workspace
D. the Inference Clusters tab in Machine Learning studio
E. a file named parallel_run_step.txt located in the output folder
A

E. a file named parallel_run_step.txt located in the output folder
49
You create a deep learning model for image recognition on Azure Machine Learning service using GPU-based training.

You must deploy the model to a context that allows for real-time GPU-based inferencing.

You need to configure compute resources for model inferencing.

Which compute type should you use?

A. Azure Container Instance
B. Azure Kubernetes Service
C. Field Programmable Gate Array
D. Machine Learning Compute
50
You use the following code to define the steps for a pipeline:

from azureml.core import Workspace, Experiment, Run
from azureml.pipeline.core import Pipeline
from azureml.pipeline.steps import PythonScriptStep

ws = Workspace.from_config()
step1 = PythonScriptStep(name="step1", ...)
step2 = PythonScriptStep(name="step2", ...)
pipeline_steps = [step1, step2]

You need to add code to run the steps. Which two code segments can you use to achieve this goal? Each correct answer presents a complete solution. NOTE: Each correct selection is worth one point.

A. experiment = Experiment(workspace=ws, name='pipeline-experiment')
   run = experiment.submit(config=pipeline_steps)
B. run = Run(pipeline_steps)
C. pipeline = Pipeline(workspace=ws, steps=pipeline_steps)
   experiment = Experiment(workspace=ws, name='pipeline-experiment')
   run = experiment.submit(pipeline)
D. pipeline = Pipeline(workspace=ws, steps=pipeline_steps)
   run = pipeline.submit(experiment_name='pipeline-experiment')
A

C. pipeline = Pipeline(workspace=ws, steps=pipeline_steps)
   experiment = Experiment(workspace=ws, name='pipeline-experiment')
   run = experiment.submit(pipeline)
D. pipeline = Pipeline(workspace=ws, steps=pipeline_steps)
   run = pipeline.submit(experiment_name='pipeline-experiment')
50
Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You plan to use a Python script to run an Azure Machine Learning experiment. The script creates a reference to the experiment run context, loads data from a file, identifies the set of unique values for the label column, and completes the experiment run:

from azureml.core import Run
import pandas as pd

run = Run.get_context()
data = pd.read_csv('data.csv')
label_vals = data['label'].unique()
# Add code to record metrics here
run.complete()

The experiment must record the unique labels in the data as metrics for the run that can be reviewed later.

You must add code to the script to record the unique label values as run metrics at the point indicated by the comment.

Solution: Replace the comment with the following code:

run.log_list('Label Values', label_vals)

Does the solution meet the goal?

A. Yes
B. No
51
HOTSPOT

You create a Python script named train.py and save it in a folder named scripts. The script uses the scikit-learn framework to train a machine learning model.

You must run the script as an Azure Machine Learning experiment on your local workstation.

You need to write Python code to initiate an experiment that runs the train.py script. How should you complete the code segment? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.

Hot Area:

from azureml.core import Experiment, ScriptRunConfig, Environment
from azureml.core.conda_dependencies import CondaDependencies
from azureml.core import Workspace

ws = Workspace.from_config()
py_sk = Environment('sklearn-training')
pkgs = CondaDependencies.create(pip_packages=['scikit-learn', 'azureml-defaults'])
py_sk.python.conda_dependencies = pkgs

script_config = ScriptRunConfig(
    ____________________ = 'scripts',
    ____________________ = 'train.py',
    ____________________ = py_sk
)

experiment = Experiment(workspace=ws, name='training-experiment')
run = experiment.submit(config=script_config)

Choose the correct keyword for each blank from the options below:
script
source_directory
environment
compute_target
resume_from
arguments
51
HOTSPOT

You use an Azure Machine Learning workspace. You create the following Python code:

from azureml.core import ScriptRunConfig

src = ScriptRunConfig(source_directory=project_folder,
                      script='train.py',
                      environment=myenv)

For each of the following statements, select Yes if the statement is true. Otherwise, select No. NOTE: Each correct selection is worth one point.

Hot Area:
Answer Area

Statements
The default environment will be created.
The training script will run on local compute.
A script run configuration runs a training script named train.py located in a directory defined by the project_folder variable.
52
You have a Jupyter Notebook that contains Python code that is used to train a model.

You must create a Python script for the production deployment. The solution must minimize code maintenance.

Which two actions should you perform? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point.

A. Refactor the Jupyter Notebook code into functions
B. Save each function to a separate Python file
C. Define a main() function in the Python script
D. Remove all comments and functions from the Python script
53
DRAG DROP

You previously deployed a model that was trained using a tabular dataset named training-dataset, which is based on a folder of CSV files. Over time, you have collected the features and predicted labels generated by the model in a folder containing a CSV file for each month.

You have created two tabular datasets based on the folder containing the inference data: one named predictions-dataset with a schema that matches the training data exactly, including the predicted label; and another named features-dataset with a schema containing all of the feature columns and a timestamp column based on the filename, which includes the day, month, and year.

You need to create a data drift monitor to identify any changing trends in the feature data since the model was trained. To accomplish this, you must define the required datasets for the data drift monitor.

Which datasets should you use to configure the data drift monitor? To answer, drag the appropriate datasets to the correct data drift monitor options. Each source may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content. NOTE: Each correct selection is worth one point.

Select and Place:

Datasets:
training-dataset
predictions-dataset
features-dataset

Answer Area
Baseline dataset: ____
Target dataset: ____
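For context, a minimal sketch of creating a dataset-based drift monitor (assumes the azureml-datadrift package and that the two datasets from the question are already retrieved as Python objects; the monitor name is hypothetical):

from azureml.datadrift import DataDriftDetector

monitor = DataDriftDetector.create_from_datasets(
    ws, 'feature-drift-monitor',
    training_dataset,   # baseline: the data the model was trained on
    features_dataset)   # target: timestamped inference features to compare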
54
You plan to run a Python script as an Azure Machine Learning experiment. The script contains the following code:

import os, argparse, glob
from azureml.core import Run

parser = argparse.ArgumentParser()
parser.add_argument('--input-data', type=str, dest='data_folder')
args = parser.parse_args()
data_path = args.data_folder
file_paths = glob.glob(data_path + "/*.jpg")

You must specify a file dataset as an input to the script. The dataset consists of multiple large image files and must be streamed directly from its source.

You need to write code to define a ScriptRunConfig object for the experiment and pass the ds dataset as an argument.

Which code segment should you use?

A. arguments = ['--input-data', ds.to_pandas_dataframe()]
B. arguments = ['--input-data', ds.as_mount()]
C. arguments = ['--data-data', ds]
D. arguments = ['--input-data', ds.as_download()]
55
You create and register a model in an Azure Machine Learning workspace.

You must use the Azure Machine Learning SDK to implement a batch inference pipeline that uses a ParallelRunStep to score input data using the model. You must specify a value for the ParallelRunConfig compute_target setting of the pipeline step.

You need to create the compute target.

Which class should you use?

A. BatchCompute
B. AdlaCompute
C. AmlCompute
D. AksCompute
56
HOTSPOT

You create an Azure Databricks workspace and a linked Azure Machine Learning workspace. You have the following Python code segment in the Azure Machine Learning workspace:

import mlflow
import mlflow.azureml
import azureml.mlflow
import azureml.core
from azureml.core import Workspace

subscription_id = 'subscription_id'
resource_group = 'resource_group_name'
workspace_name = 'workspace_name'

ws = Workspace.get(name=workspace_name,
                   subscription_id=subscription_id,
                   resource_group=resource_group)

experimentName = "/Users/{user_name}/{experiment_folder}/{experiment_name}"
mlflow.set_experiment(experimentName)

uri = ws.get_mlflow_tracking_uri()
mlflow.set_tracking_uri(uri)

Instructions: For each of the following statements, select Yes if the statement is true. Otherwise, select No. NOTE: Each correct selection is worth one point.

Hot Area:

A resource group and Azure Machine Learning workspace will be created.
An Azure Databricks experiment will be tracked only in the Azure Machine Learning workspace.
The epoch loss metric is set to be tracked.
57
You use the following code to define the steps for a pipeline:

from azureml.core import Workspace, Experiment, Run
from azureml.pipeline.core import Pipeline
from azureml.pipeline.steps import PythonScriptStep

ws = Workspace.from_config()
. . .
step1 = PythonScriptStep(name="step1", ...)
step2 = PythonScriptStep(name="step2", ...)
pipeline_steps = [step1, step2]

You need to add code to run the steps. Which two code segments can you use to achieve this goal? Each correct answer presents a complete solution. NOTE: Each correct selection is worth one point.

A. experiment = Experiment(workspace=ws, name='pipeline-experiment')
   run = experiment.submit(config=pipeline_steps)
B. run = Run(pipeline_steps)
C. pipeline = Pipeline(workspace=ws, steps=pipeline_steps)
   experiment = Experiment(workspace=ws, name='pipeline-experiment')
   run = experiment.submit(pipeline)
D. pipeline = Pipeline(workspace=ws, steps=pipeline_steps)
   run = pipeline.submit(experiment_name='pipeline-experiment')
58
You have the following code. The code prepares an experiment to run a script:

from azureml.core import Workspace, Experiment, Run, ScriptRunConfig

ws = Workspace.from_config()
script_config = ScriptRunConfig(source_directory='experiment_files', script='experiment.py')
script_experiment = Experiment(workspace=ws, name='script-experiment')

The experiment must be run on the local computer using the default environment.

You need to add code to start the experiment and run the script.

Which code segment should you use?

A. run = script_experiment.start_logging()
B. run = Run(experiment=script_experiment)
C. ws.get_run(run_id=experiment.id)
D. run = script_experiment.submit(config=script_config)
59
You run a script as an experiment in Azure Machine Learning.

You have a Run object named run that references the experiment run. You must review the log files that were generated during the experiment run.

You need to download the log files to a local folder for review.

Which two code segments can you run to achieve this goal? Each correct answer presents a complete solution. NOTE: Each correct selection is worth one point.

A. run.get_details()
B. run.get_file_names()
C. run.get_metrics()
D. run.download_files(output_directory='./runfiles')
E. run.get_all_logs(destination='./runlogs')
60
You create a Python script that runs a training experiment in Azure Machine Learning. The script uses the Azure Machine Learning SDK for Python.

You must add a statement that retrieves the names of the logs and outputs generated by the script.

You need to reference a Python class object from the SDK for the statement.

Which class object should you use?

A. Run
B. ScriptRunConfig
C. Workspace
D. Experiment
61
You plan to run a Python script as an Azure Machine Learning experiment. The script must read files from a hierarchy of folders. The files will be passed to the script as a dataset argument.

You must specify an appropriate mode for the dataset argument.

Which two modes can you use? Each correct answer presents a complete solution. NOTE: Each correct selection is worth one point.

A. to_pandas_dataframe()
B. as_download()
C. as_upload()
D. as_mount()
62
You use the Azure Machine Learning Python SDK to define a pipeline to train a model. The data used to train the model is read from a folder in a datastore.

You need to ensure the pipeline runs automatically whenever the data in the folder changes.

What should you do?

A. Set the regenerate_outputs property of the pipeline to True
B. Create a ScheduleRecurrence object with a Frequency of auto. Use the object to create a Schedule for the pipeline
C. Create a PipelineParameter with a default value that references the location where the training data is stored
D. Create a Schedule for the pipeline. Specify the datastore in the datastore property, and the folder containing the training data in the path_on_datastore property
63
You create a binary classification model. You need to evaluate the model performance.

Which two metrics can you use? Each correct answer presents a complete solution. NOTE: Each correct selection is worth one point.

A. relative absolute error
B. precision
C. accuracy
D. mean absolute error
E. coefficient of determination
64
HOTSPOT

You are using C-Support Vector classification to do a multi-class classification with an unbalanced training dataset. The C-Support Vector classification uses the Python code shown below:

from sklearn.svm import SVC
import numpy as np

svc = SVC(kernel='linear', class_weight='balanced', C=1.0, random_state=0)
model1 = svc.fit(X_train, y)

You need to evaluate the C-Support Vector classification code.

Which evaluation statement should you use? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.

Hot Area:
Answer Area

Code segment: class_weight='balanced'
Evaluation statement:
  Automatically select the performance metrics for the classification.
  Automatically adjust weights directly proportional to class frequencies in the input data.
  Automatically adjust weights inversely proportional to class frequencies in the input data.

Code segment: C parameter
Evaluation statement:
  Penalty parameter
  Degree of polynomial kernel function
  Size of the kernel cache
65
You are performing clustering by using the K-means algorithm.

You need to define the possible termination conditions.

Which three conditions can you use? Each correct answer presents a complete solution. NOTE: Each correct selection is worth one point.

A. Centroids do not change between iterations.
B. The residual sum of squares (RSS) rises above a threshold.
C. The residual sum of squares (RSS) falls below a threshold.
D. A fixed number of iterations is executed.
E. The sum of distances between centroids reaches a maximum.
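For context, scikit-learn's K-means exposes two of these termination conditions directly: a fixed iteration budget (max_iter) and a centroid-change tolerance (tol); a minimal sketch with fabricated data:

import numpy as np
from sklearn.cluster import KMeans

X = np.random.rand(200, 2)
# Stops after max_iter iterations, or earlier once centroids move less than tol
kmeans = KMeans(n_clusters=3, max_iter=100, tol=1e-4, n_init=10).fit(X)
print(kmeans.n_iter_)  # iterations actually executed before convergence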