test1 Flashcards
https://freedumps.certqueen.com/?s=DP-100 (69 cards)
You are developing a hands-on workshop to introduce Docker for Windows to attendees. You need to ensure that workshop attendees can install Docker on their devices. Which two prerequisite components should attendees install on the devices? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point.
Microsoft Hardware-Assisted Virtualization Detection Tool
Kitematic
BIOS-enabled virtualization
VirtualBox
Windows 10 64-bit Professional
You are implementing a machine learning model to predict stock prices. The model uses a PostgreSQL database and requires GPU processing. You need to create a virtual machine that is pre-configured with the required tools. What should you do?
Create a Data Science Virtual Machine (DSVM) Windows edition.
Create a Geo AI Data Science Virtual Machine (Geo-DSVM) Windows edition.
Create a Deep Learning Virtual Machine (DLVM) Linux edition.
Create a Deep Learning Virtual Machine (DLVM) Windows edition.
Create a Data Science Virtual Machine (DSVM) Linux edition.
You must store data in Azure Blob Storage to support Azure Machine Learning. You need to transfer the data into Azure Blob Storage. What are three possible ways to achieve the goal? Each correct answer presents a complete solution. NOTE: Each correct selection is worth one point.
Bulk Insert SQL Query
AzCopy
Python script
Azure Storage Explorer
Bulk Copy Program (BCP)
You are moving a large dataset from Azure Machine Learning Studio to a Weka environment. You need to format the data for the Weka environment. Which module should you use?
Convert to CSV
Convert to Dataset
Convert to ARFF
Convert to SVMLight
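For reference, ARFF (Attribute-Relation File Format) is the text format that Weka reads. The following is a minimal sketch of what such a file looks like when written by hand in Python. It is not the Studio module itself, and the function name, relation name, and column names are hypothetical.

```python
def to_arff(relation, attributes, rows):
    """Render rows as ARFF text.

    attributes: list of (name, arff_type) pairs, e.g. ("height_cm", "NUMERIC")
    rows: list of tuples holding one value per attribute
    """
    lines = [f"@RELATION {relation}", ""]
    # Header section: one @ATTRIBUTE declaration per column.
    for name, arff_type in attributes:
        lines.append(f"@ATTRIBUTE {name} {arff_type}")
    lines.append("")
    # Data section: comma-separated values, one row per line.
    lines.append("@DATA")
    for row in rows:
        lines.append(",".join(str(value) for value in row))
    return "\n".join(lines) + "\n"


# Hypothetical two-column dataset used only for illustration.
example_arff = to_arff(
    "students",
    [("height_cm", "NUMERIC"), ("plays_soccer", "{0,1}")],
    [(172.5, 1), (160.0, 0)],
)
```

The header declares each column and its type, and the rows follow the @DATA marker as plain comma-separated values.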
You are solving a classification task. You must evaluate your model on a limited data sample by using k-fold cross validation. You start by configuring a k parameter as the number of splits. You need to configure the k parameter for the cross-validation. Which value should you use?
k=0.5
k=0
k=5
k=1
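As a rough illustration of what the k parameter controls, the sketch below splits sample indices into k folds in plain Python. The function name is hypothetical, and a real project would more likely rely on a library splitter. It assumes k must be a whole number of at least 2, because each fold has to serve once as the test set while the remaining folds are used for training.

```python
def k_fold_indices(n_samples, k):
    """Return a list of (train_indices, test_indices) pairs, one per fold."""
    if not isinstance(k, int) or k < 2:
        raise ValueError("k must be an integer of at least 2")
    if k > n_samples:
        raise ValueError("k cannot exceed the number of samples")
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, test))
        start += size
    return folds


# Ten samples split into five folds of two test samples each.
folds = k_fold_indices(n_samples=10, k=5)
```

Every sample appears in exactly one test fold, so the model is evaluated on all of the limited data without ever testing on rows it was trained on in that round.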
You are creating a machine learning model. You have a dataset that contains null rows. You need to use the Clean Missing Data module in Azure Machine Learning Studio to identify and resolve the null and missing data in the dataset. Which parameter should you use?
Replace with mean
Remove entire column
Remove entire row
Hot Deck
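To make the difference between two of these cleaning modes concrete, here is a plain-Python analogue. It is a sketch only, not the Studio module, and it assumes missing values are represented as None in a list of rows. The function names are hypothetical.

```python
def remove_rows_with_missing(rows):
    """Drop every row that contains at least one missing (None) value."""
    return [row for row in rows if all(value is not None for value in row)]


def replace_missing_with_mean(rows):
    """Replace each missing value with the mean of the observed values in its column."""
    n_cols = len(rows[0])
    means = []
    for col in range(n_cols):
        present = [row[col] for row in rows if row[col] is not None]
        # A column with no observed values has no mean, so it stays None.
        means.append(sum(present) / len(present) if present else None)
    return [
        [means[col] if value is None else value for col, value in enumerate(row)]
        for row in rows
    ]


# The second row is entirely null; the third row is only partly missing.
data = [[1.0, 2.0], [None, None], [3.0, None]]
rows_removed = remove_rows_with_missing(data)
mean_filled = replace_missing_with_mean(data)
```

Note that mean replacement turns a fully null row into a row of column averages, which adds a synthetic observation rather than removing an empty one.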
You are performing feature engineering on a dataset. You must add a feature named CityName and populate the column value with the text London. You need to add the new feature to the dataset. Which Azure Machine Learning Studio module should you use?
Edit Metadata
Preprocess Text
Execute Python Script
Latent Dirichlet Allocation
You are creating a binary classification by using a two-class logistic regression model. You need to evaluate the model results for imbalance. Which evaluation metric should you use?
A. Relative Absolute Error
B. AUC Curve
C. Mean Absolute Error
D. Relative Squared Error
E. Accuracy
F. Root Mean Square Error
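The sketch below illustrates why class imbalance matters when choosing a metric. It computes accuracy and the area under the ROC curve for a hypothetical model that gives every row the same low score. The function names and the toy data are assumptions made for illustration, and AUC is computed here by directly comparing every positive score with every negative score.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve: the fraction of (positive, negative) pairs
    in which the positive example has the higher score; ties count as half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def accuracy_at_threshold(labels, scores, threshold=0.5):
    """Fraction of rows classified correctly when scores >= threshold mean class 1."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


# Nine negatives and one positive; the "model" scores every row 0.1.
labels = [0] * 9 + [1]
constant_scores = [0.1] * 10
acc = accuracy_at_threshold(labels, constant_scores)
auc = roc_auc(labels, constant_scores)
```

Here accuracy comes out at 0.9 even though the model never identifies the positive class, while AUC is 0.5, the value expected from a model with no ranking ability.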
You are building a machine learning model for translating English language textual content into French language textual content. You need to build and train the machine learning model to learn the sequence of the textual content. Which type of neural network should you use?
Multilayer Perceptrons (MLPs)
Convolutional Neural Networks (CNNs)
Recurrent Neural Networks (RNNs)
Generative Adversarial Networks (GANs)
You create a binary classification model. You need to evaluate the model performance. Which two metrics can you use? Each correct answer presents a complete solution. NOTE: Each correct selection is worth one point.
relative absolute error
precision
accuracy
mean absolute error
coefficient of determination
HOTSPOT -
You have an Azure blob container that contains a set of TSV files. The Azure blob container is registered as a datastore for an Azure Machine Learning service workspace. Each TSV file uses the same data schema.
You plan to aggregate data for all of the TSV files together and then register the aggregated data as a dataset in an Azure Machine Learning workspace by using the Azure Machine Learning SDK for Python.
You run the following code.
from azureml.core.workspace import Workspace
from azureml.core.datastore import Datastore
from azureml.core.dataset import Dataset
import pandas as pd
datastore_paths = (datastore, './data/*.tsv')
myDataset_1 = Dataset.File.from_files(path=datastore_paths)
myDataset_2 = Dataset.Tabular.from_delimited_files(path=datastore_paths, separator='\t')
For each of the following statements, select Yes if the statement is true. Otherwise, select No.
NOTE: Each correct selection is worth one point.
Hot Area:
Answer Area
The myDataset_1 dataset can be converted into a pandas dataframe by using the following method: myDataset_1.to_pandas_dataframe()
The myDataset_1.to_path() method returns an array of file paths for all of the TSV files in the dataset.
The myDataset_2 dataset can be converted into a pandas dataframe by using the following method: myDataset_2.to_pandas_dataframe()
You create a multi-class image classification deep learning model that uses a set of labeled images. You create a script file named train.py that uses the PyTorch 1.3 framework to train the model.
You must run the script by using an estimator. The code must not require any additional Python libraries to be installed in the environment for the estimator. The time required for model training must be minimized.
You need to define the estimator that will be used to run the script.
Which estimator type should you use?
TensorFlow
PyTorch
SKLearn
Estimator
Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
You create a model to forecast weather conditions based on historical data.
You need to create a pipeline that runs a processing script to load data from a datastore and pass the processed data to a machine learning model training script.
Solution: Run the following code:
Does the solution meet the goal?
Yes
No
You create a multi-class image classification deep learning model that uses the PyTorch deep learning framework.
You must configure Azure Machine Learning Hyperdrive to optimize the hyperparameters for the classification model.
You need to define a primary metric to determine the hyperparameter values that result in the model with the best accuracy score.
Which three actions must you perform? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point.
Set the primary_metric_goal of the estimator used to run the bird_classifier_train.py script to maximize.
Add code to the bird_classifier_train.py script to calculate the validation loss of the model and log it as a float value with the key loss.
Set the primary_metric_goal of the estimator used to run the bird_classifier_train.py script to minimize.
Set the primary_metric_name of the estimator used to run the bird_classifier_train.py script to accuracy.
Set the primary_metric_name of the estimator used to run the bird_classifier_train.py script to loss.
Add code to the bird_classifier_train.py script to calculate the validation accuracy of the model and log it as a float value with the key accuracy.
You are working with a time series dataset in Azure Machine Learning Studio.
You need to split your dataset into training and testing subsets by using the Split Data module.
Which splitting mode should you use?
Regular Expression Split
Split Rows with the Randomized split parameter set to true
Relative Expression Split
Recommender Split
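With time series data, a random split can place later observations in the training set and earlier ones in the test set, which leaks future information into training. The sketch below shows the alternative of splitting on a date condition. It is plain Python rather than the Split Data module, and the function name, cutoff, and sample rows are hypothetical.

```python
from datetime import date


def split_by_date(rows, cutoff):
    """Split (timestamp, value) rows chronologically.

    Rows dated on or before the cutoff go to training; later rows go to testing.
    """
    train = [row for row in rows if row[0] <= cutoff]
    test = [row for row in rows if row[0] > cutoff]
    return train, test


# Hypothetical monthly observations used only for illustration.
series = [
    (date(2023, 1, 1), 10.0),
    (date(2023, 2, 1), 12.0),
    (date(2023, 3, 1), 11.0),
    (date(2023, 4, 1), 13.0),
]
train_rows, test_rows = split_by_date(series, cutoff=date(2023, 2, 28))
```

Every training row precedes every test row, so the evaluation mimics forecasting values the model has not yet seen.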
DRAG DROP -
An organization uses Azure Machine Learning service and wants to expand their use of machine learning.
You have the following compute environments. The organization does not want to create another compute environment.
Environment name Compute type
nb_server Compute Instance
aks_cluster Azure Kubernetes Service
mlc_cluster Machine Learning Compute
You need to determine which compute environment to use for the following scenarios.
Which compute types should you use? To answer, drag the appropriate compute environments to the correct scenarios. Each compute environment may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content.
NOTE: Each correct selection is worth one point.
Select and Place:
Environments
nb_server
aks_cluster
mlc_cluster
Answer Area - Scenario
Run an Azure Machine Learning Designer training pipeline.
Deploy a web service from the Azure Machine Learning designer.
Environment
[Drop-down for 1st scenario]
[Drop-down for 2nd scenario]
You register a model that you plan to use in a batch inference pipeline.
The batch inference pipeline must use a ParallelRunStep step to process files in a file dataset. The script that the ParallelRunStep step runs must process six input files each time the inferencing function is called.
You need to configure the pipeline.
Which configuration setting should you specify in the ParallelRunConfig object for the ParallelRunStep step?
process_count_per_node="6"
node_count="6"
mini_batch_size="6"
error_threshold="6"
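The idea of handing a fixed number of files to each call of the inferencing function can be sketched in plain Python. This is an analogue of the batching behavior only, not the Azure Machine Learning SDK. The helper make_mini_batches, the file names, and the body of run are hypothetical.

```python
def make_mini_batches(file_paths, mini_batch_size):
    """Group file paths into consecutive batches of at most mini_batch_size files."""
    if mini_batch_size < 1:
        raise ValueError("mini_batch_size must be at least 1")
    return [
        file_paths[i:i + mini_batch_size]
        for i in range(0, len(file_paths), mini_batch_size)
    ]


def run(mini_batch):
    """Hypothetical inferencing function: receives one batch of file paths per call."""
    return [f"scored:{path}" for path in mini_batch]


# Fourteen hypothetical input files processed six at a time.
files = [f"file_{i}.csv" for i in range(14)]
batches = make_mini_batches(files, 6)
results = [result for batch in batches for result in run(batch)]
```

Each call to run sees up to six files, and the last call receives whatever remains.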
HOTSPOT -
You create an experiment in Azure Machine Learning Studio. You add a training dataset that contains 10,000 rows. The first 9,000 rows represent class 0 (90 percent). The remaining 1,000 rows represent class 1 (10 percent).
The training set is imbalanced between the two classes. You must increase the number of training examples for class 1 to 4,000 by using 5 data rows. You add the Synthetic Minority Oversampling Technique (SMOTE) module to the experiment.
You need to configure the module.
Which values should you use? To answer, select the appropriate options in the dialog box in the answer area.
NOTE: Each correct selection is worth one point.
Hot Area:
Answer Area
SMOTE
Label column
Selected columns: All labels
[Launch column selector button]
SMOTE percentage
Dropdown options:
0
300
3000
4000
Number of nearest neighbors
Dropdown options:
0
1
5
4000
Random seed: 0
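The arithmetic behind the SMOTE percentage can be worked through in a few lines. The sketch assumes that a percentage of p adds p/100 times the original number of minority rows, so 100 would double the minority class. The function name is hypothetical.

```python
def smote_percentage(current_minority, target_minority):
    """Percentage of synthetic minority rows to add, relative to the current count."""
    if current_minority <= 0:
        raise ValueError("current_minority must be positive")
    if target_minority < current_minority:
        raise ValueError("target must not be smaller than the current count")
    added_rows = target_minority - current_minority
    return added_rows / current_minority * 100


# Class 1 has 1,000 rows and must grow to 4,000 rows.
percentage = smote_percentage(current_minority=1000, target_minority=4000)
```

With 1,000 existing rows and a target of 4,000, the module has to add 3,000 synthetic rows, which is 300 percent of the original minority count. The number of nearest neighbors is a separate setting that controls how many existing rows each synthetic row is derived from.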
DRAG DROP -
You configure a Deep Learning Virtual Machine for Windows.
You need to recommend tools and frameworks to perform the following:
✑ Build deep neural network (DNN) models
✑ Perform interactive data exploration and visualization
Which tools and frameworks should you recommend? To answer, drag the appropriate tools to the correct tasks. Each tool may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content.
NOTE: Each correct selection is worth one point.
Select and Place:
Tools
Vowpal Wabbit
PowerBI Desktop
Azure Data Factory
Microsoft Cognitive Toolkit
Answer Area
Task Tool
Build DNN models [Tool]
Enable interactive data exploration and visualization [Tool]
HOTSPOT -
You are working on a classification task. You have a dataset indicating whether a student would like to play soccer and associated attributes. The dataset includes the following columns:
Name Description
IsPlaySoccer Values can be 1 and 0.
Gender Values can be M or F.
PrevExamMarks Stores values from 0 to 100
Height Stores values in centimeters
Weight Stores values in kilograms
You need to classify variables by type.
Which variable should you add to each category? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Hot Area:
Answer Area
Category Variables
Categorical variables:
Gender, IsPlaySoccer
Gender, PrevExamMarks, Height, Weight
PrevExamMarks, Height, Weight
IsPlaySoccer
Continuous variables:
Gender, IsPlaySoccer
Gender, PrevExamMarks, Height, Weight
PrevExamMarks, Height, Weight
IsPlaySoccer
DRAG DROP -
You create a training pipeline using the Azure Machine Learning designer. You upload a CSV file that contains the data from which you want to train your model.
You need to use the designer to create a pipeline that includes steps to perform the following tasks:
✑ Select the training features using the pandas filter method.
✑ Train a model based on the naive_bayes.GaussianNB algorithm.
✑ Return only the Scored Labels column by using the query SELECT [Scored Labels] FROM t1;
Which modules should you use? To answer, drag the appropriate modules to the appropriate locations. Each module name may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content.
NOTE: Each correct selection is worth one point.
Select and Place:
Modules
Create Python Model
Train Model
Two Class Neural Network
Execute Python Script
Apply SQL Transformation
Select Columns in Dataset
Answer Area (Pipeline)
training-data →
Select Columns in Dataset (likely module to apply) →
Split Data →
One output goes to Train Model
The other goes directly to Score Model
Train Model ←
Receives input from the selected module (likely Two Class Neural Network)
Outputs to Score Model
Score Model →
Final step: (Empty module box — possibly Evaluate Model)
You register a file dataset named csv_folder that references a folder. The folder includes multiple comma-separated values (CSV) files in an Azure storage blob container. You plan to use the following code to run a script that loads data from the file dataset.
You create and instantiate the following variables:
You have the following code:
You need to pass the dataset to ensure that the script can read the files it references.
Which code segment should you insert to replace the code comment?
inputs=[file_dataset.as_named_input('training_files').to_pandas_dataframe()],
inputs=[file_dataset.as_named_input('training_files').as_mount()],
script_params={'--training_files': file_dataset},
inputs=[file_dataset.as_named_input('training_files')],
HOTSPOT -
You are developing a linear regression model in Azure Machine Learning Studio. You run an experiment to compare different algorithms.
The following image displays the results dataset output:
Results Table
Algorithm Mean Absolute Error Root Mean Squared Error Relative Absolute Error Relative Squared Error
Bayesian Linear 3.276025 4.655442 0.511436 0.282138
Neural Network 2.676538 3.621476 0.417847 0.17073
Boosted Decision Tree 2.168847 2.878077 0.338589 0.107831
Linear 6.350005 8.720718 0.99133 0.99002
Decision Forest 2.390206 3.315164 0.373146 0.14307
Use the drop-down menus to select the answer choice that answers each question based on the information presented in the image.
NOTE: Each correct selection is worth one point.
Answer Area
Question 1:
Which algorithm minimizes differences between actual and predicted values?
Options:
Bayesian Linear Regression
Neural Network Regression
Boosted Decision Tree Regression
Linear Regression
Decision Forest Regression
Question 2:
Which approach should you use to find the best parameters for a Linear Regression model for the Online Gradient Descent method?
Options:
Set the Decrease learning rate option to True.
Set the Decrease learning rate option to False.
Set the Create trainer mode option to Parameter Range.
Increase the number of epochs.
Decrease the number of epochs.
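Reading the results table amounts to finding the algorithm with the smallest error on each metric. The sketch below does that lookup in Python with the values copied from the table above. The dictionary layout and the short metric keys are assumptions made for illustration.

```python
# Error metrics copied from the results table (lower is better for all four).
results = {
    "Bayesian Linear": {"mae": 3.276025, "rmse": 4.655442, "rae": 0.511436, "rse": 0.282138},
    "Neural Network": {"mae": 2.676538, "rmse": 3.621476, "rae": 0.417847, "rse": 0.17073},
    "Boosted Decision Tree": {"mae": 2.168847, "rmse": 2.878077, "rae": 0.338589, "rse": 0.107831},
    "Linear": {"mae": 6.350005, "rmse": 8.720718, "rae": 0.99133, "rse": 0.99002},
    "Decision Forest": {"mae": 2.390206, "rmse": 3.315164, "rae": 0.373146, "rse": 0.14307},
}


def best_algorithm(metric):
    """Return the algorithm with the lowest value for the given error metric."""
    return min(results, key=lambda name: results[name][metric])


# One winner per metric; all four metrics measure prediction error.
best_per_metric = {metric: best_algorithm(metric) for metric in ("mae", "rmse", "rae", "rse")}
```

In this table the same algorithm has the lowest value on all four metrics, so the choice does not depend on which error measure is preferred.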
You use Azure Machine Learning designer to create a real-time service endpoint. You have a single Azure Machine Learning service compute resource. You train the model and prepare the real-time pipeline for deployment. You need to publish the inference pipeline as a web service.
Which compute type should you use?
HDInsight
Azure Databricks
Azure Kubernetes Service
the existing Machine Learning Compute resource
a new Machine Learning Compute resource