Preparing Data for Feature Engineering and Machine Learning in Microsoft Azure Flashcards Preview

Flashcards in Preparing Data for Feature Engineering and Machine Learning in Microsoft Azure Deck (10)
1

What issue are you most likely to face with a credit card fraud detection dataset?

Problem of outliers

Problem of high-dimensionality

Problem of imbalanced data

Multicollinearity problem

Problem of imbalanced data
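
To see why imbalance matters for fraud data, here is a minimal sketch. The labels are made up for illustration (995 legitimate rows, 5 fraud rows); they are not taken from any real dataset. It shows that a model which never predicts fraud still scores very high accuracy while catching no fraud at all.

```python
from collections import Counter

# Hypothetical labels: 1 = fraud, 0 = legitimate (made-up illustrative data).
labels = [0] * 995 + [1] * 5

counts = Counter(labels)
fraud_rate = counts[1] / len(labels)

# A model that always predicts "legitimate" never flags fraud,
# yet its accuracy equals the share of the majority class.
always_legit_accuracy = counts[0] / len(labels)
fraud_recall = 0 / counts[1]  # it catches none of the fraud cases

print(f"fraud rate: {fraud_rate:.3f}")
print(f"accuracy of always predicting 0: {always_legit_accuracy:.3f}")
print(f"recall on fraud: {fraud_recall:.1f}")
```

This is why accuracy alone is a poor metric on imbalanced data; recall or precision on the minority class is usually more informative.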

2

What typically happens to training and test accuracy when we increase the amount of training data for a machine learning problem?

A. The training accuracy increases, test accuracy decreases

B. The training accuracy increases, test accuracy increases

C. The training accuracy decreases, test accuracy decreases

D. The training accuracy decreases, test accuracy increases

D. The training accuracy decreases, test accuracy increases

3

Under which missing-data assumption can you safely delete the records with missing values?

Missing at Random

Missing Completely at Random

Either of MCAR or MAR

Missing not at Random

Missing Completely at Random

4

Which is the best method for handling missing data when the feature has outliers?

Mode imputation

Mean imputation

Listwise deletion

Median imputation

Median imputation
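
A small sketch of why the median is preferred here. The amounts below are hypothetical, with one deliberate outlier; the point is only to compare the two fill values, not to prescribe an imputation pipeline.

```python
from statistics import mean, median

# Hypothetical transaction amounts; None marks a missing value,
# and 5000 is a deliberate outlier.
amounts = [20, 25, None, 30, 22, 5000, None, 28]

observed = [a for a in amounts if a is not None]
mean_fill = mean(observed)      # pulled far upward by the outlier
median_fill = median(observed)  # stays close to the typical values

# Fill the gaps with the median.
imputed = [median_fill if a is None else a for a in amounts]

print("mean fill:", mean_fill)
print("median fill:", median_fill)
```

The mean is dragged toward the outlier, so mean-imputed rows would look nothing like a typical record, whereas the median stays within the bulk of the observed values.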

5

Which of the following Machine Learning models does not have any target value?

Clustering

Anomaly detection

Regression

Classification

Clustering

6

For which of the following machine learning models is the target a continuous value?

Regression

Classification

Anomaly detection

Clustering

Regression

7

Which of the following is the BEST way to create features for high-cardinality categorical data?

One-hot encoding

Learning with counts

Dummy coding

Binning

Learning with counts
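
The idea behind count-based features can be sketched as follows. This is a simplified illustration, not the Azure module's implementation: the rows, the `merchant_id` column, and the `count_features` helper are all hypothetical. Each category value is replaced by a few numbers (label counts and a positive rate) instead of one column per distinct value.

```python
from collections import defaultdict

# Hypothetical (merchant_id, is_fraud) rows.
rows = [("m1", 0), ("m1", 0), ("m1", 1), ("m2", 0), ("m3", 1), ("m3", 1)]

# Count table: category -> [count of label 0, count of label 1]
count_table = defaultdict(lambda: [0, 0])
for merchant, label in rows:
    count_table[merchant][label] += 1

def count_features(merchant):
    """Return (negatives, positives, positive rate) for a category."""
    neg, pos = count_table.get(merchant, (0, 0))  # unseen category -> zeros
    total = neg + pos
    rate = pos / total if total else 0.0
    return neg, pos, rate

# Three numeric features per row, regardless of how many merchants exist.
features = [count_features(m) for m, _ in rows]
print(features[0])
```

With one-hot encoding the feature count grows with the number of distinct categories; here it stays fixed. In practice the counts are usually built from data kept separate from the training rows, to limit label leakage.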

8

Which of the following is a disadvantage of linear models?

They run slower

They are not scalable

They may not give accurate predictions

They are harder to train

They may not give accurate predictions

9

What is TRUE about leave-one-out cross-validation?

It produces low bias and high variance models

It produces low bias and low variance models

It produces high bias and low variance models

It produces high bias and high variance models

It produces low bias and high variance models
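
For reference, the mechanics of leave-one-out can be sketched in a few lines. The values are hypothetical and the "model" is just the mean of the training values, chosen only to keep the example self-contained.

```python
# Hypothetical target values; the "model" simply predicts the training mean.
values = [3.0, 5.0, 4.0, 10.0]

errors = []
for i, held_out in enumerate(values):
    # Train on every row except row i, then test on row i alone.
    training = values[:i] + values[i + 1:]
    prediction = sum(training) / len(training)
    errors.append((held_out - prediction) ** 2)

# One error per row: n rows means n train/test rounds.
loo_mse = sum(errors) / len(errors)
print("leave-one-out MSE:", loo_mse)
```

Each round trains on almost the whole dataset, which keeps bias low, but the training sets overlap heavily and every test set is a single row, which is the usual argument for the high variance.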

10

Suppose you need to create 7 folds for k-fold cross-validation. How would you do it?

Use Partition and Sample module with 'Assign to folds' mode

Use Partition and Sample module with 'Pick folds' mode

Use Split data module to assign folds

Use the Cross-validate model module

Use Partition and Sample module with 'Assign to folds' mode
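
Outside the designer, the same idea of tagging each row with a fold number can be sketched in plain Python. This is an illustration under assumptions, not the module's implementation: the `assign_to_folds` helper, the row count of 20, and the seed are all hypothetical.

```python
import random

def assign_to_folds(n_rows, n_folds=7, seed=0):
    """Return a list giving each row a fold number in 0..n_folds-1."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)  # randomize which rows share a fold
    folds = [0] * n_rows
    for position, row in enumerate(indices):
        folds[row] = position % n_folds
    return folds

fold_of_row = assign_to_folds(n_rows=20, n_folds=7)
fold_sizes = [fold_of_row.count(k) for k in range(7)]

# Each fold serves once as the validation set; the rest form the training set.
for k in range(7):
    validation = [i for i, f in enumerate(fold_of_row) if f == k]
    training = [i for i, f in enumerate(fold_of_row) if f != k]
    print(f"fold {k}: {len(training)} training rows, {len(validation)} validation rows")
```

Assigning fold numbers once and reusing them keeps the splits identical across models, so their cross-validation scores are comparable.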