ML Foundation Set 2 (Answers) Flashcards

1
Q

Which of the following is NOT supervised learning?

a. PCA
b. Decision Tree
c. Linear Regression
d. Naive Bayesian

A

a. PCA

2
Q

Which of the following statements about Naive Bayes is incorrect?

a. Attributes are equally important.
b. Attributes are statistically dependent on one another given the class value.
c. Attributes are statistically independent of one another given the class value.
d. Attributes can be nominal or numeric

A

b. Attributes are statistically dependent on one another given the class value.

3
Q

Among the following options, identify the one that is not a type of machine learning.

a. Semi unsupervised learning
b. Supervised learning
c. Reinforcement learning
d. unsupervised learning

A

a. Semi unsupervised learning

4
Q

Identify the kind of learning algorithm for “facial identities and facial expressions”.

a. Prediction
b. Recognise patterns
c. Recognising anomalies
d. Generating patterns

A

b. Recognise patterns

5
Q

Identify the model which is trained with data in only a single batch.

a. Offline learning
b. Batch learning
c. Both A and B
d. None

A

c. Both offline learning and batch learning

6
Q

What is the application of machine learning methods to a large database called?

a. Big data computing
b. Internet of things
c. Data mining
d. Artificial intelligence

A

c. Data mining

7
Q

Identify the type of learning in which labelled training data is used.

a. Clustering
b. Supervised learning
c. Reinforcement learning
d. unsupervised learning

A

b. Supervised learning

8
Q

Identify whether true or false: In PCA, the number of input dimensions is equal to the number of principal components.

a. True
b. False

A

a. True

9
Q

Among the following, identify the one that dimensionality reduction reduces.

a. Performance
b. Entropy
c. Stochastics
d. Collinearity

A

d. Collinearity

10
Q

Which of the following machine learning algorithms is based on the idea of bagging?

a. Decision tree
b. Random forest
c. SVM
d. Regression

A

b. Random forest

11
Q

Choose a disadvantage of decision trees:

a. Decision trees are robust to outliers
b. Factor analysis
c. Decision trees are prone to overfit
d. All of the above

A

c. Decision trees are prone to overfit

12
Q

What is the sample data on which machine learning algorithms build a model called?

a. Data training
b. Training data
c. Transfer data
d. None of the above

A

b. Training data

13
Q

Machine learning is a subset of which of the following?

a. Artificial intelligence
b. Deep learning
c. Data learning
d. None of the above

A

a. Artificial intelligence

14
Q

Which of the following machine learning techniques helps in detecting the outliers in data?

a. Classification
b. Clustering
c. Anomaly detection
d. All of the above

A

c. Anomaly detection

15
Q

The father of machine learning is _____________

a. Geoffrey Everest Hinton
b. Geoffrey Hill
c. Geoffrey Chaucer
d. None of the above

A

a. Geoffrey Everest Hinton

16
Q

The most significant phase in genetic algorithm is _________

a. Mutation
b. Selection
c. Fitness function
d. Crossover

A

d. Crossover

17
Q

Which of the following are common classes of problems in machine learning?

a. Regression
b. Classification
c. Clustering
d. All of the above

A

d. All of the above

18
Q

Among the following options identify the one which is FALSE regarding regression.

a. It is used for prediction
b. It is used for interpretation
c. It relates inputs to outputs
d. It discovers causal relationships

A

d. It discovers causal relationships

19
Q

Identify the successful applications of ML.

a. Learning to classify new astronomical structures
b. Learning to recognize spoken words
c. Learning to drive an autonomous vehicle
d. All of these choices

A

d. All of these choices

20
Q

Among the function representations used in machine learning, identify the one that is NOT a numerical function.

a. Case-based
b. Support vector machines
c. Linear regression
d. Neural network

A

a. Case-based

21
Q

Which examples does the FIND-S algorithm ignore?

a. Positive
b. Negative
c. Both
d. None

A

b. Negative

22
Q

Neuro software is ______

a. It is software used by neurosurgeons
b. designed to aid experts in real world
c. it is powerful and easy neural network
d. a software used to analyze neurons

A

c. it is powerful and easy neural network

23
Q

Choose whether the following statement is true or false: The backpropagation law is also known
as the generalized Delta rule

a. True
b. False

A

a. True

24
Q

Choose the general limitations of the backpropagation rule among the following.

a. Slow convergence
b. Scaling
c. Local minima problem
d. All of these choices

A

d. All of these choices

25
Q

Analysis of ML algorithm needs

a. Statistical learning theory
b. Computational learning theory
c. Both A and B
d. None of the above

A

c. Both A and B

26
Q

Choose the most widely used metrics and tools to assess the classification models.

a. The area under the ROC curve
b. Confusion matrix
c. Cost-sensitive accuracy
d. All of the above

A

d. All of the above
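
As a rough illustration of the confusion matrix named in option b, the sketch below counts the four cells for a binary classifier. The helper name `confusion_counts` and the toy labels are assumptions made for this example, not part of the original card.

```python
from collections import Counter

def confusion_counts(actual, predicted, positive):
    """Count true/false positives and negatives for one positive class."""
    counts = Counter()
    for a, p in zip(actual, predicted):
        if p == positive:
            # Predicted positive: correct if the actual label is positive too.
            counts["tp" if a == positive else "fp"] += 1
        else:
            # Predicted negative: a miss if the actual label was positive.
            counts["fn" if a == positive else "tn"] += 1
    return counts

# Hypothetical labels, invented for illustration only.
actual = ["spam", "spam", "ham", "ham", "spam"]
predicted = ["spam", "ham", "ham", "spam", "spam"]
cells = confusion_counts(actual, predicted, positive="spam")
print(dict(cells))
```

Accuracy, precision, and recall can all be derived from these four counts, which is one reason the confusion matrix is so widely used.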

27
Q

Identify the difficulties with the k-nearest neighbor algorithm.

a. Curse of dimensionality
b. Calculate the distance of the test case from all training cases
c. Both A and B
d. None of the above

A

c. Both A and B
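
Option b can be seen directly in a minimal k-nearest-neighbor sketch: every prediction loops over the whole training set to compute distances. The function name `knn_predict` and the toy points are assumptions for illustration.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    distances = []
    # The cost noted in option b: one distance per training case, per query.
    for point, label in zip(train_points, train_labels):
        distances.append((math.dist(point, query), label))
    distances.sort(key=lambda pair: pair[0])
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Hypothetical 2-D training data with two well-separated classes.
train_points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
                (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_labels = ["a", "a", "a", "b", "b", "b"]
print(knn_predict(train_points, train_labels, (1.1, 1.0)))
```

As the number of features grows, distances between points become less informative, which is the curse of dimensionality from option a.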

28
Q

Which one of the following is also called exploratory learning?

a. supervised learning
b. active learning
c. unsupervised learning
d. reinforcement learning

A

c. unsupervised learning

29
Q

In which of the following types of learning does the teacher return reward and punishment to the learner?

a. active learning
b. reinforcement learning
c. supervised learning
d. unsupervised learning

A

b. reinforcement learning

30
Q

The output of the training process in machine learning is __________

a. machine learning model
b. machine learning algorithm
c. null
d. accuracy

A

a. machine learning model

31
Q

What does K stand for in the K-means algorithm?

a. Number of clusters
b. Number of data
c. Number of attributes
d. Number of iterations

A

a. Number of clusters
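
To make the role of K concrete, here is a minimal one-dimensional K-means sketch in which `k` fixes how many centroids (and therefore clusters) are produced. The function name `k_means_1d`, the deterministic initialisation, and the sample values are all assumptions for illustration. Real implementations usually initialise randomly or with k-means++.

```python
def k_means_1d(values, k, iterations=10):
    """Cluster 1-D values into k groups; returns (centroids, clusters)."""
    ordered = sorted(values)
    # Assumed initialisation: spread the starting centroids across the sorted data.
    step = len(ordered) / k
    centroids = [ordered[int(i * step)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        # Assignment step: each value joins its nearest centroid.
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

# Hypothetical unlabeled data with two obvious groups.
centroids, clusters = k_means_1d([1.0, 1.5, 2.0, 10.0, 10.5, 11.0], k=2)
print(centroids, clusters)
```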

32
Q

Which of the following is/are true?

1. K-means algorithm is for clustering of unlabeled data
2. KNN is unsupervised learning
3. The k in KNN stands for the number of clusters
4. None of the above

a. 1 & 3
b. 1, 2 & 3
c. 1 only
d. 4 only

A

c. 1 only

33
Q

Which of the following is NOT supervised learning?

a. Number of groups is known
b. Features of group explicitly stated
c. Neither features nor number of groups is known
d. none of the above

A

c. Neither features nor number of groups is known

34
Q

Which of the following is FALSE about SVM?

a. SVM aims to maximise the separation between the support vectors
b. SVM achieves non-linear separation by reducing the dimension with its kernel
c. Kernels add additional dimensions to make linearly non-separable data separable.
d. Slack variables can be used to control the amount of allowed training errors

A

b. SVM achieves non-linear separation by reducing the dimension with its kernel

35
Q

What are the applications of Natural Language Processing (NLP)?

a. spam detection
b. sentiment analysis
c. movie recommendations
d. all the above

A

d. all the above

36
Q

Which of the following is NOT an essential step for Deep Learning?

a. Data collection and labelling
b. Feature engineering
c. Model training
d. Model evaluation and fine tuning

A

b. Feature engineering

37
Q

Which of the following is NOT a machine learning algorithm?

a. SVM
b. SVG
c. Random Forest
d. None of the above

A

b. SVG

38
Q

Which of the following concerning categorical data is true?

a. We can use the get_dummies method to convert ordinal data
b. for ordinal data the order of data carries information/weightage (for example primary, secondary,
tertiary etc)
c. gender (e.g. male and female) is ordinal data
d. we should use the replace method with a dictionary to convert nominal data

A

b. for ordinal data the order of data carries information/weightage (for example primary, secondary,
tertiary etc)
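
The difference between ordinal and nominal encoding can be sketched in plain Python. This is a library-free stand-in for the pandas `replace` and `get_dummies` methods the options mention. The numeric ranks and sample values are assumptions for illustration.

```python
# Ordinal data: the order carries information, so map each level to a rank.
# The ranks 1, 2, 3 are an assumed encoding; only their order matters.
education_order = {"primary": 1, "secondary": 2, "tertiary": 3}
education = ["secondary", "primary", "tertiary"]
ordinal_codes = [education_order[level] for level in education]

# Nominal data: no order exists, so use one indicator column per category.
gender = ["male", "female", "male"]
categories = sorted(set(gender))
one_hot = [[1 if value == c else 0 for c in categories] for value in gender]

print(ordinal_codes)
print(categories, one_hot)
```

Mapping a nominal column such as gender to 1 and 2 would imply an ordering that does not exist, which is why options a, c, and d are false.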

39
Q

Which of the following statements is FALSE?

a. Underfitting is when a model learns the training data very well but not the test data
b. An overfitting model tends to be more complex than an underfitting model.
c. Underfitting draws a very simple relationship between the input features and the output target
d. Overfitting does not perform well on the test data

A

a. Underfitting is when a model learns the training data very well but not the test data

40
Q

What is the reason for data preprocessing?

a. Clean up missing data
b. Handling outliers
c. scale data to a suitable range
d. All the above

A

d. All the above

41
Q

How do you handle missing or corrupted data in a dataset?

a. Drop missing rows or columns
b. Replace missing values with mean/median/mode
c. Assign a unique category to missing values
d. All of the above

A

d. All of the above
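
The three strategies in options a to c can be sketched on a tiny in-memory dataset. The rows, column names, and the "Unknown" label are hypothetical. A real project would more likely use a dataframe library.

```python
from statistics import mean

# Hypothetical records; None marks a missing value.
rows = [
    {"age": 25, "city": "Paris"},
    {"age": None, "city": "Rome"},
    {"age": 35, "city": None},
]

# Option a: drop any row that contains a missing value.
dropped = [r for r in rows if None not in r.values()]

# Option b: replace missing numeric values with the column mean.
age_mean = mean(r["age"] for r in rows if r["age"] is not None)
imputed = [{**r, "age": r["age"] if r["age"] is not None else age_mean}
           for r in rows]

# Option c: give missing categorical values their own category.
flagged = [{**r, "city": r["city"] if r["city"] is not None else "Unknown"}
           for r in rows]

print(len(dropped), imputed[1]["age"], flagged[2]["city"])
```

Which strategy is appropriate depends on how much data is missing and whether the absence itself is informative.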

42
Q

When performing regression or classification, which of the following is the correct way to preprocess the data?

a. Normalize the data -> PCA -> training
b. PCA -> normalize PCA output -> training
c. Normalize the data -> PCA -> normalize PCA output -> training
d. None of the above

A

a. Normalize the data -> PCA -> training

43
Q

Predicting whether a tumour is malignant or benign is an example of?

a. Unsupervised Learning
b. Supervised Regression Problem
c. Supervised Classification Problem
d. Categorical Attribute

A

c. Supervised Classification Problem

44
Q

Engineering a good feature space is a crucial ___ for the success of any machine learning model.

a. Pre-requisite
b. Process
c. Objective
d. None of the above

A

a. Pre-requisite

45
Q

The transformations applied to the identified data before feeding the same into the algorithm is called:

a. Problem Identification
b. Identification of Required Data
c. Data Pre-processing
d. Definition of Training Data Set

A

c. Data Pre-processing

46
Q

Which of the following are not classification problems?
(Choose two)

a. Predicting price of house
b. Predicting patient has tumor
c. Predicting who will hold the title in football league
d. Predicting percentage of student for next semester

A

a. Predicting price of house
d. Predicting percentage of student for next semester

47
Q

Out of 200 emails, a classification model correctly predicted 150 spam emails and 30 ham emails.
What is the accuracy of the model?

a. 10%
b. 90%
c. 75%
d. None of the above

A

b. 90%
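
The arithmetic behind this answer can be checked with a short sketch. The function name `accuracy` is hypothetical, and the example assumes the 150 spam and 30 ham emails are the only correct predictions among the 200.

```python
def accuracy(correct_predictions, total_samples):
    """Fraction of samples the model classified correctly."""
    if total_samples <= 0:
        raise ValueError("total_samples must be positive")
    return correct_predictions / total_samples

# Figures taken from the question: 150 spam + 30 ham predicted correctly.
correct = 150 + 30
total = 200
print(f"{accuracy(correct, total):.0%}")  # prints 90%
```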

48
Q

Which neural network architecture would be most suited to handle an image identification problem (recognizing a dog in a photo)?

a. Multi Layer Perceptron
b. Convolutional Neural Network
c. Recurrent Neural network
d. Perceptron

A

b. Convolutional Neural Network

49
Q

Price prediction in the domain of real estate is an example of?

a. Unsupervised Learning
b. Supervised Regression Problem
c. Supervised Classification Problem
d. Unsupervised regression problem

A

b. Supervised Regression Problem