Naïve Bayes 2 Flashcards

1
Q

What is the key assumption made by the Naïve Bayes algorithm?
A. All variables are numerical
B. Predictor variables are independent
C. Predictor variables are correlated
D. The data must be normally distributed

A

Answer: B. Predictor variables are independent
Explanation: Naïve Bayes assumes that the predictors are conditionally independent given the class label. This is known as the “naïve” assumption.
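
For example, with predictors x1, ..., xp and class C, the assumption lets the joint likelihood factor as:

P(x1, ..., xp∣C) = P(x1∣C) × P(x2∣C) × ... × P(xp∣C)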

2
Q

Which of the following is NOT a strength of Naïve Bayes?
A. It works well with high-dimensional data
B. It handles missing data naturally
C. It’s fast and easy to implement
D. It requires large amounts of training data

A

Answer: D. It requires large amounts of training data
Explanation: Naïve Bayes is effective even with small datasets and is easy to train, but its independence assumption can be limiting.

3
Q

What should be done to numerical variables before using Naïve Bayes?
A. Normalize them
B. Drop them
C. Bin and convert to categorical
D. Standardize them

A

Answer: C. Bin and convert to categorical
Explanation: Naïve Bayes requires categorical inputs, so numeric data must be binned into categories.
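
A minimal sketch of binning in Python with pandas; the column values and bin edges below are hypothetical:

import pandas as pd

# Hypothetical numeric column: temperatures in degrees Fahrenheit
temps = pd.Series([30, 45, 62, 75, 88])

# Bin into named categories so Naive Bayes can treat the variable as categorical
temp_cat = pd.cut(temps, bins=[0, 50, 70, 100], labels=["Cold", "Mild", "Hot"])
print(temp_cat.tolist())  # ['Cold', 'Cold', 'Mild', 'Hot', 'Hot']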

4
Q

In the given classification example, what is the final classification result for the day described as Sunny, Cold, High Humidity, and Windy?
A. Play
B. No Play

A

Answer: B. No Play
Explanation: In the worked example, the probability score for No Play (0.9) is greater than the score for Play (0.2), so the day is classified as No Play.
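
A minimal sketch of that comparison in Python; the priors and conditional probabilities below are illustrative stand-ins for the frequency tables in the original image:

# Hypothetical prior probabilities
p_play, p_no_play = 0.64, 0.36

# Hypothetical P(feature∣class) values for Sunny, Cold, High Humidity, Windy
cond_play = [0.2, 0.3, 0.3, 0.3]
cond_no_play = [0.6, 0.4, 0.8, 0.6]

score_play = p_play
for p in cond_play:
    score_play *= p          # multiply in each conditional probability

score_no_play = p_no_play
for p in cond_no_play:
    score_no_play *= p

# The class with the larger score wins
print("Play" if score_play > score_no_play else "No Play")  # No Play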

5
Q

Which of the following problems can occur in Naïve Bayes if a predictor value never occurs with a certain class in the training data?
A. Overfitting
B. Bias-variance tradeoff
C. Zero conditional probability
D. Missing data issue

A

Answer: C. Zero conditional probability
Explanation: If a predictor value never appears with a class in training data, the probability becomes 0, which breaks the classification calculation.
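
For instance (hypothetical values): 0.6 × 0.4 × 0 × 0.8 = 0, so a single unseen predictor value eliminates the class no matter how strong the other evidence is.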

6
Q

What is the purpose of using a probability cutoff in Naïve Bayes classification?
A. To improve the independence assumption
B. To reduce the number of predictors
C. To handle continuous variables
D. To define the threshold for class assignment

A

Answer: D. To define the threshold for class assignment
Explanation: A cutoff (e.g., 0.5) allows classification decisions to be based on whether predicted probabilities exceed a certain threshold.
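
A minimal sketch, assuming a binary problem and a hypothetical posterior probability:

# Hypothetical posterior probability for the class of interest
p_class = 0.62
cutoff = 0.5  # threshold for class assignment

print("assign class" if p_class >= cutoff else "assign other class")  # assign class

Raising the cutoff above 0.5 makes the model more conservative about assigning the class of interest.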

7
Q

What type of learning does Naïve Bayes fall under?
A. Reinforcement learning
B. Unsupervised learning
C. Supervised learning
D. Semi-supervised learning

A

Answer: C. Supervised learning
Explanation: It uses labeled training data to learn how to classify new instances.

8
Q

Which real-world application was mentioned as an example of Naïve Bayes usage?
A. Stock price prediction
B. Spell check programs
C. Video recommendation systems
D. Customer segmentation

A

Answer: B. Spell check programs
Explanation: Naïve Bayes can classify misspelled words into the most probable correct word class.

9
Q

Why is Naïve Bayes considered “naïve”?
A. It is not effective on real data
B. It assumes perfect predictions
C. It assumes independence between features
D. It is an outdated method

A

Answer: C. It assumes independence between features
Explanation: Despite often being inaccurate, the independence assumption simplifies computation significantly.

10
Q

Which of the following best describes how Naïve Bayes classifies a new record?
A. It finds exact matches in the training data.
B. It uses the average of the nearest neighbors.
C. It calculates the likelihood of each class given the predictors.
D. It builds a decision tree from training data.

A

Answer: C. It calculates the likelihood of each class given the predictors.
Explanation: Naïve Bayes uses Bayes’ Theorem to calculate P(Class∣X), then selects the class with the highest posterior probability.
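
A minimal sketch of the selection step in Python; the priors and likelihoods are hypothetical:

# Hypothetical priors P(class) and likelihoods P(X∣class)
priors = {"Play": 0.64, "No Play": 0.36}
likelihoods = {"Play": 0.0054, "No Play": 0.1152}

# Score each class as P(class) * P(X∣class), then pick the maximum
scores = {c: priors[c] * likelihoods[c] for c in priors}
print(max(scores, key=scores.get))  # No Play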

11
Q

In Naïve Bayes, how is the joint probability P(X∣Class) typically computed?
A. As the product of individual conditional probabilities
B. Using a regression function
C. Through k-nearest neighbors
D. By summing class frequencies

A

Answer: A. As the product of individual conditional probabilities
Explanation: Because Naïve Bayes assumes the features are conditionally independent given the class, the joint likelihood P(X∣Class) factors into the product of the individual conditional probabilities.
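
For example, with three predictor values whose hypothetical conditional probabilities are 0.5, 0.3, and 0.8: P(X∣Class) = 0.5 × 0.3 × 0.8 = 0.12.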

12
Q

When applying Naïve Bayes, what is one way to handle the zero-probability problem?
A. Increase the cutoff threshold
B. Use Laplace smoothing
C. Use fewer predictor variables
D. Convert all data to numerical

A

Answer: B. Use Laplace smoothing
Explanation: Laplace smoothing adds a small constant to frequency counts to prevent any probability from being zero.
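
A minimal sketch of add-one (Laplace) smoothing in Python; the counts are hypothetical:

# A value that never co-occurred with this class in training
count_value_and_class = 0
count_class = 20     # training records belonging to this class
n_categories = 3     # distinct values this predictor can take

# Without smoothing: 0/20 = 0, which zeroes out the whole product
p_unsmoothed = count_value_and_class / count_class

# With smoothing: add 1 to the numerator and n_categories to the denominator
p_smoothed = (count_value_and_class + 1) / (count_class + n_categories)
print(p_unsmoothed, p_smoothed)  # 0.0 and roughly 0.0435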

13
Q

What type of variables are required to apply Naïve Bayes directly?
A. Only numerical
B. Only binary
C. Categorical
D. Ordinal

A

Answer: C. Categorical
Explanation: Naïve Bayes needs predictors to be categorical; numerical variables must be binned beforehand.

14
Q

Suppose a predictor value never occurred with a class label in training. What impact does this have?
A. Makes that record unclassifiable
B. Reduces model accuracy slightly
C. Makes the posterior probability for that class zero
D. Causes overfitting

A

Answer: C. Makes the posterior probability for that class zero
Explanation: If any conditional probability is zero, the whole product becomes zero, eliminating the class from consideration.

15
Q

Which of the following assumptions makes Naïve Bayes “naïve”?
A. Data is normally distributed
B. Predictors are correlated
C. Predictors are conditionally independent given the class
D. The training data is large and diverse

A

Answer: C. Predictors are conditionally independent given the class
Explanation: This simplifying assumption makes calculations tractable but is rarely true in practice.

16
Q

What does the denominator P(X) in Bayes’ Theorem represent?
A. The prior probability of the class
B. The likelihood of the class
C. The marginal probability of the observed predictor values
D. The sum of all prior probabilities

A

Answer: C. The marginal probability of the observed predictor values
Explanation: P(X) ensures that the posterior probabilities across all classes sum to 1.
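
A minimal sketch of that normalization in Python; the scores are hypothetical:

# Hypothetical unnormalized scores P(class) * P(X∣class)
scores = {"Spam": 0.0012, "Not Spam": 0.0048}

# P(X) is the marginal: the sum of the scores over all classes
p_x = sum(scores.values())

# Dividing by P(X) yields posteriors that sum to 1
posteriors = {c: s / p_x for c, s in scores.items()}
print(posteriors)  # ≈ {'Spam': 0.2, 'Not Spam': 0.8}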

17
Q

In a spam detection application, which of the following would be a “class”?
A. Frequency of the word “free”
B. Total number of emails
C. Label: Spam or Not Spam
D. Subject line length

A

Answer: C. Label: Spam or Not Spam
Explanation: The “class” is what the model is trying to predict — in this case, whether an email is spam.

18
Q
What does it mean if P(Class∣X) is high?
A. The model is confident that X belongs to that class
B. X occurs frequently in the dataset
C. The class is rare in the data
D. The cutoff threshold is too high

A

Answer: A. The model is confident that X belongs to that class
Explanation: A higher posterior probability indicates higher confidence in the class prediction for record X.