KNN Flashcards

(36 cards)

1
Q

What type of model is k-NN?

A

Non-parametric.

2
Q

Does k-NN train model parameters like θ?

A

No, it stores the training data instead.

3
Q

How does k-NN make predictions for classification?

A

By majority vote of the k closest training points.

4
Q

How does k-NN make predictions for regression?

A

By averaging the outputs (y) of the k nearest points.
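
Both prediction rules fit in a few lines. A minimal sketch using NumPy; the tiny dataset and the helper name knn_predict are made up for illustration:

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_new, k=3, task="classification"):
    # Euclidean distance from x_new to every training point.
    dists = np.sqrt(((X_train - x_new) ** 2).sum(axis=1))
    # Indices of the k closest training points.
    nearest = np.argsort(dists)[:k]
    if task == "classification":
        # Majority vote among the k nearest labels.
        return Counter(y_train[nearest]).most_common(1)[0][0]
    # Regression: average the k nearest outputs.
    return y_train[nearest].mean()

X_train = np.array([[1.0, 2.0], [2.0, 1.0], [8.0, 9.0], [9.0, 8.0]])
y_class = np.array([0, 0, 1, 1])
y_reg = np.array([1.5, 2.0, 8.5, 9.0])

print(knn_predict(X_train, y_class, np.array([1.5, 1.5]), k=3))                   # -> 0
print(knn_predict(X_train, y_reg, np.array([8.5, 8.5]), k=2, task="regression"))  # -> 8.75
```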

5
Q

What does a small k value do in k-NN?

A

Makes the model highly flexible and sensitive to noise.

6
Q

What happens when k is too small in k-NN?

A

The model overfits (low bias, high variance).

7
Q

What happens when k is too large in k-NN?

A

The model underfits (high bias, low variance).
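
The overfitting/underfitting pattern of the last three cards can be seen by comparing train and test accuracy as k varies. A quick sketch, assuming scikit-learn is available; the dataset and the k values are arbitrary choices:

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = make_moons(n_samples=400, noise=0.3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for k in (1, 15, 101):
    model = KNeighborsClassifier(n_neighbors=k).fit(X_tr, y_tr)
    # k=1 tends to score ~1.0 on train but worse on test (overfitting);
    # a very large k flattens both scores (underfitting).
    print(k, model.score(X_tr, y_tr), model.score(X_te, y_te))
```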

8
Q

What is the most common strategy for handling ties in k-NN classification?

A

Use odd values of k to reduce tie probability.

9
Q

What is the Euclidean distance formula?

A

d(x, y) = √(Σᵢ(xᵢ - yᵢ)²)

10
Q

What is the Manhattan distance formula?

A

d(x, y) = Σᵢ|xᵢ - yᵢ|
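
Both formulas in code, on a made-up pair of vectors:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 0.0, 3.0])

euclidean = np.sqrt(((x - y) ** 2).sum())  # √(9 + 4 + 0) = √13 ≈ 3.61
manhattan = np.abs(x - y).sum()            # 3 + 2 + 0 = 5
```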

11
Q

When is cosine similarity useful in k-NN?

A

When direction matters more than magnitude (e.g. text data).
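
A minimal sketch in NumPy (for k-NN, similarity is typically converted to a distance as 1 − similarity):

```python
import numpy as np

def cosine_similarity(a, b):
    # Dot product divided by the product of the norms: 1.0 means the
    # vectors point in the same direction, regardless of their lengths.
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

a = np.array([1.0, 2.0, 0.0])
print(cosine_similarity(a, 10 * a))  # -> 1.0: scaling changes magnitude, not direction
```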

12
Q

What is one-hot encoding used for in k-NN?

A

To handle categorical variables.

13
Q

What distance metric can be used with one-hot encoded features?

A

Hamming distance.
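
A tiny illustration with a hypothetical three-colour feature; Hamming distance just counts the positions where two encodings differ:

```python
# Hypothetical 3-category colour feature, one-hot encoded.
red   = [1, 0, 0]
green = [0, 1, 0]

# Hamming distance: number of positions where the encodings differ.
hamming = sum(r != g for r, g in zip(red, green))  # -> 2
```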

14
Q

What is Jaccard similarity used for?

A

Comparing overlap between sets.
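
A minimal sketch with made-up sets:

```python
def jaccard(a: set, b: set) -> float:
    # |intersection| / |union|
    return len(a & b) / len(a | b)

print(jaccard({"cat", "dog", "fish"}, {"dog", "fish", "bird"}))  # 2/4 = 0.5
```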

15
Q

What is the main drawback of k-NN?

A

Slow prediction and high memory usage.

16
Q

What do k-NN prediction curves show in classification tasks?

A

How predicted probabilities change across the input space.

17
Q

What does a sharp transition in a k-NN prediction curve indicate?

A

The model is highly sensitive to local data; likely overfitting.

18
Q

What does a smooth transition in a k-NN prediction curve indicate?

A

The model is generalizing better; less variance.

19
Q

What does a very flat k-NN prediction curve suggest?

A

The model is underfitting; it’s not responsive to class boundaries.

20
Q

How does increasing k affect the shape of the k-NN decision boundary?

A

It makes the boundary smoother and less sensitive to individual points.

21
Q

How does decreasing k affect the shape of the k-NN decision boundary?

A

It makes the boundary more complex and sensitive to noise.

22
Q

What is the tradeoff shown in k-NN decision boundary plots?

A

The bias-variance tradeoff.

23
Q

What kind of bias and variance do small k values typically produce?

A

Low bias, high variance.

24
Q

What kind of bias and variance do large k values typically produce?

A

High bias, low variance.

25

Q

Why are odd values of k often preferred in classification?

A

To avoid tie votes in binary classification, where an odd k cannot split the vote evenly.

26

Q

How is the best k usually chosen?

A

Using cross-validation.
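
One common recipe, assuming scikit-learn is available (the iris dataset and the range of k are stand-ins): score each candidate k by cross-validation and keep the best.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Mean 5-fold accuracy for each candidate k; keep the best-scoring one.
scores = {
    k: cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5).mean()
    for k in range(1, 21)
}
best_k = max(scores, key=scores.get)
```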

27

Q

What is a key advantage of k-NN?

A

It is simple and easy to implement.

28

Q

Does k-NN require training?

A

No, it is a lazy learner that defers computation to prediction time.

29

Q

Can k-NN adapt to complex decision boundaries?

A

Yes, especially with small k values.

30

Q

Does k-NN work with both classification and regression?

A

Yes, it supports both.

31

Q

Why is k-NN considered non-parametric?

A

Because it makes no fixed assumptions about the form of the data distribution; its complexity grows with the training data rather than being set by a fixed number of parameters.

32

Q

What is a major disadvantage of k-NN?

A

Slow predictions, especially on large datasets.

33

Q

Why is k-NN memory-intensive?

A

It stores the entire training dataset.

34

Q

How does k-NN handle irrelevant features?

A

Poorly: irrelevant or unscaled features can mislead distance calculations.

35

Q

Is k-NN sensitive to feature scaling?

A

Yes, features with larger ranges can dominate distance calculations.
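
A common remedy, assuming scikit-learn: standardize features before computing distances, for example in a pipeline (the scaler and k below are illustrative choices):

```python
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# StandardScaler gives every feature mean 0 and unit variance, so no
# single feature dominates the distance computation.
model = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))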

36

Q

What happens to k-NN performance in high-dimensional spaces?

A

It degrades due to the curse of dimensionality.
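
A tiny demo of why: in high dimensions the nearest and farthest neighbours of a point become nearly equidistant, so "nearest" carries little information (random data, arbitrary sizes):

```python
import numpy as np

rng = np.random.default_rng(0)
for d in (2, 10, 100, 1000):
    X = rng.random((500, d))                      # 500 random points in d dimensions
    dists = np.linalg.norm(X - X[0], axis=1)[1:]  # distances from the first point
    print(d, dists.min() / dists.max())           # ratio creeps toward 1 as d grows
```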