KNN Flashcards
(36 cards)
What type of model is k-NN?
Non-parametric.
Does k-NN train model parameters like θ?
No, it stores the training data instead.
How does k-NN make predictions for classification?
By majority vote of the k closest training points.
How does k-NN make predictions for regression?
By averaging the outputs (y) of the k nearest points.
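The two prediction rules above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the function names (`k_nearest`, `knn_classify`, `knn_regress`), the toy data, and the choice of Euclidean distance are all assumptions made for the example.

```python
import math
from collections import Counter

def k_nearest(train_X, query, k):
    """Return indices of the k training points closest to query (Euclidean)."""
    dists = [math.dist(x, query) for x in train_X]
    return sorted(range(len(train_X)), key=lambda i: dists[i])[:k]

def knn_classify(train_X, train_y, query, k):
    """Classification: majority vote among the k nearest labels."""
    votes = Counter(train_y[i] for i in k_nearest(train_X, query, k))
    return votes.most_common(1)[0][0]

def knn_regress(train_X, train_y, query, k):
    """Regression: mean of the k nearest target values."""
    return sum(train_y[i] for i in k_nearest(train_X, query, k)) / k

# Toy data (hypothetical): three points near (1, 1), two near (6, 7).
X = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (7.0, 7.5)]
labels = ["a", "a", "a", "b", "b"]
targets = [1.0, 2.0, 3.0, 10.0, 12.0]

print(knn_classify(X, labels, (1.2, 1.1), k=3))  # a
print(knn_regress(X, targets, (1.2, 1.1), k=3))  # 2.0
```

Note that there is no training step: the "model" is just the stored lists `X`, `labels`, and `targets`.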
What does a small k value do in k-NN?
Makes the model highly flexible and sensitive to noise.
What happens when k is too small in k-NN?
The model overfits (low bias, high variance).
What happens when k is too large in k-NN?
The model underfits (high bias, low variance).
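A small 1-D sketch of both failure modes, assuming a hypothetical dataset in which the point at x = 2.0 carries a noisy label. The helper name `knn_classify_1d` is made up for this example.

```python
from collections import Counter

def knn_classify_1d(xs, ys, query, k):
    """Majority vote among the k training points nearest to query on a line."""
    order = sorted(range(len(xs)), key=lambda i: abs(xs[i] - query))
    votes = Counter(ys[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Class "a" lives on the left, class "b" on the right.
# The "b" at x = 2.0 is a deliberately noisy label (assumption for the demo).
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 6.0, 7.0, 8.0, 9.0]
ys = ["a", "a", "b", "a", "a", "b", "b", "b", "b"]

# k too small: the prediction copies the single noisy neighbor.
print(knn_classify_1d(xs, ys, 2.1, k=1))  # b

# Moderate k: the noisy point is outvoted by its neighbors.
print(knn_classify_1d(xs, ys, 2.1, k=3))  # a

# k equal to the dataset size: every query gets the overall majority class,
# even a query sitting in the middle of the "a" region.
print(knn_classify_1d(xs, ys, 0.0, k=len(xs)))  # b
```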
What is the most common strategy for handling ties in k-NN classification?
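Use odd values of k to reduce tie probability (this rules out ties only in binary classification; with more than two classes, ties can still occur).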
What is the Euclidean distance formula?
√(Σ(xᵢ - yᵢ)²)
What is the Manhattan distance formula?
Σ|xᵢ - yᵢ|
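Both formulas translate directly into code. A minimal sketch, assuming the two points are equal-length sequences of numbers; the function names are illustrative.

```python
import math

def euclidean(x, y):
    """Square root of the sum of squared coordinate differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def manhattan(x, y):
    """Sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(x, y))

p, q = (0.0, 0.0), (3.0, 4.0)
print(euclidean(p, q))  # 5.0
print(manhattan(p, q))  # 7.0
```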
When is cosine similarity useful in k-NN?
When direction matters more than magnitude (e.g. text data).
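A short sketch of why cosine similarity ignores magnitude, using hypothetical word-count vectors. The function name and the example vectors are assumptions; the function is undefined for all-zero vectors.

```python
import math

def cosine_similarity(x, y):
    """Cosine of the angle between x and y (both must be non-zero vectors)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

doc_short = [1, 2, 0]    # word counts in a short document
doc_long = [10, 20, 0]   # same word proportions, ten times longer
doc_other = [0, 0, 5]    # shares no words with the other two

print(cosine_similarity(doc_short, doc_long))   # approximately 1.0
print(cosine_similarity(doc_short, doc_other))  # 0.0
```

Under Euclidean distance, `doc_short` and `doc_long` would look far apart even though they have the same word proportions.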
What is one-hot encoding used for in k-NN?
To handle categorical variables.
What distance metric can be used with one-hot encoded features?
Hamming distance.
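As a rough illustration, one-hot encoding and Hamming distance can be sketched as follows. The helper names and the color categories are hypothetical.

```python
def one_hot(value, categories):
    """Encode value as a 0/1 vector with a single 1 at its category's position."""
    return [1 if value == c else 0 for c in categories]

def hamming(x, y):
    """Number of positions at which two equal-length vectors differ."""
    return sum(a != b for a, b in zip(x, y))

colors = ["red", "green", "blue"]
red = one_hot("red", colors)
blue = one_hot("blue", colors)

print(red)                 # [1, 0, 0]
print(blue)                # [0, 0, 1]
print(hamming(red, blue))  # 2
print(hamming(red, red))   # 0
```

For a single one-hot encoded variable, two different categories always differ in exactly two positions, so every pair of distinct categories is equally far apart.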
What is Jaccard similarity used for?
Comparing overlap between sets.
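A minimal sketch of Jaccard similarity as intersection size divided by union size. The function name and the example sets are assumptions, as is the convention of returning 1.0 for two empty sets.

```python
def jaccard_similarity(a, b):
    """Size of the intersection divided by size of the union."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention assumed here; the ratio is otherwise undefined
    return len(a & b) / len(a | b)

tags_1 = {"knn", "distance", "vote"}
tags_2 = {"knn", "distance", "tree"}

# Two shared items out of four distinct items overall.
print(jaccard_similarity(tags_1, tags_2))  # 0.5
```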
What is the main drawback of k-NN?
Slow prediction and high memory usage, since every query is compared against the stored training data.
What do k-NN prediction curves show in classification tasks?
How predicted probabilities change across the input space.
What does a sharp transition in a k-NN prediction curve indicate?
The model is highly sensitive to local data; likely overfitting.
What does a smooth transition in a k-NN prediction curve indicate?
The model is generalizing better; less variance.
What does a very flat k-NN prediction curve suggest?
The model is underfitting; it’s not responsive to class boundaries.
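The three curve shapes can be reproduced on a toy 1-D problem. This sketch assumes the predicted probability is the fraction of the k nearest neighbors that belong to class 1; the data, grid, and function names are made up for the example.

```python
def predicted_probability(xs, ys, query, k):
    """Fraction of the k nearest training points whose label is 1."""
    order = sorted(range(len(xs)), key=lambda i: abs(xs[i] - query))
    return sum(ys[i] for i in order[:k]) / k

def prediction_curve(xs, ys, grid, k):
    """Predicted probability of class 1 at each grid point."""
    return [predicted_probability(xs, ys, q, k) for q in grid]

# Class 0 on the left, class 1 on the right (hypothetical data).
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
grid = [0.4, 1.4, 2.4, 3.4, 4.4, 5.4, 6.4]

sharp = prediction_curve(xs, ys, grid, k=1)   # jumps straight from 0 to 1
smooth = prediction_curve(xs, ys, grid, k=3)  # passes through 1/3 and 2/3
flat = prediction_curve(xs, ys, grid, k=8)    # 0.5 everywhere

for name, curve in [("k=1", sharp), ("k=3", smooth), ("k=8", flat)]:
    print(name, [round(p, 2) for p in curve])
```

With k equal to the dataset size, every query sees the same neighbors, so the curve cannot respond to the class boundary at all.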
How does increasing k affect the shape of the k-NN decision boundary?
It makes the boundary smoother and less sensitive to individual points.
How does decreasing k affect the shape of the k-NN decision boundary?
It makes the boundary more complex and sensitive to noise.
What is the tradeoff shown in k-NN decision boundary plots?
The bias-variance tradeoff.
What kind of bias and variance do small k values typically produce?
Low bias, high variance.
What kind of bias and variance do large k values typically produce?
High bias, low variance.