L3 Flashcards

(36 cards)

1
Q

What type of learning does KNN fall under?

A

Supervised Learning

  • the learning algorithm is provided with input-output pairs

KNN uses input-output pairs for classification and regression tasks.

2
Q

What are the two main tasks of Supervised Learning?

A
  • Classification: Output discrete, e.g. class labels.
  • Regression: Output continuous, e.g. predicting a value like salary or temperature.
3
Q

What does KNN stand for?

A

K-Nearest Neighbors

4
Q

What is a defining characteristic of KNN as a learning algorithm?

A

Non-parametric, instance-based, lazy learning algorithm.

5
Q

What does ‘non-parametric’ mean in the context of KNN?

A

Makes no assumptions about the form of the mapping function.

6
Q

What does ‘lazy learning’ imply in KNN?

A

No explicit training phase; algorithm stores the entire dataset and computes output only when a query is made.

7
Q

How does KNN classify or predict the output for a new data point?

A
  1. Compute the distance between the new point and all points in the training dataset.
  2. Identify the k nearest training examples.
  3. Use these neighbors to determine the output:
    - Classification: take the majority vote.
    - Regression: take the mean (or sometimes median) of the target values.
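
A minimal sketch of this procedure (assumes NumPy; the function name knn_predict is illustrative, not from the lecture):

import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_query, k=5, task="classification"):
    # 1. Distance from the query point to every training point (Euclidean).
    dists = np.linalg.norm(X_train - x_query, axis=1)
    # 2. Indices of the k nearest training examples.
    nearest = np.argsort(dists)[:k]
    # 3. Majority vote (classification) or mean of the targets (regression).
    if task == "classification":
        return Counter(y_train[nearest]).most_common(1)[0][0]
    return y_train[nearest].mean()
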
8
Q

What is the method used by KNN for classification?

A

Take the majority vote of the k nearest training examples.

9
Q

For regression in KNN, what is the method used to determine the output?

A

Take the mean (or sometimes median) of the target values.

10
Q

What is a decision boundary in KNN?

A

A line or curve that separates different classes in the feature space.

  • Linear: Formed in simple models (e.g., logistic regression).
  • Non-linear: Formed in KNN with low k due to sensitivity to local data patterns.
11
Q

What happens to the decision boundary in KNN with low k?

A

Forms a complex boundary, leading to overfitting.

  • KNN forms complex and non-linear boundaries depending on the value of k and data distribution.
12
Q

What happens to the decision boundary in KNN with high k?

A

Forms a smooth boundary, leading to underfitting.

  • KNN forms complex and non-linear boundaries depending on the value of k and data distribution.
13
Q

What is the difference between classification and regression in KNN?

A
  • Classification: Assign class label; model trained from the data defines a decision boundary that separates the data; discrete labels (e.g. cat / dog); based on majority vote of neighbours
  • Regression: Predict numeric target; model fits the data to describe the relation between 2 features or between a feature (e.g., height) and the label (e.g., yes/no); based on average of neighbour values
14
Q

What is the hyperparameter k in KNN?

A

Represents the number of labeled neighbors to consider.

15
Q

What is the risk associated with a small value of k (e.g., k=1)?

A

High variance, very flexible, risk of overfitting.

16
Q

What is the risk associated with a large value of k (e.g., k=N)?

A

Low variance, oversmooth, risk of underfitting.

**k = N: since all datapoints are considered, the predicted label for a test point will always be the majority label of all datapoints. Equivalent to a majority classifier.

17
Q

What is the effect of ties in KNN classification?

A

Random selection from the tied labels is common.

**Ties: in case of a tie between predicted labels, there are different possibilities. The most common one is random selection from the tied labels.

18
Q

What is the recommended value for k in relation to the number of training samples?

A

Generally, k = √n (n = number of training samples).

**Use odd values for binary classification to avoid ties.
**Use cross-validation on a validation set to choose optimal k.

19
Q

What distinguishes weighted KNN from standard KNN?

A

In standard KNN, each neighbor contributes equally; in weighted KNN, neighbors closer to the test point contribute more to the decision.

Effect: Improves performance, especially when data is dense and neighbors vary in quality.
**With distance weighting, k = N is no longer equivalent to a majority-based classifier.

20
Q

What are the types of weighting used in weighted KNN?

A
  • Inverse distance - each point has a weight equal to the inverse of its distance to the point to be classified (closer points get a higher vote)
  • Inverse squared distance
  • Kernel functions (e.g., Gaussian kernel) - see the sketch below
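
A minimal sketch of these weighting schemes and a weighted vote (assumes NumPy; the helper names are illustrative, not from the lecture):

import numpy as np

def inverse_distance(d, eps=1e-12):
    # Closer neighbors get larger weights; eps avoids division by zero.
    return 1.0 / (d + eps)

def inverse_squared_distance(d, eps=1e-12):
    return 1.0 / (d**2 + eps)

def gaussian_kernel(d, sigma=1.0):
    return np.exp(-d**2 / (2 * sigma**2))

def weighted_vote(neighbor_labels, neighbor_dists, weight_fn=inverse_distance):
    # Each neighbor's label contributes its weight instead of a single vote.
    weights = weight_fn(np.asarray(neighbor_dists, dtype=float))
    labels = np.asarray(neighbor_labels)
    return max(set(labels.tolist()), key=lambda c: weights[labels == c].sum())

print(weighted_vote(["cat", "dog", "dog"], [0.1, 0.9, 1.0]))  # "cat" wins despite being outvoted
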
21
Q

What does the distance function determine in KNN?

A

How ‘closeness’ is measured between points.

22
Q

What are the two distance types mentioned in KNN?

A
  • Euclidean - straight-line distance (use for continuous numeric data): d = sqrt((x2 − x1)^2 + (y2 − y1)^2)
  • Manhattan - sum of the distances between projections on the axes: d = |x1 − x2| + |y1 − y2|

Effect:
Different distance metrics can change which points are considered neighbors.
Affects classification and regression outcomes.
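
A minimal sketch of the two metrics (assumes NumPy; the function names are illustrative):

import numpy as np

def euclidean(p, q):
    # Straight-line distance: square root of the sum of squared differences.
    return np.sqrt(np.sum((np.asarray(p) - np.asarray(q)) ** 2))

def manhattan(p, q):
    # Sum of absolute coordinate differences (distance along the axes).
    return np.sum(np.abs(np.asarray(p) - np.asarray(q)))

print(euclidean([0, 0], [3, 4]))   # 5.0
print(manhattan([0, 0], [3, 4]))   # 7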

23
Q

What is the Nearest Centroid Classifier?

A

For each class, compute the centroid (mean vector of all feature values) and classify new instances into the class whose centroid is closest (using Euclidean distance).
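
A minimal sketch using scikit-learn's NearestCentroid (assumes scikit-learn is installed; the iris dataset is just for illustration):

from sklearn.datasets import load_iris
from sklearn.neighbors import NearestCentroid

X, y = load_iris(return_X_y=True)
clf = NearestCentroid()          # one mean vector (centroid) per class
clf.fit(X, y)
print(clf.predict(X[:5]))        # each point gets the class of the closest centroid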

24
Q

What is the purpose of the Nearest Shrunken Centroid?

A

To reduce the effect of irrelevant or noisy features.

  • Each class centroid is shrunk toward the overall mean by a threshold.
  • Common in high-dimensional settings, e.g., gene expression, text data.
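
A minimal sketch (assumes scikit-learn): NearestCentroid exposes a shrink_threshold parameter that shrinks each class centroid toward the overall mean, reducing the influence of weakly informative features; the threshold value here is arbitrary.

from sklearn.datasets import load_iris
from sklearn.neighbors import NearestCentroid

X, y = load_iris(return_X_y=True)
clf = NearestCentroid(shrink_threshold=0.2)   # larger thresholds shrink more features away
clf.fit(X, y)
print(clf.score(X, y))
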
25
Q

What is the effect of the Curse of Dimensionality on KNN?

A

As the number of features increases, the distance between points becomes less meaningful: all points appear to be at a similar distance, and KNN performance deteriorates.

Example (exponential growth of the feature space):
  • 1 feature: 10 values → 10 combinations
  • 2 features: 10 × 10 = 100 combinations
  • 3 features: 1,000 combinations
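
A minimal sketch (assumes NumPy) of this effect: as dimensionality grows, the nearest and farthest neighbors of a query become nearly indistinguishable.

import numpy as np

rng = np.random.default_rng(0)

for d in (1, 10, 100, 1000):
    X = rng.random((500, d))          # 500 random points in the unit hypercube
    q = rng.random(d)                 # a random query point
    dists = np.linalg.norm(X - q, axis=1)
    contrast = (dists.max() - dists.min()) / dists.min()
    print(f"d={d:5d}  relative contrast={contrast:.3f}")   # shrinks as d grows
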
26
Q

What is KNN's approach for missing value imputation?

A

Identify the k nearest rows and fill the missing value using their mean or mode.

✔ Particularly effective when the feature with missing values is strongly correlated with others.
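
A minimal sketch using scikit-learn's KNNImputer (assumes scikit-learn and NumPy; the data values are made up for illustration):

import numpy as np
from sklearn.impute import KNNImputer

X = np.array([
    [1.0, 2.0, np.nan],
    [3.0, 4.0, 3.0],
    [np.nan, 6.0, 5.0],
    [8.0, 8.0, 7.0],
])

# Each missing entry is filled with the mean of that feature over the
# k nearest rows (distance computed on the observed features).
imputer = KNNImputer(n_neighbors=2)
print(imputer.fit_transform(X))
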
27
Q

What is the effect of low k on model complexity?

A

High complexity, fits noise; low bias and high variance.
28
Q

What is the effect of high k on model complexity?

A

Low complexity, smooth model; high bias and low variance.
29
Q

What are significant advantages of KNN?

A
  1. No training phase; just stores the data.
  2. Adaptable to all types -> handles both classification and regression.
  3. Works with non-linear data -> captures complex patterns using local neighborhoods.
  4. Easy to implement -> the algorithm is straightforward.
  5. Flexible distance metric -> custom distance functions for different data types.
30
Q

What are significant disadvantages of KNN?

A
  1. High computation during inference; must compute the distance to every training sample.
  2. No interpretability -> no clear model or decision rule.
  3. Sensitive to noisy data -> especially when k is small.
  4. Poor in high dimensions -> suffers from the curse of dimensionality.
  5. Requires feature scaling -> features must be on the same scale (e.g., standardization or normalization).
31
Q

What does KNN require for feature scaling?

A

Features must be on the same scale (e.g., standardization or normalization).
32
Q

How should k be tuned for model complexity and generalization?

A
  • Use cross-validation to select a k that generalizes well.
  • Don't evaluate performance on the test set while tuning → data leakage.
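
A minimal sketch (assumes scikit-learn) of choosing k by cross-validation on the training data only, with the test set held out until the very end; the dataset and grid are just for illustration:

from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

search = GridSearchCV(
    KNeighborsClassifier(),
    param_grid={"n_neighbors": list(range(1, 31, 2))},  # odd values help avoid ties
    cv=5,
)
search.fit(X_train, y_train)                 # tuning only touches the training folds

print("best k:", search.best_params_["n_neighbors"])
print("cross-validated accuracy:", round(search.best_score_, 3))
print("test accuracy (checked once, after tuning):", round(search.score(X_test, y_test), 3))
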
33
Q

What does overfitting mean in terms of k?

A

Very small k; high sensitivity to noise; poor generalization.
34
Q

What does underfitting mean in terms of k?

A

Very large k; too smooth a boundary; misses important patterns.
35
Q

What is a Nearest-neighbour classifier?

A

Given a set of labeled instances (training set), new instances (test set) are classified according to their nearest labeled neighbour.
36
Q

What is KNN regression?

A

Instead of voting → predict a continuous value by averaging the target values of the k nearest neighbors.

**k-NN classification combines the discrete predictions of the k neighbours; k-NN regression combines their continuous predictions; k-NN regression fits the best line between the neighbors.
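
A minimal sketch (assumes scikit-learn and NumPy) of k-NN regression; the toy data is made up for illustration:

import numpy as np
from sklearn.neighbors import KNeighborsRegressor

X_train = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
y_train = np.array([1.2, 1.9, 3.1, 3.9, 5.2])

# The prediction is the mean target value of the k nearest training points.
reg = KNeighborsRegressor(n_neighbors=2)
reg.fit(X_train, y_train)
print(reg.predict([[2.5]]))   # ≈ (1.9 + 3.1) / 2 = 2.5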