K-Nearest Neighbours and Support Vector Machine Flashcards
(17 cards)
What is KNN?
A supervised learning algorithm frequently used for classification problems.
How is a new data point classified?
A new data point is assigned the class that is most common among its k nearest neighbours.
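As a minimal sketch of the majority vote described above, assuming Euclidean distance and a made-up toy dataset (the function name `knn_classify` and the data are illustrative, not from the cards):

```python
import math
from collections import Counter

def knn_classify(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (features, label) pairs; distance is Euclidean.
    """
    # Sort training points by distance to the query and keep the k closest.
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    # Count the labels among those neighbours; the most common label wins.
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Toy data (invented for illustration): two clusters in 2-D.
train = [((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.5), "A"),
         ((6.0, 6.0), "B"), ((6.5, 7.0), "B"), ((7.0, 6.5), "B")]

prediction = knn_classify(train, (2.0, 2.0), k=3)
print(prediction)
```

An odd k is often preferred for two-class problems because it avoids tied votes.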
What are weaknesses of KNN?
It is susceptible to outliers and class imbalances.
What are two methods for determining k?
Method one starts with k = 1 and evaluates performance metrics for increasing values of k until the best-performing k is found. Method two sets k to the square root of the number of records in the training dataset.
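Both methods can be sketched in a few lines. This is an illustrative example only: it assumes accuracy on a small held-out validation set as the performance metric, and the data (including one deliberately mislabelled point) is invented.

```python
import math
from collections import Counter

def knn_classify(train, query, k):
    # Majority vote among the k nearest training points (Euclidean distance).
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in neighbours).most_common(1)[0][0]

def accuracy(train, validation, k):
    # Fraction of validation points whose predicted label matches the true one.
    correct = sum(knn_classify(train, x, k) == y for x, y in validation)
    return correct / len(validation)

# Toy data: two clusters plus one mislabelled point at (2.5, 2.5).
train = [((1.0, 1.0), "A"), ((1.0, 2.0), "A"), ((2.0, 1.0), "A"), ((2.0, 2.0), "A"),
         ((6.0, 6.0), "B"), ((6.0, 7.0), "B"), ((7.0, 6.0), "B"), ((7.0, 7.0), "B"),
         ((2.5, 2.5), "B")]
validation = [((2.4, 2.4), "A"), ((1.5, 1.5), "A"), ((6.5, 6.5), "B")]

# Method one: start at k = 1, score each k, and keep the best one.
scores = {k: accuracy(train, validation, k) for k in range(1, len(train) + 1)}
best_k = max(scores, key=scores.get)  # smallest k with the highest accuracy

# Method two: square root of the number of training records.
sqrt_k = round(math.sqrt(len(train)))

print(best_k, sqrt_k)
```

In this toy case k = 1 is misled by the mislabelled point, which is why a slightly larger k scores better.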
What is weighted KNN?
A variant of KNN in which nearer neighbours have more influence on the prediction than neighbours that are further away.
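One common weighting scheme is inverse distance, where each neighbour's vote counts as 1 / distance. The sketch below assumes that scheme (other weightings exist), and the function name, the small `eps` guard, and the data are illustrative.

```python
import math
from collections import defaultdict

def weighted_knn_classify(train, query, k=3, eps=1e-9):
    """Classify `query` using inverse-distance-weighted votes from k neighbours."""
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    scores = defaultdict(float)
    for features, label in neighbours:
        # Closer points get larger weights; eps avoids division by zero.
        scores[label] += 1.0 / (math.dist(features, query) + eps)
    # The label with the largest total weight wins.
    return max(scores, key=scores.get)

# Toy data: one nearby "A" point and two more distant "B" points.
train = [((1.0, 1.0), "A"), ((3.0, 3.0), "B"), ((3.2, 3.0), "B")]
query = (1.5, 1.5)

prediction = weighted_knn_classify(train, query, k=3)
print(prediction)
```

With k = 3 an unweighted vote would pick "B" (two votes to one), while the weighted vote favours the single much closer "A" point.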
What type of learning does KNN use?
Instance-based learning (also known as lazy learning)
What are the best conditions for KNN?
Small datasets with few dimensions and features that can easily be scaled to comparable ranges.
What does SVM stand for?
Support Vector Machine
What does KNN stand for?
K-Nearest Neighbours
What is an SVM?
A supervised machine learning algorithm used mainly for classification and also for regression problems.
What is the aim of an SVM?
To find the line (a hyperplane in higher dimensions) that separates the classes with the largest possible margin.
What is the margin in an SVM?
The margin is the shortest distance between the threshold (the separating line) and the closest data points of each class, which are called the support vectors.
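A common convention measures the margin from the separating line to the nearest point on either side. The sketch below uses that convention with a hand-picked line x + y = 8 (not one learned by an SVM) and invented points, just to show how the distance is computed.

```python
import math

def distance_to_line(point, w, b):
    """Perpendicular distance from `point` to the line w . x + b = 0."""
    return abs(sum(wi * xi for wi, xi in zip(w, point)) + b) / math.hypot(*w)

# Hand-picked separating line x + y - 8 = 0 (an assumption for illustration).
w, b = (1.0, 1.0), -8.0

points = [((1.0, 2.0), "A"), ((3.0, 3.0), "A"),
          ((5.0, 5.0), "B"), ((7.0, 6.0), "B")]

distances = [distance_to_line(p, w, b) for p, _ in points]
# The margin is set by the points closest to the line (the support vectors).
margin = min(distances)
print(margin)
```

Here (3, 3) and (5, 5) are the closest points on each side, so they determine the margin.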
How does an SVM handle outliers?
By allowing some misclassifications, which makes the threshold less sensitive to outliers.
What is a soft margin?
The distance between an observation and the threshold when misclassification is allowed.
How is non-linearly separable data handled?
A transformation maps the data from its original feature space to a higher-dimensional feature space in which the classes can be linearly separated.
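As an illustration of such a mapping, the sketch below takes 1-D points where class "A" sits between two groups of class "B", so no single cut-off on x separates them. Mapping each x to (x, x squared) makes a straight horizontal line sufficient. The data, the function names, and the threshold of 4.0 are all hand-picked assumptions, not learned values.

```python
def feature_map(x):
    # Map a 1-D value into 2-D by adding its square as a second feature.
    return (x, x * x)

# Toy 1-D data: "A" in the middle, "B" on both sides.
points = [(-3.0, "B"), (-2.5, "B"), (-1.0, "A"), (0.0, "A"),
          (1.0, "A"), (2.5, "B"), (3.0, "B")]

def classify_in_mapped_space(x, threshold=4.0):
    # In the mapped space, the horizontal line x^2 = threshold separates the classes.
    _, x_squared = feature_map(x)
    return "B" if x_squared > threshold else "A"

predictions = [classify_in_mapped_space(x) for x, _ in points]
labels = [label for _, label in points]
print(predictions == labels)
```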
What is the kernel trick?
A technique used by the SVM to calculate high-dimensional relationships without actually transforming the data. This reduces the computational cost.
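The idea can be checked numerically with a degree-2 polynomial kernel, chosen here purely as an example. For 2-D inputs, the kernel value (a . b)^2 equals the dot product of the explicitly transformed points, so the 3-D transformation never has to be computed. The names `phi` and `poly_kernel` are illustrative.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def phi(v):
    # Explicit map from 2-D to 3-D that corresponds to the kernel (a . b)^2.
    x1, x2 = v
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

def poly_kernel(a, b):
    # Works directly in the original 2-D space; no transformation needed.
    return dot(a, b) ** 2

a, b = (1.0, 2.0), (3.0, 4.0)

explicit = dot(phi(a), phi(b))    # transform first, then take the dot product
via_kernel = poly_kernel(a, b)    # kernel shortcut

print(explicit, via_kernel)
```

Both routes give the same relationship between the two points (up to floating-point rounding), but the kernel route only needs a 2-D dot product.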