Support Vector Machines (SVMs) Flashcards
(13 cards)
What are support vectors?
The training examples that lie exactly on the margin, i.e. the examples closest to the decision boundary; they alone determine where the boundary sits.
What is a consequence of having a small margin γ?
The decision boundary passes close to some training examples, so small amounts of noise in new inputs can push them across the boundary and lead to misclassification.
Why do we maximise the margin?
To avoid overfitting: a larger margin makes the classifier less sensitive to noise in the training data and tends to generalise better to unseen examples.
What is the margin?
The margin, γ, is the perpendicular distance between the decision boundary and the closest training example.
What do we assign x0 in SVMs?
We don't assign a value to x0 in SVMs: the bias is kept as a separate parameter rather than absorbed into w via a constant feature x0 = 1, so it does not contribute to ∥w∥ when the margin is maximised.
What is the formula for the margin?
dist(h, x(n)) = |h(x(n))| / ∥w∥
where ∥w∥ = √(wᵀw) is the Euclidean norm; the margin is then γ = min_n dist(h, x(n))
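A minimal numeric sketch of this distance formula (the weight vector w, bias b and data X below are made-up illustrative values, not from the cards):

```python
import numpy as np

# Hypothetical hyperplane h(x) = w.T x + b (values are assumptions)
w = np.array([2.0, 1.0])
b = -1.0
X = np.array([[1.0, 2.0],
              [3.0, -1.0],
              [0.5, 0.5]])

h = X @ w + b                          # signed scores h(x(n))
dist = np.abs(h) / np.linalg.norm(w)   # dist(h, x(n)) = |h(x(n))| / ∥w∥
print(dist)
print(dist.min())                      # the margin γ: distance to the closest example
```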
What does w0 represent in SVMs?
The bias term (elsewhere written b): the offset that shifts the hyperplane away from the origin, so h(x) = wᵀx + w0.
What is the constraint when calculating the margin?
y(n) h(x(n)) > 0, ∀(x(n), y(n)) ∈ 𝒯
(All training examples have to be correctly classified)
What is the effect of scaling w and b on the hyperplane?
None: multiplying both w and b by the same positive constant leaves the position of the hyperplane (and every classification) unchanged, since h(x) = 0 defines the same set of points.
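A quick sketch demonstrating this invariance (w, b and the random points are illustrative assumptions):

```python
import numpy as np

# Scaling (w, b) by any c > 0 does not move the hyperplane:
# sign(c * (w.T x + b)) = sign(w.T x + b) for every x.
w, b = np.array([2.0, 1.0]), -1.0
X = np.random.default_rng(0).normal(size=(5, 2))

for c in (0.5, 1.0, 10.0):
    print(c, np.sign(X @ (c * w) + c * b))   # identical labels for every c
```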
What changes can we make to the constraint for SVMs?
By dividing both w and b by min_n y(n) h(x(n)) (which is positive when every example is correctly classified), we can rescale the hyperplane so the constraint holds with equality:
min_n y(n) h(x(n)) = 1
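A numeric sketch of this rescaling step (X, y, w and b are made-up values chosen so that every point is correctly classified):

```python
import numpy as np

# Divide w and b by m = min_n y(n) h(x(n)) so the closest example
# satisfies y h(x) = 1 exactly.
w, b = np.array([2.0, 1.0]), -1.0
X = np.array([[1.0, 2.0], [3.0, -1.0], [-1.0, -1.0]])
y = np.sign(X @ w + b)             # labels chosen so every point is correct

m = np.min(y * (X @ w + b))        # min_n y(n) h(x(n)) > 0
w_hat, b_hat = w / m, b / m        # canonical rescaling
print(np.min(y * (X @ w_hat + b_hat)))  # prints 1.0
```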
What's the equation for maximising the margin?
(for the constraint min_n y(n) h(x(n)) = 1)
argmax_{w,b} 1/∥w∥
which is equivalent to:
argmin_{w,b} ∥w∥
How can we relax the constraint so that the optimal solution satisfies it with equality for at least one training example?
y(n) h(x(n)) ≥ 1, ∀(x(n), y(n)) ∈ 𝒯
This is equivalent to the previous constraint: the smallest ∥w∥ is reached when the constraint is tight (equal to 1) for at least one example, since otherwise w and b could be scaled down further.
What's the equation for maximising the margin?
(for the constraint y(n) h(x(n)) ≥ 1, ∀(x(n), y(n)) ∈ 𝒯)
argmin_{w,b} (1/2) ∥w∥²
(squaring ∥w∥ and adding the factor 1/2 does not change the minimiser, but gives a convenient differentiable quadratic objective)
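A minimal sketch of solving this optimisation with the cvxpy library (an assumption on my part: any quadratic-programming solver would do, and the toy data below is made up and linearly separable):

```python
import cvxpy as cp
import numpy as np

# Hard-margin SVM: argmin_{w,b} 1/2 ∥w∥² s.t. y(n)(w.T x(n) + b) >= 1.
X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -1.0], [-3.0, -2.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

w = cp.Variable(2)
b = cp.Variable()
objective = cp.Minimize(0.5 * cp.sum_squares(w))
constraints = [cp.multiply(y, X @ w + b) >= 1]
cp.Problem(objective, constraints).solve()

print(w.value, b.value)
print(y * (X @ w.value + b.value))  # entries numerically equal to 1 ...
```

At the optimum, the examples where y(n) h(x(n)) comes out numerically equal to 1 are exactly the support vectors from the first card.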