Soft Margin SVMs Flashcards
(16 cards)
What causes overfitting?
Overfitting happens when we fit the noise in the training data, which worsens generalisation. It can also be caused by high-dimensional embeddings.
Why would we misclassify data?
To reduce overfitting: allowing some misclassifications, rather than insisting on the maximum hard margin, can improve generalisation
What are slack variables ξ ?
Slack variables measure how far an example is allowed to be inside the margin, or on the wrong side of the decision boundary
What does the value of the slack variable ξ tell us?
- ξ = 0: on or beyond the margin (correctly classified)
- 0 < ξ < 1: within the margin, but still correctly classified
- ξ = 1: on the decision boundary
- ξ > 1: misclassified
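At the optimum the slack equals the hinge loss, ξ(n) = max(0, 1 - y(n)h(x(n))); a minimal NumPy sketch (the scores h below are made up for illustration):

```python
import numpy as np

def slack(y, h):
    """Slack at the optimum: xi = max(0, 1 - y * h(x))."""
    return np.maximum(0.0, 1.0 - y * h)

y = np.array([+1, +1, +1, -1])        # labels in {-1, +1}
h = np.array([1.5, 0.4, 0.0, 0.8])    # hypothetical decision values h(x)
print(slack(y, h))
# [0.  0.6 1.  1.8] -> beyond margin, within margin, on boundary, misclassified
```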
What is the constraint when maximising the margin using slack variables?
y(n)h(x(n)) >= 1 - ξ(n)
What does the hyperparameter C represent in the new margin optimisation problem?
C controls the trade-off between the total amount of slack and the size of the margin
What is the new margin optimisation problem with slack introduced? What are the constraints?
argmin over w, b, ξ: {1/2 ∥w∥^2 + C Σ ξ(n)}
subject to y(n)h(x(n)) >= 1 - ξ(n) and ξ(n) >= 0 for all n
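One way to see what this objective does: substituting the smallest feasible slack turns it into the unconstrained hinge loss problem min 1/2 ∥w∥^2 + C Σ max(0, 1 - y(n)(wTx(n) + b)), which a short subgradient descent sketch can optimise (working directly in input space and the function name are assumptions; real solvers work on the dual):

```python
import numpy as np

def train_soft_margin(X, y, C=1.0, lr=1e-3, epochs=1000):
    """Subgradient descent on 1/2 ||w||^2 + C * sum(max(0, 1 - y*(Xw + b)))."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        active = margins < 1                 # points with non-zero slack
        grad_w = w - C * (y[active, None] * X[active]).sum(axis=0)
        grad_b = -C * y[active].sum()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b
```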
How does changing the value of C change the optimisation problem?
A smaller C allows more slack, so more misclassification is tolerated, which typically improves generalisation; a larger C penalises slack more heavily, approaching the hard margin SVM
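This is easy to see empirically with scikit-learn's SVC, whose C parameter is exactly this trade-off (the blob dataset is synthetic, made up for illustration):

```python
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=100, centers=2, cluster_std=2.0, random_state=0)

soft = SVC(kernel="linear", C=0.01).fit(X, y)    # small C: more slack, wider margin
hard = SVC(kernel="linear", C=1000.0).fit(X, y)  # large C: approaches hard margin

print(len(soft.support_), len(hard.support_))    # small C typically keeps more support vectors
```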
What is a soft margin?
When we allow slack ξ(n) > 0, so some examples may lie within the margin or be misclassified
What is the constraint after using Lagrange relaxation for soft SVMs?
1 - ξ(n) - y(n)h(x(n)) <= 0
(where h(x(n)) = wTϕ(x(n)) + b)
-ξ(n) <= 0
What is the dual formulation for soft SVMs?
max a,β min w,b,ξ {1/2 ∥w∥^2 + C Σ ξ(n) + Σ a(n)(1 − ξ(n) − y(n)(wTϕ(x(n)) + b)) − Σ β(n)ξ(n)}
subject to a(n) >= 0, β(n) >= 0
What is the dual formulation for soft SVMs after replacing w and b? (i.e at the optimum when a and β are fixed)
argmax over a: L(a) = Σ a(n) - 1/2 Σ Σ a(n)a(m) y(n)y(m) k(x(n), x(m))
where 0 <= a(n) <= C, Σ a(n)y(n) = 0
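A didactic sketch of solving this dual with SciPy's generic SLSQP solver, assuming a precomputed kernel matrix K (practical implementations use specialised solvers such as SMO instead):

```python
import numpy as np
from scipy.optimize import minimize

def solve_dual(K, y, C):
    """Maximise L(a) = sum(a) - 1/2 a^T Q a with Q[n,m] = y(n)y(m)k(x(n), x(m)),
    subject to 0 <= a(n) <= C and sum(a(n)y(n)) = 0."""
    n = len(y)
    Q = np.outer(y, y) * K
    objective = lambda a: 0.5 * a @ Q @ a - a.sum()   # negated, since we minimise
    constraint = {"type": "eq", "fun": lambda a: a @ y}
    result = minimize(objective, np.zeros(n), method="SLSQP",
                      bounds=[(0.0, C)] * n, constraints=[constraint])
    return result.x
```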
How do we replace w and b in soft margin SVMs? How do we remove β?
Replacing w and b works exactly as in hard margin SVMs; setting the derivatives with respect to b and w to zero gives:
Σ a(n)y(n) = 0 and w = Σ a(n)y(n)ϕ(x(n))
To remove β, set the derivative with respect to ξ(n) to zero:
C − a(n) − β(n) = 0, so β(n) = C − a(n) (and since β(n) >= 0, this implies a(n) <= C)
How do we know if an example is a support vector when slack is added?
With slack variables, an example with a(n) > 0 satisfies y(n)h(x(n)) = 1 - ξ(n), so it is on the margin, within the margin, or misclassified.
If ξ(n) = 0, then it is exactly on the margin
What do we do when there are no support vectors on the margin for calculating b?
We can calculate b from any support vector, but the condition y(n)h(x(n)) = 1 - ξ(n) now depends on ξ(n), so the slack has to be taken into account
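A NumPy sketch of both cases, assuming the dual variables a and a precomputed kernel matrix K (the fallback averaging is a common heuristic rather than an exact formula, because ξ(n) is unknown):

```python
import numpy as np

def compute_b(a, y, K, C, tol=1e-8):
    """b from support vectors: y(n)h(x(n)) = 1 - xi(n), where
    h(x(n)) = sum_m a(m) y(m) k(x(m), x(n)) + b."""
    decision = K @ (a * y)                  # h(x(n)) - b, for each n
    on_margin = (a > tol) & (a < C - tol)   # xi(n) = 0, so b = y(n) - decision(n)
    if np.any(on_margin):
        return np.mean(y[on_margin] - decision[on_margin])
    # No support vectors exactly on the margin: for the remaining ones
    # b = y(n) - y(n)*xi(n) - decision(n) depends on xi(n); averaging
    # over all support vectors is a common heuristic.
    sv = a > tol
    return np.mean(y[sv] - decision[sv])
```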
What does a(n) say about the position of the training point for soft margin SVMs?
Using the KKT conditions, we get:
- 0 < a(n) < C iff y(n)h(x(n)) = 1: the point is exactly on the margin
- a(n) = C iff y(n)h(x(n)) <= 1: the point is on or violating the margin
- a(n) = 0 iff y(n)h(x(n)) >= 1: it is not a support vector
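These conditions translate directly into a check on the learned dual variables; a small sketch (the tolerance is needed because numerical solvers rarely return exact zeros):

```python
import numpy as np

def categorise(a, C, tol=1e-8):
    """Label each training point by its dual variable a(n), per the KKT conditions."""
    labels = np.empty(len(a), dtype=object)
    labels[a <= tol] = "not a support vector"              # y(n)h(x(n)) >= 1
    labels[(a > tol) & (a < C - tol)] = "on the margin"    # y(n)h(x(n)) == 1
    labels[a >= C - tol] = "on or violating the margin"    # y(n)h(x(n)) <= 1
    return labels
```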