Sequential Minimal Optimisation (SMO) Flashcards
(11 cards)
Why might we not use gradient descent or Newton-Raphson methods?
Because their update steps can violate the constraints (0 <= a(n) <= C and Σ a(n)y(n) = 0), and they can be too costly on large problems.
What is sequential minimal optimisation (SMO)?
Sequential Minimal Optimisation breaks the large optimisation problem into the smallest possible sub-problems (two Lagrange multipliers at a time), each of which can be solved analytically.
What do we use to decide which sub-problem to solve next in SMO?
We use heuristics to decide which pair of multipliers to optimise next.
How do we update values without breaking constraints?
Updating a single value would break the constraint Σ a(n)y(n) = 0, so we update two values at a time, with one acting as a counterweight for the other.
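For example, if a(j) changes by δ, the equality constraint forces a(i) to change by -y(i)y(j)δ, because y(i)Δa(i) + y(j)Δa(j) must equal 0.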
How do we run SMO?
Initialise a to any value that satisfies the constraints, e.g. a(n) = 0 ∀n ∈ {1,...,N}; then, for a maximum number of iterations, pick a pair and optimise L(a) w.r.t. a(i) and a(j).
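As a bridge between this card and the ones that follow, here is a minimal runnable sketch in the style of the "simplified SMO" variant: i is scanned in order and j is chosen at random, rather than by the heuristics of the last two cards. The function names and the bias update are illustrative assumptions, not part of the cards.

```python
import numpy as np

def simplified_smo(X, y, C, k, max_iters=100, tol=1e-3):
    N = len(y)
    a = np.zeros(N)     # a(n) = 0 for all n satisfies both constraints
    b = 0.0

    def h(n):           # current decision value h(x(n))
        return sum(a[m] * y[m] * k(X[m], X[n]) for m in range(N)) + b

    for _ in range(max_iters):
        changed = 0
        for i in range(N):
            E_i = h(i) - y[i]
            # only optimise pairs whose a(i) violates the KKT conditions (within tol)
            if not ((y[i] * E_i < -tol and a[i] < C) or (y[i] * E_i > tol and a[i] > 0)):
                continue
            j = int(np.random.choice([m for m in range(N) if m != i]))
            E_j = h(j) - y[j]
            if y[i] == y[j]:                      # bounds on a(j); see the L/H card
                L, H = max(0.0, a[i] + a[j] - C), min(C, a[i] + a[j])
            else:
                L, H = max(0.0, a[j] - a[i]), min(C, C + a[j] - a[i])
            eta = k(X[i], X[i]) + k(X[j], X[j]) - 2.0 * k(X[i], X[j])
            if L == H or eta <= 0:
                continue
            ai_old, aj_old = a[i], a[j]
            aj_new = float(np.clip(aj_old + y[j] * (E_i - E_j) / eta, L, H))
            if abs(aj_new - aj_old) < 1e-5:       # step too small to count
                continue
            a[j] = aj_new
            a[i] = ai_old + y[i] * y[j] * (aj_old - aj_new)   # counterweight update
            # keep h consistent: standard bias update (not covered by the cards)
            b1 = b - E_i - y[i] * (a[i] - ai_old) * k(X[i], X[i]) \
                         - y[j] * (a[j] - aj_old) * k(X[i], X[j])
            b2 = b - E_j - y[i] * (a[i] - ai_old) * k(X[i], X[j]) \
                         - y[j] * (a[j] - aj_old) * k(X[j], X[j])
            b = b1 if 0 < a[i] < C else (b2 if 0 < a[j] < C else (b1 + b2) / 2.0)
            changed += 1
        if changed == 0:   # no KKT violations found in a full pass: done
            break
    return a, b
```

With a linear kernel, k could simply be lambda u, v: float(np.dot(u, v)).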
What are the lower and upper bounds L and H for a(j) in SMO?
If y(i) = y(j):
L = max(0, a(i) + a(j) - C)
H = min(C, a(i) + a(j))
If y(i) != y(j):
L = max(0, a(j) - a(i))
H = min(C, C + a(j) - a(i))
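A direct transcription of the two cases into a helper function (the name and argument names are assumptions):

```python
def bounds(a_i, a_j, y_i, y_j, C):
    # The box constraints 0 <= a(n) <= C plus the equality constraint confine
    # the pair to a line segment; [L, H] is that segment projected onto a(j).
    if y_i == y_j:
        return max(0.0, a_i + a_j - C), min(C, a_i + a_j)
    return max(0.0, a_j - a_i), min(C, C + a_j - a_i)
```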
What is the formula for a(j,new)?
a(j,new) = a(j) + (y(j)(E(i) - E(j))) / (k(x(i),x(i)) + k(x(j),x(j)) - 2k(x(i),x(j)))
where E(i) = h(x(i)) - y(i) is the prediction error on example i
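The same update as a hypothetical helper, where K_ii, K_jj, K_ij stand for the three kernel evaluations:

```python
def a_j_unclipped(a_j, y_j, E_i, E_j, K_ii, K_jj, K_ij):
    # eta is the curvature of L(a) along the constraint line; it is
    # non-negative whenever the kernel is positive semi-definite.
    eta = K_ii + K_jj - 2.0 * K_ij
    if eta <= 0:
        return None   # degenerate pair; full SMO evaluates the objective at L and H instead
    return a_j + y_j * (E_i - E_j) / eta
```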
How do we calculate a(j,new,clipped)?
a(j,new,clipped) =
- H if a(j,new) >= H
- L if a(j,new) <= L
- a(j,new) otherwise
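In code the three cases collapse to a single projection onto [L, H] (sketch; the name is an assumption):

```python
def clip(a_j_new, L, H):
    # Project the unconstrained optimum back onto the feasible segment [L, H].
    return max(L, min(H, a_j_new))
```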
What is the formula for a(i,new)?
a(i,new) = (ζ - a(j,new,clipped)y(j)) / y(i)
where ζ = a(i)y(i) + a(j)y(j)
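A sketch of the same step (names assumed); since y(i) ∈ {-1, +1}, dividing by y(i) is the same as multiplying by it:

```python
def a_i_new(a_i, a_j, a_j_clipped, y_i, y_j):
    zeta = a_i * y_i + a_j * y_j              # conserved by the pairwise update
    return (zeta - a_j_clipped * y_j) * y_i   # equals division by y_i, as y_i**2 == 1
```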
How do we pick the Lagrange multiplier a(i) to update in SMO?
a(i) is picked by (e.g. randomly) selecting a training example that violates the KKT conditions within a margin of error e around y(n)h(x(n)), or, more selectively, one that violates them given the margin e and also satisfies 0 < a(i) < C (a non-bound example).
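A sketch of the KKT-violation test such a picker relies on (eps plays the role of the margin e; the name and arguments are assumptions):

```python
def violates_kkt(a_n, y_n, h_n, C, eps=1e-3):
    # KKT conditions for the soft-margin dual, up to tolerance eps:
    #   a(n) = 0      =>  y(n)h(x(n)) >= 1
    #   0 < a(n) < C  =>  y(n)h(x(n)) == 1
    #   a(n) = C      =>  y(n)h(x(n)) <= 1
    r = y_n * h_n
    return (r < 1.0 - eps and a_n < C) or (r > 1.0 + eps and a_n > 0.0)
```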
How do we pick the Lagrange multiplier a(j) to update in SMO?
a(j) is picked to obtain the largest change in a(j), approximated by maximising |E(i) - E(j)|. If this does not lead to a positive improvement in the objective, we pick a different a(j), and eventually a different a(i).
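A sketch of this second-choice heuristic, assuming the errors E(n) are cached in a NumPy array:

```python
import numpy as np

def pick_j(i, E):
    # Maximise |E(i) - E(j)| as a cheap proxy for the largest step in a(j).
    diffs = np.abs(E - E[i])
    diffs[i] = -np.inf       # never pick j == i
    return int(np.argmax(diffs))
```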