Kernels Flashcards
(8 cards)
Can we use kernels as similarity functions?
Yes. By using a kernel as a similarity function, we compare a new, unseen example to each support vector; support vectors that are similar to the new example are given higher importance in the output.
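A minimal sketch of this idea, with made-up support vectors and coefficients (the Gaussian kernel used here as the similarity function is defined on a later card):

```python
import math

def gaussian_kernel(x, z, sigma=1.0):
    """Similarity between x and z: near 1 for close points, near 0 for far ones."""
    sq_dist = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-sq_dist / (2 * sigma ** 2))

def decision(x, support_vectors, coeffs, bias=0.0, sigma=1.0):
    # Each support vector votes with weight coeff * similarity(x, sv),
    # so similar support vectors contribute more to the output.
    return bias + sum(c * gaussian_kernel(x, sv, sigma)
                      for c, sv in zip(coeffs, support_vectors))

svs = [(0.0, 0.0), (3.0, 3.0)]   # two toy support vectors
coeffs = [1.0, -1.0]             # one from each class
print(decision((0.1, 0.1), svs, coeffs))  # positive: dominated by the nearby SV
print(decision((2.9, 2.9), svs, coeffs))  # negative: dominated by the other SV
```

The point is only that the output near each support vector takes the sign of that support vector's coefficient, because the similarity to it dwarfs the similarity to the distant one.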
What does the similarity metric need to be?
It needs to be symmetric and correspond to an inner product in some feature embedding.
What are the Mercer conditions?
Consider any finite set of M points x(1), …, x(M), which do not have to be in the training set. The Gram matrix is the M×M similarity matrix K whose elements are Kij = k(x(i), x(j)).
The Mercer conditions state that K must be:
- Symmetric: k(x(i), x(j)) = k(x(j), x(i))
- Positive semidefinite: zᵀKz ≥ 0 ∀ z ∈ R^M
If these conditions are satisfied, the kernel corresponds to an inner product in some feature space, and that inner product respects the usual properties of inner products.
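A numerical sketch of the two conditions, assuming a small random point set and the plain linear kernel k(x, z) = xᵀz:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))     # M = 5 points in R^3 (not a training set)
K = X @ X.T                     # Gram matrix: K[i, j] = k(x_i, x_j) = x_i . x_j

# Symmetry: k(x_i, x_j) == k(x_j, x_i)
print(np.allclose(K, K.T))      # True

# Positive semidefiniteness: z^T K z >= 0 for all z, which is equivalent
# to all eigenvalues being >= 0 (up to floating-point tolerance).
eigvals = np.linalg.eigvalsh(K)
print(eigvals.min() >= -1e-10)  # True for a valid kernel
```

An indefinite "similarity" (one with a negative eigenvalue) would fail the second check and so cannot be the inner product of any embedding.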
What are the kernel composition rules?
- k(x, z) = c·k1(x, z), where c ≥ 0 is a constant
- k(x, z) = f(x)·k1(x, z)·f(z), where f(⋅) is any function
- k(x, z) = q(k1(x, z)), where q(⋅) is a polynomial with non-negative coefficients
- k(x, z) = e^(k1(x, z))
- k(x, z) = k1(x, z) + k2(x, z)
- k(x, z) = k1(x, z) + c, where c ≥ 0 is a constant
- k(x, z) = k1(x, z)·k2(x, z)
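A quick sanity check of several rules, assuming a linear kernel k1 and a Gaussian kernel k2 on a few random points: each composed Gram matrix should stay positive semidefinite.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(6, 2))

K1 = X @ X.T                                          # linear kernel
sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, -1)
K2 = np.exp(-sq / 2.0)                                # Gaussian kernel, sigma = 1

def is_psd(K, tol=1e-9):
    """PSD check via the smallest eigenvalue, with a float tolerance."""
    return np.linalg.eigvalsh(K).min() >= -tol

for K in (3.0 * K1,      # rule: c * k1 with c >= 0
          K1 + K2,       # rule: k1 + k2
          K1 * K2,       # rule: k1 * k2 (elementwise / Schur product)
          np.exp(K1)):   # rule: e^(k1)
    print(is_psd(K))     # True each time
```

This is evidence, not a proof: the rules guarantee PSD-ness for every point set, while the check only verifies one.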
What is the Gaussian kernel?
k(x, x(n)) = e^(−∥x − x(n)∥² / (2σ²))
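A direct transcription of the formula, with σ as the bandwidth:

```python
import math

def gaussian_kernel(x, z, sigma=1.0):
    # k(x, z) = exp(-||x - z||^2 / (2 * sigma^2))
    sq_dist = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-sq_dist / (2 * sigma ** 2))

print(gaussian_kernel((0, 0), (0, 0)))  # 1.0: a point is maximally similar to itself
print(gaussian_kernel((0, 0), (3, 4)))  # distance 5 -> exp(-12.5), near 0
```

Larger σ makes the similarity decay more slowly with distance; smaller σ makes it more local.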
How can we prove the Gaussian is a valid kernel?
The Gaussian can be written using the Taylor series of the exponential, which has infinitely many terms; each step of this construction is covered by the kernel composition rules, so the Gaussian is a valid kernel.
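A sketch of that argument: expand the squared norm so the cross term separates out, then Taylor-expand the cross term.

```latex
k(x, z) = e^{-\|x - z\|^2 / 2\sigma^2}
        = \underbrace{e^{-\|x\|^2 / 2\sigma^2}}_{f(x)}
          \, e^{x^\top z / \sigma^2}
          \underbrace{e^{-\|z\|^2 / 2\sigma^2}}_{f(z)},
\qquad
e^{x^\top z / \sigma^2} = \sum_{n=0}^{\infty} \frac{(x^\top z)^n}{\sigma^{2n}\, n!}
```

The middle factor is a limit of polynomials with non-negative coefficients applied to the linear kernel xᵀz (the q(k1) rule), and multiplying by f(x) and f(z) is the f(x)·k1(x, z)·f(z) rule.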
What dimensional embedding does the Gaussian kernel give?
Infinite
Why use the Gaussian kernel?
The Gaussian kernel is a natural similarity measure, and because its implicit embedding is infinite-dimensional it can represent a very large class of non-linear functions.