VC Dimension Flashcards
(9 cards)
What are dichotomies?
Dichotomies are hypothesis limited to the eyes of inputs. We look at the maximum kind of lines w.r.t N inputs, which is <= 2^N. When the change in Eout and Ein is small, the two hypothesis are similar.
What is the growth function to calculate the maximum number of dichotomies.
mH(n).
The larger mH(n), the more expensive or more complicated. Dependent on H and N
When does H shatter x1,..,xn?
When the hypothesis set is able to generate all 2^N dichotomies.
What is the break point for dichotomies?
If no k inputs can be shattered by H, k is a break point for H.
e.g. for 2D perceptron’s, the break point is N = 4
Why is mH(n) not a good replacement for M?
Because it is exponential. However, when N is large we can change the upper bound such that its polynomial
What is the Ein Eout inequality using the growth function?
P[|Ein(g) - Eout(g) > ε|] <= 4 mH(2N) exp(-1/8 ε^2 N)
If k exists, we can replace the bound with 4(2N)^k-1 exp(-1/8 ε^2 N)
What is the VC dimension of a hypothesis?
dvc(H) is the largest value for N for which H can shatter all N training examples. It is the maximum non-breaking point
What does a high dvc(H) mean?
A high dvc means a higher model complexity but smaller in-sample error