Uniform Convergence Flashcards

1
Q

Definition of ε-representative sample

A

A training set S is ε-representative (with respect to domain Z, hypothesis class H, loss function ℓ, and distribution D) if, for every h ∈ H, |L_S(h) − L_D(h)| ≤ ε.
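As an illustration (not part of the original card), here is a minimal Python sketch of the definition for a toy setup of my own: Z = {0,1}, D a Bernoulli distribution, H two constant predictors, and the 0-1 loss.

```python
# Toy sketch (assumed setup, not from the source): check ε-representativeness
# for Z = {0,1}, D = Bernoulli(p), H = {constant 0, constant 1}, 0-1 loss.
import random

def zero_one_loss(h, z):
    """0-1 loss: 1 if the constant hypothesis h mispredicts z."""
    return 0.0 if h == z else 1.0

def empirical_risk(h, S):
    """L_S(h): average loss of h over the sample S."""
    return sum(zero_one_loss(h, z) for z in S) / len(S)

def true_risk(h, p):
    """L_D(h), computed exactly since D = Bernoulli(p) over Z = {0,1}."""
    return (1 - p) * zero_one_loss(h, 0) + p * zero_one_loss(h, 1)

def is_eps_representative(S, H, p, eps):
    """S is ε-representative iff |L_S(h) - L_D(h)| <= ε for every h in H."""
    return all(abs(empirical_risk(h, S) - true_risk(h, p)) <= eps for h in H)

H = [0, 1]                                    # two constant hypotheses
p = 0.7                                       # D = Bernoulli(0.7)
S = [int(random.random() < p) for _ in range(1000)]
print(is_eps_representative(S, H, p, eps=0.05))   # True with high probability
```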

2
Q

Lemma 4.2 (ε/2-representative)

A

If S is ε/2-representative, then any output of ERM_H(S), namely any h_S ∈ argmin_{h∈H} L_S(h), satisfies:

L_D(h_S) ≤ min_{h∈H} L_D(h) + ε

This means that the ERM rule returns a good hypothesis: its true risk is at most ε above the best achievable in H.

Proof

For every h ∈ H:

  • L_D(h_S) ≤ L_S(h_S) + ε/2 (S is ε/2-representative)
  • ≤ L_S(h) + ε/2 (h_S is picked by ERM, so L_S(h_S) ≤ L_S(h))
  • ≤ L_D(h) + ε/2 + ε/2 (S is ε/2-representative)
  • = L_D(h) + ε

Since this holds for every h ∈ H, taking the minimum over h gives:

→ L_D(h_S) ≤ min_{h∈H} L_D(h) + ε
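To see the lemma in action, here is a self-contained toy check (my own sketch, same assumed Bernoulli setup as the card above): whenever a random sample happens to be ε/2-representative, the ERM output indeed satisfies the bound, so the assert never fires.

```python
# Toy check of Lemma 4.2 (assumed setup, not from the source):
# whenever S is ε/2-representative, L_D(h_S) <= min_h L_D(h) + ε.
import random

H = [0, 1]                                  # constant predictors over Z = {0,1}
p = 0.7                                     # D = Bernoulli(p), 0-1 loss
loss = lambda h, z: float(h != z)
L_D = lambda h: (1 - p) * loss(h, 0) + p * loss(h, 1)
L_S = lambda h, S: sum(loss(h, z) for z in S) / len(S)

eps = 0.1
for _ in range(1000):
    S = [int(random.random() < p) for _ in range(200)]
    if all(abs(L_S(h, S) - L_D(h)) <= eps / 2 for h in H):  # ε/2-representative
        h_S = min(H, key=lambda h: L_S(h, S))               # ERM_H(S)
        assert L_D(h_S) <= min(L_D(h) for h in H) + eps     # Lemma 4.2
print("Lemma 4.2 held on every ε/2-representative sample.")
```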

3
Q

Definition of uniform convergence property

A

H has the uniform convergence property if there exists a function m_H^UC : (0,1)² → ℕ such that, for every ε, δ ∈ (0,1) and for every distribution D over Z, if S is a sample of m ≥ m_H^UC(ε, δ) i.i.d. examples drawn from D, then, with probability ≥ 1 − δ, S is ε-representative.
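A rough Monte Carlo sketch (again a toy Bernoulli setup of my own, not from the source) of what the definition says: the probability that a size-m sample is ε-representative climbs toward 1 as m grows, and uniform convergence guarantees it exceeds 1 − δ once m ≥ m_H^UC(ε, δ).

```python
# Monte Carlo estimate (assumed toy setup) of the probability that an
# i.i.d. sample of size m is ε-representative for a tiny finite class.
import random

H, p, eps = [0, 1], 0.7, 0.1
loss = lambda h, z: float(h != z)
L_D = lambda h: (1 - p) * loss(h, 0) + p * loss(h, 1)

def prob_representative(m, trials=2000):
    hits = 0
    for _ in range(trials):
        S = [int(random.random() < p) for _ in range(m)]
        L_S = lambda h: sum(loss(h, z) for z in S) / m
        hits += all(abs(L_S(h) - L_D(h)) <= eps for h in H)
    return hits / trials

for m in (10, 50, 200, 800):
    print(m, prob_representative(m))   # climbs toward 1 as m increases
```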

4
Q

Corollary 4.4 (uniform convergence property)

A

If H has the uniform convergence property with a function m_H^UC, then the class is agnostically PAC learnable with sample complexity m_H(ε, δ) ≤ m_H^UC(ε/2, δ).
So, ERM_H is a successful agnostic PAC learner for H.

5
Q

Corollary 4.6 (mainly its bounds; the proof is never asked)

A

Every finite hypothesis class is agnostically PAC learnable, provided the loss is bounded.

Let H be a finite hypothesis class, let Z be a domain, and let ℓ : H × Z → [0,1] be a loss function. Then:

  • H enjoys the uniform convergence property with sample complexity

m_H^UC(ε, δ) ≤ ⌈ log(2|H|/δ) / (2ε²) ⌉

  • H is agnostically PAC learnable using the ERM algorithm with sample complexity

m_H(ε, δ) ≤ m_H^UC(ε/2, δ) ≤ ⌈ 2 log(2|H|/δ) / ε² ⌉

Idea of the proof:
- prove that uniform convergence holds for a finite hypothesis class: Hoeffding's inequality bounds |L_S(h) − L_D(h)| for each fixed h, and a union bound over the |H| hypotheses gives the result
- apply the previous result connecting uniform convergence and agnostic PAC learnability (Corollary 4.4)

The two bounds are easy to evaluate numerically; below is a small calculator sketch (the formulas are the corollary's own, the example numbers are mine).
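```python
# Sample-complexity calculator for the Corollary 4.6 bounds on a finite
# class H (the formulas come from the card; the example inputs are made up).
from math import ceil, log

def m_uc(h_size, eps, delta):
    """Uniform convergence bound: ceil( log(2|H|/δ) / (2ε²) )."""
    return ceil(log(2 * h_size / delta) / (2 * eps**2))

def m_pac(h_size, eps, delta):
    """Agnostic PAC bound via m_H(ε, δ) <= m_H^UC(ε/2, δ)."""
    return m_uc(h_size, eps / 2, delta)

# Example: |H| = 1000, ε = 0.1, δ = 0.05
print(m_uc(1000, 0.1, 0.05))    # 530 examples suffice for ε-representativeness
print(m_pac(1000, 0.1, 0.05))   # 2120 examples suffice for agnostic PAC learning
```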
