Learning Theory Flashcards

(8 cards)

1
Q

What does PAC stand for? What is the general idea behind it?

A

PAC stands for probably approximately correct.

It is a framework for quantifying how many examples are enough to guarantee that a learner will generalize well.

2
Q

What does the framework consider?

A

A set of instances X
A hypothesis space H (a set of functions)
A target concept c chosen from a concept class C
Training data, usually pairs of the form (c(x), x), with x drawn from an unknown distribution P over X

3
Q

When is a model said to be PAC learnable?

A

A model is said to be PAC learnable if, for any target concept, any distribution, and for any epsilon and gamma greater than zero and lower than 0.5, the learner can, with probability at least 1 - gamma, output a hypothesis with true error at most epsilon, using a number of examples that is polynomial in 1/epsilon and 1/gamma.
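
In symbols, a minimal sketch of the requirement (here gamma plays the role often written as delta, and error_P denotes the true error under the distribution P):

\Pr\big[\operatorname{error}_P(h) \le \epsilon\big] \ge 1 - \gamma, \qquad N = \operatorname{poly}\!\left(\tfrac{1}{\epsilon}, \tfrac{1}{\gamma}\right)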

4
Q

When is a learner said to form a version space? Why is it important?

A

A learner is said to form a version space when it can achieve zero training error. When this happens, all hypotheses that are consistent with the training data form the version space.

The version space is important because the learner outputs a hypothesis with the smallest training error, i.e. one of the consistent hypotheses; bounding the true error of every hypothesis in the version space therefore bounds the error of whatever hypothesis the learner outputs.
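
In symbols, a sketch of the definition (D denotes the training set; other notation is from the earlier cards):

VS_{H,D} = \{\, h \in H : h(x) = c(x) \ \text{for all } x \text{ in } D \,\}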

5
Q

What is the generalization bound concept? How can we derive the number of examples required for a given learner from it?

A

The generalization bound concept states that we can bound the probability that there exists a hypothesis in the version space with true error exceeding a threshold epsilon. The bound is expressed in terms of the size of the hypothesis space H, the threshold epsilon, and the number of training examples N.

Setting this probability to be at most gamma leads to a bound on the number of examples N required.
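
A sketch of the standard derivation for a finite hypothesis space and a consistent learner (the usual Haussler-style bound; the constants are an assumption here, not taken from the card):

\Pr\big[\exists\, h \in VS : \operatorname{error}_P(h) > \epsilon\big] \le |H|\, e^{-\epsilon N} \le \gamma \;\Longrightarrow\; N \ge \frac{1}{\epsilon}\left(\ln|H| + \ln\frac{1}{\gamma}\right)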

6
Q

What is the agnostic learning concept? How does the generalization bound change in this context?

A

The agnostic learning concept covers learners that do not form a version space (no hypothesis in H perfectly classifies all training examples).

Now the bound is built over the difference between the true error and the training error. The number of examples needed to ensure that, with high probability, the hypothesis selected by the learner has low generalization error is derived using the same process.
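
A sketch of the agnostic counterpart, obtained from Hoeffding's inequality plus a union bound over a finite H (same caveat on the constants):

\Pr\big[\exists\, h \in H : \operatorname{error}_P(h) - \operatorname{error}_{\text{train}}(h) > \epsilon\big] \le |H|\, e^{-2\epsilon^2 N} \le \gamma \;\Longrightarrow\; N \ge \frac{1}{2\epsilon^2}\left(\ln|H| + \ln\frac{1}{\gamma}\right)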

7
Q

Define the VC-dimension concept and explain why it is important to learning theory.

A

The VC-dimension of H is the size of the largest set of instances that can be shattered by H, i.e. labeled in every possible way by some hypothesis in H.

Its importance comes from the fact that the hypothesis space may be infinite, which would make the |H|-based estimate of the number of examples impossible; the VC-dimension replaces ln |H| as the measure of capacity in the sample-complexity bounds.
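
Two standard examples of shattering, for intuition (not stated on the card): intervals on the real line can shatter any set of 2 points but no set of 3, so their VC-dimension is 2; linear separators in \mathbb{R}^d have VC-dimension d + 1.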

8
Q

How does the risk bound encapsulate the bias-variance trade-off?

A

The risk bound is the sum of the training error and a confidence term that grows with the VC dimension. With a high-capacity hypothesis space (large VC dimension, which increases with the number of features), the training error can be driven down (lower bias) but the confidence term makes the bound looser (higher variance); with a low-capacity space the confidence term is tighter but the achievable training error may be larger (higher bias).
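
One common form of the bound, Vapnik-style and up to constants (VC(H) is the VC-dimension, gamma the confidence parameter; the exact constants are an assumption, not from the card):

R(h) \le R_{\text{train}}(h) + \sqrt{\frac{VC(H)\left(\ln\frac{2N}{VC(H)} + 1\right) + \ln\frac{4}{\gamma}}{N}}

The first term falls as capacity grows (bias), while the second term grows with VC(H) (variance).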
