test set bounds Flashcards
(9 cards)
what does successful learning mean
the resulting classifier has prediction ability better than what we have seen at training
in general, we cannot expect any prediction ability
why study prediction theory
to gain insight into how machine "brains" work (insights into learning)
better methods for learning
better methods for verifying a machine's predictive ability
standard technique to verify that learning succeeded
divide sample into training set and test set
we train on the training set
we test on the test set
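A minimal sketch of the train/test procedure, assuming a toy threshold classifier and made-up data. The helper names `train_test_split` and `error_rate` are hypothetical and defined here, not taken from any library.

```python
import random

def train_test_split(sample, test_fraction=0.25, seed=0):
    """Shuffle a copy of `sample` and split it into (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(sample)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def error_rate(classifier, examples):
    """Fraction of (x, y) pairs that `classifier` labels incorrectly."""
    return sum(classifier(x) != y for x, y in examples) / len(examples)

# Toy data (assumption for illustration): the label is 1 when x >= 5, else 0.
sample = [(x, int(x >= 5)) for x in range(20)]
train, test = train_test_split(sample)

# "Training": pick the threshold with the lowest error on the training set only.
threshold = min(range(21), key=lambda t: error_rate(lambda x: int(x >= t), train))

def classifier(x):
    return int(x >= threshold)

# The test set was never used during training, so this estimates the true error.
test_error = error_rate(classifier, test)
print(f"threshold={threshold}, test error={test_error:.2f}")
```

The test points are held out before training, so the classifier is a fixed function with respect to them.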
what must be avoided by algorithm design
overfitting
why is the true error quantity unknown
the true distribution is unknown
when is cd (true error) a random variable
if c (classifier) is trained on a sample set S, then c is a function of the random sample, so cd is a random variable
when is cd (true error) deterministic
if c (classifier) is a fixed function then cd is deterministic
what does the pivot of the cumulative find
it finds the largest true error such that the probability of observing k or fewer “heads” (errors) in n trials (test points) is at least δ (the allowed failure probability; the confidence level is 1−δ)
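The pivot can be computed numerically. Below is a rough sketch using bisection, assuming the binomial model of test errors described in this card. The function names `binomial_cdf` and `binomial_tail_inversion` are illustrative, and the direct sum is only practical for modest n.

```python
from math import comb

def binomial_cdf(n, k, p):
    """Probability of k or fewer errors in n independent trials with error rate p."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))

def binomial_tail_inversion(n, k, delta, tol=1e-10):
    """Largest p (up to `tol`) with binomial_cdf(n, k, p) >= delta."""
    if k >= n:
        return 1.0  # the cdf equals 1 for every p, so no error rate is ruled out
    lo, hi = 0.0, 1.0  # cdf is 1 at p=0 and 0 at p=1, and decreases in between
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if binomial_cdf(n, k, mid) >= delta:
            lo = mid  # mid is still plausible, so the answer is at least mid
        else:
            hi = mid
    return lo

# Example: 0 errors on 100 test points, delta = 0.05.
bound = binomial_tail_inversion(100, 0, 0.05)
print(f"upper bound on the true error: {bound:.4f}")
```

For k = 0 the answer has the closed form 1 − δ^(1/n), which gives a quick sanity check on the bisection.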
what does the test set bound tell us
- it certifies the extent to which the classifier has learnt correctly
- the guarantee is probabilistic: it only implies that the bound will not be wrong for a 1−δ fraction of its applications (i.e. independent draws of the data)
- the bound tightens with larger sample sizes n
- it relies on i.i.d. samples
- it is used for both theoretical insights and practical applications
- it is simple, and its validity is well understood
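For reference, one common way to write the bound, as a sketch under assumed notation: D is the unknown distribution, S is a test set of n i.i.d. points, k is the number of test errors, c_D is the true error of a fixed classifier c, and \overline{\mathrm{Bin}} is the pivot of the cumulative from the previous card.

```latex
\[
  \Pr_{S \sim D^{n}}\!\left( c_D \le \overline{\mathrm{Bin}}(n, k, \delta) \right) \ge 1 - \delta,
  \qquad
  \overline{\mathrm{Bin}}(n, k, \delta) = \max\left\{ p : \sum_{j=0}^{k} \binom{n}{j} p^{j} (1-p)^{n-j} \ge \delta \right\}
\]
```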