19. Learning by Example (ML) Flashcards
(22 cards)
What is learning by example?
Learning by example involves inferring a function or decision boundary from input-output pairs (training data).
What is the goal of supervised learning?
To learn a function that maps inputs to outputs, minimizing error on unseen examples.
What is a hypothesis in machine learning?
A candidate function from a hypothesis space that approximates the target function.
What is the hypothesis space?
The set of all functions a learning algorithm can choose from to approximate the target function.
What does it mean to generalize?
To perform well on unseen data, not just the training set.
What is overfitting?
When a model fits training data too closely, capturing noise and failing to generalize.
What is underfitting?
When a model is too simple to capture the underlying pattern of the data.
What is the difference between training and test error?
Training error measures fit to seen data; test error measures performance on unseen data.
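The overfitting and train-vs-test-error cards above can be illustrated with a toy "memorizer" learner that stores the training pairs verbatim: it achieves zero training error but cannot generalize. The parity dataset and the constant fallback guess are assumptions for illustration only.

```python
# A "memorizer" fits training data perfectly (zero training error) but
# fails on unseen inputs: a minimal sketch of overfitting.
# The parity dataset and default guess below are illustrative assumptions.

def train_memorizer(examples):
    """Store each input -> label pair verbatim."""
    return dict(examples)

def predict(model, x, default=0):
    # Unseen inputs fall back to a constant guess.
    return model.get(x, default)

train = [(1, 1), (2, 0), (3, 1)]   # assumed target: parity of x
test = [(4, 0), (5, 1)]            # unseen inputs

model = train_memorizer(train)
train_err = sum(predict(model, x) != y for x, y in train) / len(train)
test_err = sum(predict(model, x) != y for x, y in test) / len(test)
print(train_err, test_err)
```

Training error is 0.0 because every training input is looked up exactly, while test error is 0.5: the model captured the training set, not the underlying pattern.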
What is the version space?
The set of hypotheses consistent with all training examples.
What is inductive bias?
Assumptions a learner uses to generalize beyond the training data.
What is the inductive learning hypothesis?
Any hypothesis that approximates the target function well over a sufficiently large set of training examples will also approximate it well over unobserved examples.
What is a consistent learner?
A learner that only outputs hypotheses consistent with all training examples.
What is the Find-S algorithm?
It finds the most specific hypothesis consistent with the training data.
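The Find-S card above can be sketched in a few lines for conjunctive hypotheses over attribute vectors, where '0' means "no value allowed" and '?' means "any value". The EnjoySport-style dataset is an assumed illustration, not part of the original cards.

```python
# A minimal Find-S sketch for conjunctive hypotheses.
# '0' = no value allowed (most specific), '?' = any value (most general).
# The EnjoySport-style dataset is an assumed illustrative example.

def find_s(examples, n_attrs):
    h = ['0'] * n_attrs                 # start with the most specific hypothesis
    for x, label in examples:
        if not label:                   # Find-S ignores negative examples
            continue
        for i, value in enumerate(x):
            if h[i] == '0':
                h[i] = value            # first positive: copy its values
            elif h[i] != value:
                h[i] = '?'              # conflict: minimally generalize
    return tuple(h)

data = [
    (('Sunny', 'Warm', 'Normal', 'Strong', 'Warm', 'Same'), True),
    (('Sunny', 'Warm', 'High',   'Strong', 'Warm', 'Same'), True),
    (('Rainy', 'Cold', 'High',   'Strong', 'Warm', 'Change'), False),
    (('Sunny', 'Warm', 'High',   'Strong', 'Cool', 'Change'), True),
]
print(find_s(data, 6))  # ('Sunny', 'Warm', '?', 'Strong', '?', '?')
```

Each positive example can only generalize the hypothesis, which is why Find-S converges to the most specific consistent hypothesis and why, as the next card notes, it cannot exploit negative examples or detect noise.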
What are limitations of Find-S?
It handles only conjunctive hypotheses, ignores negative examples, cannot detect noisy or inconsistent data, and gives no way to tell whether it has converged to the target concept.
What is the Candidate Elimination algorithm?
It maintains the version space by updating specific (S) and general (G) boundaries.
What happens to the version space with more data?
It shrinks, ideally converging toward the target concept.
What are the S and G sets in Candidate Elimination?
S contains the most specific hypotheses; G contains the most general ones consistent with data.
How are S and G updated in Candidate Elimination?
S is generalized on positive examples; G is specialized on negative examples.
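The S/G updates described above can be sketched for conjunctive hypotheses, with S kept as a single maximally specific hypothesis. This is a simplified sketch: a full version would also discard S when it matches a negative example and prune redundant members of G. The EnjoySport-style dataset is an assumed illustration.

```python
# A compact Candidate Elimination sketch for conjunctive hypotheses
# ('?' = any value). S is a single most-specific hypothesis; G is a set
# of most-general ones. Simplified: S-vs-negative checks and G pruning
# are omitted. The dataset is an assumed illustrative example.

def matches(h, x):
    return all(hv == '?' or hv == xv for hv, xv in zip(h, x))

def candidate_elimination(examples, n_attrs):
    S = ['0'] * n_attrs                          # most specific boundary
    G = [('?',) * n_attrs]                       # most general boundary
    for x, label in examples:
        if label:                                # positive example
            G = [g for g in G if matches(g, x)]  # drop inconsistent g's
            for i, v in enumerate(x):            # minimally generalize S
                if S[i] == '0':
                    S[i] = v
                elif S[i] != v:
                    S[i] = '?'
        else:                                    # negative example
            new_G = []
            for g in G:
                if not matches(g, x):            # already excludes x
                    new_G.append(g)
                    continue
                # Minimally specialize g so it excludes x, using the
                # attribute values S has already committed to.
                for i in range(n_attrs):
                    if g[i] == '?' and S[i] not in ('0', '?') and S[i] != x[i]:
                        new_G.append(g[:i] + (S[i],) + g[i + 1:])
            G = new_G
    return tuple(S), G

data = [
    (('Sunny', 'Warm', 'Normal', 'Strong', 'Warm', 'Same'), True),
    (('Sunny', 'Warm', 'High',   'Strong', 'Warm', 'Same'), True),
    (('Rainy', 'Cold', 'High',   'Strong', 'Warm', 'Change'), False),
    (('Sunny', 'Warm', 'High',   'Strong', 'Cool', 'Change'), True),
]
S, G = candidate_elimination(data, 6)
print(S)  # ('Sunny', 'Warm', '?', 'Strong', '?', '?')
print(G)
```

On this dataset the boundaries converge to S = ('Sunny', 'Warm', '?', 'Strong', '?', '?') and G = {('Sunny', '?', '?', '?', '?', '?'), ('?', 'Warm', '?', '?', '?', '?')}, showing the version space shrinking from both ends as the next cards describe.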
What causes noise to be problematic in version space learning?
A single mislabeled example can eliminate the correct hypothesis, and with enough noise the version space can become empty because no hypothesis is consistent with all examples.
What is inductive learning vulnerable to?
Noise, limited data, and incorrect inductive bias.
How can hypothesis space design affect learning?
A space that is too large admits hypotheses that fit noise (overfitting); one that is too small may not contain the target function (underfitting).
Why is inductive learning considered impossible without bias?
Because many hypotheses fit any finite training set equally well; a bias is needed to prefer one and thereby make predictions on unseen inputs.