Supervised Learning Flashcards
(37 cards)
What is supervised learning?
A subcategory of M.L. defined by the use of labeled input/output pairs.
What is the difference between regression and classification?
Regression predicts continuous values such as price or income; the goal is to find a best-fit line (or curve). Classification predicts a discrete class label; the goal is to find a decision boundary.
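A minimal sketch of the difference, assuming scikit-learn; the data, model choices, and numbers below are made up for illustration:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

X = np.array([[1.0], [2.0], [3.0], [4.0]])

# Regression: predict a continuous value -> best-fit line.
y_continuous = np.array([10.0, 20.0, 31.0, 39.0])
reg = LinearRegression().fit(X, y_continuous)
print(reg.predict([[5.0]]))     # a real number (here roughly 50)

# Classification: predict a discrete class label -> decision boundary.
y_class = np.array([0, 0, 1, 1])
clf = LogisticRegression().fit(X, y_class)
print(clf.predict([[3.5]]))     # a class label (here 1)
```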
What kind of problems can you solve with classification and regression?
Regression: weather prediction, housing price prediction
Classification: spam detection, speech recognition, cancer cell identification.
Why is training set error performance unreliable?
Training-set error says nothing about generalization to unseen data; perfect training-set performance is often just overfitting (memorization).
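A minimal sketch of why held-out data is needed, assuming scikit-learn (the dataset and model are illustrative choices, not prescribed by the card):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# An unconstrained tree can fit the training set perfectly...
tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
print("train accuracy:", tree.score(X_train, y_train))  # typically 1.0
print("test accuracy: ", tree.score(X_test, y_test))    # usually lower
```

The gap between the two numbers, not the training score itself, indicates how well the model generalizes.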
What is machine learning?
A field of artificial intelligence concerned with algorithms that can learn from data.
Two main branches of Machine Learning?
Supervised learning
Unsupervised learning
3 requirements for machine learning?
1) A pattern exists
2) that cannot be pinned down mathematically
3) We have data on it
Define data (for M.L)
Input - correct output pairs (feature, label)
input - real-valued or categorical
output - real-valued (regression) or categorical (classification)
Goal of supervised learning?
To model dependency between features and labels.
Goal of a supervised learning model?
To predict labels for new instances.
What is a training set?
A set of input - output pairs.
Classification output value types?
Categorical, e.g. binary labels (-1, +1).
Regression output value type?
Real numbers.
Examples of supervised learning problems?
Junk mail:
features - word frequencies
class - junk/not junk
Access Control System:
features - images
class - ID of the person
Medical diagnosis:
features: BMI, age, symptoms, test results
class: diagnostic code
Formal components of learning.
Input (x) - e.g. customer application
Output (y) - e.g. approval/denial of the application
Target function: x -> y (ideal credit approval formula)
Data {(x1, y1), … (xn, yn)} (historical records)
Hypothesis: g: X -> Y
Hypothesis set (H): group of functions where we look for our solution
Supervised learning uses the training data to select a function g from H that can then be applied to new data.
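A minimal, self-contained sketch mapping these components onto code; the credit-approval numbers and the threshold-rule hypothesis set are invented purely for illustration:

```python
# Data {(x1, y1), ..., (xn, yn)}: one feature x, label y = +1 (approve) / -1 (deny).
data = [(2.0, -1), (3.5, -1), (5.0, +1), (6.5, +1)]

# Hypothesis set H: threshold rules "approve if x >= t" for a few candidate t.
def make_hypothesis(t):
    return lambda x: +1 if x >= t else -1

H = [make_hypothesis(t) for t in (1.0, 3.0, 4.0, 6.0)]

# Error measure: fraction of training points a hypothesis gets wrong.
def training_error(h):
    return sum(h(x) != y for x, y in data) / len(data)

# Learning algorithm: pick the hypothesis g in H with the smallest training error.
g = min(H, key=training_error)
print(g(4.5))   # prediction for a new instance (here +1)
```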
Building blocks for an M.L. algorithm?
Model class (hypothesis set) e.g.
-linear or quadratic function
-decision tree
-neural network
-clustering
Error measure (Score function)
Algorithm - searches the model class for a good model, as defined by the score function
Validation
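A minimal sketch of the four building blocks, assuming scikit-learn (dataset, model, and scoring choices are illustrative):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)

# Model class (hypothesis set): linear classifiers via logistic regression.
model = make_pipeline(StandardScaler(), LogisticRegression())

# Error measure (score function): classification accuracy.
# Algorithm: the optimizer inside .fit() searches the model class for a good model
# under that score (here by minimizing logistic loss on the training data).
# Validation: cross-validation estimates performance on data not used for fitting.
scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
print(scores.mean())
```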
Dangers of overfitting
The model memorizes training data and does not generalize beyond it.
E.g. 100% accuracy on the training data, yet the model may do no better than random guessing on new instances.
Dangers of underfitting
The model is not expressive enough, e.g. linear functions applied to a non-linear problem.
Approximation-Generalization tradeoff
Goal: to approximate target function as closely as possible.
More complex hypothesis set: better chance of approximating target function f
Less complex hypothesis set: better chance of generalizing outside the training set
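A minimal sketch of the tradeoff, assuming scikit-learn/NumPy; the synthetic data and the particular degrees are illustrative only:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, (40, 1))
y = np.sin(2 * np.pi * X).ravel() + rng.normal(0, 0.2, 40)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0)

for degree in (1, 4, 15):   # too simple, moderate, very complex
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    print(degree,
          "train R^2:", round(model.score(X_train, y_train), 2),
          "test R^2:", round(model.score(X_test, y_test), 2))
```

Typically the degree-15 model approximates the training data best but generalizes worst, while degree 1 underfits; a moderate degree balances the two.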
Ideal hypothesis set H
H = {f}; but then we would already know the target function, so there would be no need for M.L.
Occam’s Razor
The principle that favors the simplest hypothesis (set) that adequately explains a given set of observations.
Criteria for a good model
Interpretability
Computational complexity
How to control Hypothesis set complexity?
With hyperparameters.
-max degree of polynomials
-number of nearest neighbors (k)
-regularization parameter
-depth of decision tree
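A minimal sketch of how such hyperparameters appear in practice, assuming scikit-learn (the particular values are illustrative and would normally be chosen by validation, e.g. a grid search):

```python
from sklearn.preprocessing import PolynomialFeatures
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import Ridge
from sklearn.tree import DecisionTreeClassifier

poly = PolynomialFeatures(degree=3)             # max degree of polynomials
knn = KNeighborsClassifier(n_neighbors=5)       # number of nearest neighbors (k)
ridge = Ridge(alpha=1.0)                        # regularization parameter
tree = DecisionTreeClassifier(max_depth=4)      # depth of the decision tree
```

Smaller degree, larger k, larger alpha, and smaller max_depth all shrink the effective hypothesis set; the opposite settings enlarge it.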