ML-01 - Introduction and linear regression Flashcards
ML-01 - Introduction and linear regression
When did Arthur Samuel come up with his definition of machine learning?
The 1950s.
ML-01 - Introduction and linear regression
What was Arthur Samuel’s definition of machine learning?
“[…] the field of study that gives computers the ability to learn without being explicitly programmed.”
ML-01 - Introduction and linear regression
How did Tom Mitchell define machine learning?
Machine learning is a field of study that enables a computer program
to learn from experience 𝑬 with respect to some task 𝑻 and some performance measure 𝑷
in a well-posed problem: its performance on 𝑻, as measured by 𝑷, improves with experience 𝑬.
ML-01 - Introduction and linear regression
What are the 3 broad types of machine learning?
- Supervised learning
- Unsupervised learning
- Reinforcement learning
ML-01 - Introduction and linear regression
What are the two big types of supervised learning?
- Classification
- Regression
ML-01 - Introduction and linear regression
What are the two big types of unsupervised learning?
- Clustering
- Dimensionality reduction
ML-01 - Introduction and linear regression
Describe the difference between regression and classification.
Regression predicts continuous values, while classification predicts discrete categories.
ML-01 - Introduction and linear regression
What is Semi-supervised learning?
An ML approach that trains on a small amount of labeled data together with a large amount of unlabeled data.
ML-01 - Introduction and linear regression
What is reinforcement learning?
Learning by interacting with the environment.
ML-01 - Introduction and linear regression
What are the 5 steps for a supervised learning workflow?
1) Get data
2) Clean, prepare, manipulate
3) Train the model
4) Test the model
5) Improve
ML-01 - Introduction and linear regression
What are the two most common optimization methods?
- Iterative methods, like gradient descent.
- Non-iterative methods, like the least squares method.
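As an illustration of the non-iterative case, here is a minimal sketch of the least squares method for a single-feature linear model y = w * x + b. The function name `fit_least_squares` and the toy data are assumptions made for this example, not part of the original cards.

```python
def fit_least_squares(xs, ys):
    """Closed-form least squares fit of y = w * x + b for one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    w = cov_xy / var_x
    # Intercept: the fitted line passes through the point of means.
    b = mean_y - w * mean_x
    return w, b


# Hypothetical toy data generated from y = 2x + 1.
w, b = fit_least_squares([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
```

No learning rate or number of iterations is needed here: the solution is computed in one pass. The sketch assumes the x values are not all identical, otherwise the variance is zero.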
ML-01 - Introduction and linear regression
Describe gradient descent.
Gradient descent works by repeatedly stepping in the direction of the negative gradient of a function to reach a (local) minimum.
ML-01 - Introduction and linear regression
What is the formula for gradient descent?
(See image)
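The image is not reproduced in this text. As a stand-in, the standard gradient descent update rule can be written as follows. The notation is an assumption and may differ from the original image: θ_j are the model parameters, α is the learning rate, J is the loss, m is the number of training samples, and h_θ is the model's prediction.

```latex
\theta_j := \theta_j - \alpha \, \frac{\partial J(\theta)}{\partial \theta_j}
\qquad \text{(updated simultaneously for all } j\text{)}

% For linear regression with a mean squared error loss:
J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2
\quad\Longrightarrow\quad
\frac{\partial J(\theta)}{\partial \theta_j}
  = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}
```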
ML-01 - Introduction and linear regression
What are the 3 typical variants of gradient descent?
- (Batch) gradient descent
- Mini-batch gradient descent
- Stochastic gradient descent
ML-01 - Introduction and linear regression
Describe (batch) gradient descent.
Uses the entire training set in each iteration (called an epoch) of gradient descent.
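A minimal sketch of batch gradient descent for a single-feature linear model, assuming a mean squared error loss. The function name, learning rate, epoch count, and toy data are illustrative choices, not taken from the original cards.

```python
def batch_gradient_descent(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y = w * x + b, updating once per epoch on the full training set."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction errors for ALL training samples.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradient of the mean squared error with respect to w and b.
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2.0 / n) * sum(errors)
        # One parameter update per epoch.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Hypothetical toy data generated from y = 2x + 1.
w, b = batch_gradient_descent([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
```

Because every update looks at all samples, the gradient is exact for the training set, but each epoch costs a full pass over the data.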
ML-01 - Introduction and linear regression
Describe mini-batch gradient descent.
Instead of learning on all data at once, learning happens on subsets (batches) of the data. During each epoch, the N training samples are split into N / batch_size batches, drawn without replacement. The model is updated once per batch.
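The batching scheme can be sketched as below, assuming a single-feature linear model with a mean squared error loss. The names `iterate_minibatches` and `minibatch_gradient_descent`, the hyperparameters, and the toy data are assumptions for illustration only.

```python
import random


def iterate_minibatches(samples, batch_size, rng):
    """Yield the batches for one epoch, drawn without replacement."""
    indices = list(range(len(samples)))
    rng.shuffle(indices)  # new random order for this epoch
    for start in range(0, len(indices), batch_size):
        yield [samples[i] for i in indices[start:start + batch_size]]


def minibatch_gradient_descent(samples, batch_size=2, learning_rate=0.05,
                               epochs=2000, seed=0):
    """Fit y = w * x + b, updating once per mini-batch."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for batch in iterate_minibatches(samples, batch_size, rng):
            # Gradient of the mean squared error over this batch only.
            errors = [((w * x + b) - y, x) for x, y in batch]
            grad_w = (2.0 / len(batch)) * sum(e * x for e, x in errors)
            grad_b = (2.0 / len(batch)) * sum(e for e, _ in errors)
            w -= learning_rate * grad_w
            b -= learning_rate * grad_b
    return w, b


# Hypothetical toy data generated from y = 2x + 1, as (x, y) pairs.
data = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]
w, b = minibatch_gradient_descent(data)
```

With 4 samples and a batch size of 2, each epoch performs 4 / 2 = 2 updates, and every sample is used exactly once per epoch.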
ML-01 - Introduction and linear regression
Describe stochastic gradient descent.
Set the batch size to 1: during each epoch, the model is updated on each training example individually.
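A minimal sketch of stochastic gradient descent for a single-feature linear model with a squared error loss. The function name, learning rate, epoch count, and toy data are illustrative assumptions, not taken from the original cards.

```python
import random


def stochastic_gradient_descent(samples, learning_rate=0.02, epochs=2000, seed=0):
    """Fit y = w * x + b, updating after every single training example."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    order = list(samples)
    for _ in range(epochs):
        rng.shuffle(order)  # visit the examples in a new random order
        for x, y in order:
            # Gradient of the squared error for this one example.
            error = (w * x + b) - y
            w -= learning_rate * 2.0 * error * x
            b -= learning_rate * 2.0 * error
    return w, b


# Hypothetical toy data generated from y = 2x + 1, as (x, y) pairs.
data = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]
w, b = stochastic_gradient_descent(data)
```

Each epoch performs N updates instead of one. On real, noisy data the individual updates are noisier than in batch gradient descent; the toy data here is noise-free, so the fit settles on the generating line.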
ML-01 - Introduction and linear regression
How do you make sure you set the learning rate correctly?
Plot loss vs. the number of epochs and make sure the loss converges after some number of iterations.
ML-01 - Introduction and linear regression
What happens if the learning rate is too high?
The loss might not decrease on every iteration, and training may fail to converge or even diverge.
ML-01 - Introduction and linear regression
What happens if the learning rate is too low?
Training takes a long time to converge.
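The effect of the learning rate can be seen on a deliberately simple problem. The sketch below minimizes f(x) = x^2 and records the loss at every step, which is the data one would plot as loss vs. number of iterations. The function name and the three learning rates are assumptions chosen for illustration.

```python
def loss_history(learning_rate, steps=50, start=1.0):
    """Minimize f(x) = x^2 with gradient descent; return the loss per step."""
    x = start
    history = []
    for _ in range(steps):
        history.append(x * x)
        x -= learning_rate * 2.0 * x  # derivative of x^2 is 2x
    return history


too_high = loss_history(1.1)    # every step overshoots the minimum; loss grows
too_low = loss_history(0.001)   # loss decreases, but only very slowly
reasonable = loss_history(0.1)  # loss shrinks quickly toward 0
```

Comparing the three histories shows the pattern from the cards: a rate that is too high makes the loss grow, a rate that is too low barely moves it within the same number of steps, and a well-chosen rate converges.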
ML-01 - Introduction and linear regression
Describe the training workflow steps.
(See image)
ML-01 - Introduction and linear regression
What is feature scaling?
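A transformation that brings features onto comparable scales, so that differences in scale do not dominate the learning.
Gradient descent is sensitive to unnormalized data: a learning rate that suits one feature may be too large or too small for another.
E.g. house prices are far larger numbers than the corresponding areas in square meters, so the fitted coefficients can end up disproportionate.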
ML-01 - Introduction and linear regression
Describe visually what happens in feature scaling.
(See image)
With unnormalized data, the learning curve jumps around and does not converge smoothly.
ML-01 - Introduction and linear regression
What are the two most commonly used normalization methods?
- Min-max normalization
- Standardization (z-score)
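Both methods can be sketched in a few lines. The function names and the example values below are assumptions for illustration; the standardization uses the population standard deviation, and both functions assume the input values are not all identical.

```python
def min_max_normalize(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    n = len(values)
    mean = sum(values) / n
    std = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    return [(v - mean) / std for v in values]


# Hypothetical feature: apartment areas in square meters.
areas = [50.0, 80.0, 120.0, 200.0]
scaled = min_max_normalize(areas)  # smallest value maps to 0, largest to 1
z_scores = standardize(areas)      # mean 0, standard deviation 1
```

Min-max normalization fixes the output range but is sensitive to outliers, since a single extreme value stretches the range. Standardization does not bound the output, but it centers the data and expresses each value in standard deviations from the mean.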