Basics and Definitions Flashcards
What is Machine Learning? What does learning mean in this context?
An ML system learns from data, where learning means improving at a task according to a performance measure.
When is Machine Learning typically used?
- Replacement for a large number of hard-coded rules.
- Problems with no algorithmic solution
- Rapidly adapting systems
- To analyze data
What are the main challenges faced when applying Machine Learning?
- Features not relevant to the ML task at hand
- Training examples are too similar to each other, so the data is not representative
- Not enough training data
- Training data has too many errors
What does overfitting mean?
The model follows the training data closely, but performs poorly on new, unseen data.
What does underfitting mean?
The model performs poorly on the training data and on every other dataset.
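The two failure modes above can be illustrated with a minimal sketch. The data-generating line y = 2x, the noise level, and the two toy models (`memorizer`, `constant`) are assumptions chosen for illustration only.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

def make_data(offset):
    # Hypothetical ground truth: y = 2x plus Gaussian noise.
    xs = [i * 0.2 + offset for i in range(50)]
    return [(x, 2 * x + rng.gauss(0, 1)) for x in xs]

train = make_data(0.0)
unseen = make_data(0.1)  # same process, different points

def memorizer(x):
    # Overfits: returns the label of the closest training example, noise included.
    return min(train, key=lambda ex: abs(ex[0] - x))[1]

mean_y = sum(y for _, y in train) / len(train)

def constant(x):
    # Underfits: ignores x and always predicts the training mean.
    return mean_y

def mse(model, data):
    # Mean squared error of a model over a list of (x, y) pairs.
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

for name, model in [("memorizer", memorizer), ("constant", constant)]:
    print(name, "train:", round(mse(model, train), 2), "unseen:", round(mse(model, unseen), 2))
```

The memorizer has zero error on the training data but a higher error on unseen points, while the constant model has a large error on both.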
What is a data mismatch?
When a large part of the training data does not match the data the model will encounter once deployed.
What are the main steps in training a model?
- Training
- Validation
- Testing
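Each step typically works on its own portion of the data. A minimal sketch of such a split follows; the function name `split_dataset` and the 20% fractions are assumptions, not fixed conventions.

```python
import random

def split_dataset(examples, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle a copy of the examples and cut it into train/validation/test lists."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    n_val = int(len(shuffled) * val_fraction)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test

train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))
```

Shuffling before cutting keeps any ordering in the original data from leaking into one split.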
What is the difference between supervised and unsupervised learning?
Supervised learning uses a labeled dataset; unsupervised learning does not.
What is semi-supervised learning?
Learning from a dataset in which some examples are labeled and others aren’t.
What is reinforcement learning?
Learning in which an agent takes actions, receives rewards or penalties for them, and so learns to choose the best action for each state it observes.
What is the difference between online and batch learning?
Online learning updates the model incrementally, one example or one minibatch of examples at a time. Batch learning uses all available data in a single training process.
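A minimal sketch of the contrast, assuming a slope-only model y = w * x, noise-free toy data with true w = 3, and a hypothetical learning rate of 0.5:

```python
# Noise-free toy data for the hypothetical model y = w * x, with true w = 3.
xs = [(i % 10 + 1) / 10 for i in range(200)]
data = [(x, 3 * x) for x in xs]

def batch_fit(examples):
    # Batch learning: the whole dataset is used in one computation
    # (closed-form least squares for a slope-only model).
    return sum(x * y for x, y in examples) / sum(x * x for x, _ in examples)

def online_fit(stream, learning_rate=0.5):
    # Online learning: one gradient step per example; the example is not kept.
    w = 0.0
    for x, y in stream:
        w += learning_rate * (y - w * x) * x
    return w

w_batch = batch_fit(data)
w_online = online_fit(iter(data))
print(w_batch, w_online)
```

Both estimates end up close to 3 here, but only `online_fit` could keep learning if new examples arrived later.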
What is the difference between instance-based and model-based learning?
Instance-based learning keeps the training examples in memory and compares new examples to them. Model-based learning instead optimizes a set of model parameters.
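A minimal sketch of the two approaches on one-dimensional toy data; the class names and the choice of a nearest-neighbor lookup versus a least-squares line are assumptions made for illustration.

```python
class NearestNeighbor:
    """Instance-based: keeps every training example and compares at prediction time."""

    def fit(self, examples):
        self.examples = list(examples)
        return self

    def predict(self, x):
        nearest = min(self.examples, key=lambda ex: abs(ex[0] - x))
        return nearest[1]


class LinearModel:
    """Model-based: reduces the training data to two parameters, slope and intercept."""

    def fit(self, examples):
        n = len(examples)
        mean_x = sum(x for x, _ in examples) / n
        mean_y = sum(y for _, y in examples) / n
        cov = sum((x - mean_x) * (y - mean_y) for x, y in examples)
        var = sum((x - mean_x) ** 2 for x, _ in examples)
        self.slope = cov / var
        self.intercept = mean_y - self.slope * mean_x
        return self

    def predict(self, x):
        return self.slope * x + self.intercept


examples = [(x, 2 * x + 1) for x in range(10)]  # toy data on the line y = 2x + 1
knn = NearestNeighbor().fit(examples)
linear = LinearModel().fit(examples)
print(knn.predict(4.4), linear.predict(4.4))
```

After fitting, `knn` still holds all ten examples, while `linear` holds only `slope` and `intercept`.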
What is out-of-core learning?
Out-of-core learning applies online learning techniques to a training set that is too large to fit in memory: the data is split into chunks that are loaded and learned from one at a time.
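A minimal sketch of the idea, assuming a generator (`stream_rows`, hypothetical) that stands in for rows read lazily from disk, a slope-only model y = w * x, and one mini-batch gradient step per chunk:

```python
from itertools import islice

def stream_rows(n_rows):
    # Stands in for reading rows lazily from disk; nothing is stored in a list.
    for i in range(n_rows):
        x = (i % 10 + 1) / 10
        yield x, 3 * x  # hypothetical target: y = 3x

def chunks(rows, size):
    # Yield successive lists of at most `size` rows from an iterable.
    rows = iter(rows)
    while True:
        chunk = list(islice(rows, size))
        if not chunk:
            return
        yield chunk

def fit_out_of_core(rows, chunk_size=10, learning_rate=1.0):
    w = 0.0
    for chunk in chunks(rows, chunk_size):
        # One mini-batch gradient step per chunk; the chunk is dropped afterwards.
        gradient = sum((y - w * x) * x for x, y in chunk) / len(chunk)
        w += learning_rate * gradient
    return w

w = fit_out_of_core(stream_rows(1000))
print(w)
```

Only one chunk of `chunk_size` rows is held in memory at any moment, regardless of how many rows the stream produces.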
What is the training set used for?
Setting model parameters.
What is the validation set used for?
Model and hyper-parameter selection.
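A minimal sketch of how the two sets divide the work, assuming noisy toy data around y = 3x, a slope-only model, and an L2 penalty as the hyperparameter being selected; all names and candidate values are hypothetical.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

def sample(n):
    # Hypothetical noisy data drawn around the line y = 3x.
    points = []
    for _ in range(n):
        x = rng.uniform(-1, 1)
        points.append((x, 3 * x + rng.gauss(0, 1)))
    return points

training_set = sample(20)
validation_set = sample(20)

def fit_slope(data, penalty):
    # Training: sets the model parameter w of y = w * x
    # (least squares with an L2 penalty on w).
    return sum(x * y for x, y in data) / (sum(x * x for x, _ in data) + penalty)

def mse(w, data):
    # Mean squared error of the slope-only model over (x, y) pairs.
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

candidates = [0.0, 0.1, 1.0, 10.0, 100.0]  # hyperparameter values to compare
# Validation: each candidate is trained on the training set only,
# then scored on the validation set.
scores = {p: mse(fit_slope(training_set, p), validation_set) for p in candidates}
best_penalty = min(scores, key=scores.get)
print(scores, best_penalty)
```

The training set never sees the penalty comparison and the validation set never sets `w`, which keeps the two roles separate.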