The Machine Learning Landscape Flashcards by Tessa Keller

What is a definition for “Machine Learning”?

Machine Learning is the field of study that gives computers the ability to learn how to solve a specific problem, without being explicitly programmed.

What is the definition of “Training Set” and “Test Set”?

The Training Set is the set of examples/instances that the system learns from.
The Test Set is the set of instances that you use to test how well the system applies what it learned from the training set.
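A minimal sketch of splitting a dataset into a training set and a test set. The helper name `train_test_split`, the 20% test ratio, and the fixed seed are assumptions chosen for illustration.

```python
import random

def train_test_split(instances, test_ratio=0.2, seed=42):
    """Shuffle a copy of the instances and split it into (train, test)."""
    rng = random.Random(seed)        # fixed seed so the split is reproducible
    shuffled = list(instances)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_ratio)
    # The first n_test shuffled instances are held out for testing.
    return shuffled[n_test:], shuffled[:n_test]

data = list(range(10))               # ten toy instances
train_set, test_set = train_test_split(data)
print(len(train_set), len(test_set))  # 8 2
```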

What is the definition of a “Model”?

The part of the machine learning system that learns and makes predictions.
Examples: Neural Networks, Random Forest.

In which cases is Machine Learning a great solution?

Problems for which existing solutions require a lot of fine-tuning or complex rules;
Complex problems for which using a traditional approach yields no good solution;
Fluctuating environments (an ML system can easily be re-trained on new data);
Getting insights about complex problems and large amounts of data.

What are the main criteria used to classify Machine Learning systems?

Training Supervision (supervised, unsupervised, semi-supervised, self-supervised…);
Incremental/Online or Batch Learning;
Learning Approach (instance-based or model-based).

What is the definition of “Supervised Learning”?

The model is trained using instances with their features (called predictors or attributes) and the desired solutions (called labels).
Examples: classification (predict a class), regression (predict a target numeric value).

What is the definition of “Unsupervised Learning”?

The model is trained on unlabeled data, that is, without the desired solutions.
Examples: clustering, anomaly detection, association rule learning, dimensionality reduction.

What is the definition of “Clustering”?

The task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some way) to each other than to those in other clusters.
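A minimal sketch of clustering, using k-means on one-dimensional points. K-means is only one of many clustering algorithms and is picked here as an assumption; the function name, the toy points, and the initial centroids are illustrative.

```python
def k_means_1d(points, centroids, n_iterations=10):
    """Cluster 1-D points around the given initial centroids."""
    for _ in range(n_iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

points = [1.0, 1.5, 2.0, 9.0, 10.0, 11.0]
centroids, clusters = k_means_1d(points, centroids=[0.0, 5.0])
print(clusters)   # [[1.0, 1.5, 2.0], [9.0, 10.0, 11.0]]
```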

What is the definition of “Dimensionality Reduction”?

The task of reducing the number of features in a dataset while retaining as much information as possible. It is a process of transforming high-dimensional data into a lower-dimensional space that still preserves the essence of the original data.

What is the definition of “Anomaly Detection”?

The task of identifying rare instances or observations which can raise suspicions by being statistically different from the rest of the observations.
Examples: credit card fraud, manufacturing defects.
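A minimal sketch of anomaly detection based on "statistically different" observations. It flags values whose z-score exceeds a threshold; the z-score rule, the threshold of 2.0, and the toy transaction amounts are assumptions for illustration, not the only way to detect anomalies.

```python
from statistics import mean, pstdev

def find_anomalies(values, threshold=2.0):
    """Return the values lying more than `threshold` std devs from the mean."""
    mu = mean(values)
    sigma = pstdev(values)           # population standard deviation
    if sigma == 0:
        return []                    # all values identical: nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]

amounts = [20, 22, 19, 21, 23, 20, 18, 500]   # hypothetical card payments
print(find_anomalies(amounts))                # [500]
```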

What is the definition of “Self-Supervised Learning”?

The model trains itself to learn one part of the input from another part of the input: it generates a fully labeled dataset from a fully unlabeled one.
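A minimal sketch of how a labeled dataset can be generated from unlabeled input. Here each word of a raw sentence becomes the label for the words that precede it; the next-word task, the function name, and the context size are assumptions chosen for illustration.

```python
def make_next_word_pairs(sentence, context_size=2):
    """Turn raw text into (context, target) training pairs."""
    words = sentence.split()
    pairs = []
    for i in range(context_size, len(words)):
        context = tuple(words[i - context_size:i])   # one part of the input
        pairs.append((context, words[i]))            # the part to predict
    return pairs

pairs = make_next_word_pairs("the cat sat on the mat")
for context, target in pairs:
    print(context, "->", target)
```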

What is the definition of “Reinforcement Learning”?

The learning system, called an agent, can observe the environment, select and perform actions, and get rewards in return (or penalties, in the form of negative rewards). It must then learn by itself what the best strategy, called a policy, is to get the most reward over time.
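A minimal sketch of the reward loop, heavily simplified: the environment has a single situation and two actions with fixed rewards, and the agent uses an epsilon-greedy rule. All names, the reward values, and the epsilon-greedy choice are assumptions for illustration.

```python
import random

def train_agent(rewards, n_steps=200, epsilon=0.2, step_size=0.1, seed=0):
    """Learn an estimated reward for each action by trial and error."""
    rng = random.Random(seed)
    actions = list(rewards)
    values = {a: 0.0 for a in actions}      # estimated reward per action
    for _ in range(n_steps):
        if rng.random() < epsilon:
            action = rng.choice(actions)            # explore a random action
        else:
            action = max(actions, key=values.get)   # exploit the best estimate
        reward = rewards[action]                    # feedback from the environment
        values[action] += step_size * (reward - values[action])
    return values

values = train_agent({"left": 0.0, "right": 1.0})
policy = max(values, key=values.get)        # the learned policy: best action
print(policy)
```

With these toy rewards the agent should end up preferring the action that pays more, once exploration has tried it at least once.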

What is the definition of “Batch/Offline Learning”?

The system is incapable of learning incrementally: it must be trained using all the available data. First the system is trained, and then it is launched into production and runs without learning anymore; it just applies what it has learned.

What is the definition of “Association Rule Learning”?

The task of detecting how one data item depends on another and mapping these dependencies so that they can be exploited (for example, to increase profit). It tries to find interesting relations or associations among the variables of a dataset.

What is the definition of “Semi-Supervised Learning”?

The model is trained with partially labeled instances. Most of these models are combinations of unsupervised and supervised algorithms. This type of learning is interesting because labeling is usually time-consuming and costly.

What is the definition of “Model Rot” or “Data Drift”?

The phenomenon by which an offline model’s performance tends to decay slowly over time because the world continues to evolve while the model remains unchanged.

What is the definition of “Incremental/Online Learning”?

The system is trained incrementally by feeding it data instances sequentially, either individually or in small groups called mini-batches.

What is the “Learning Rate”? And what is the difference between a high and low one?

The speed at which an online learning system adapts to new and changing data:
- High: adapts rapidly to new data BUT tends to quickly forget the old data;
- Low: learns more slowly BUT is less sensitive to noise and outliers.
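A minimal sketch of the trade-off above, assuming a very simple online learner that keeps a single running estimate and nudges it toward each new value. The update rule and the toy stream are illustrative assumptions.

```python
def online_estimate(stream, learning_rate):
    """Update one running estimate, one instance at a time."""
    estimate = 0.0
    for x in stream:
        # Move the estimate toward the new value by a fraction of the error.
        estimate += learning_rate * (x - estimate)
    return estimate

stream = [10.0] * 20 + [50.0]     # stable data, then a single outlier
fast = online_estimate(stream, learning_rate=0.9)
slow = online_estimate(stream, learning_rate=0.1)
print(round(fast, 2), round(slow, 2))
```

The high learning rate jumps almost all the way to the outlier and forgets the earlier values, while the low one barely moves.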

What is the definition of “Instance-based Learning”?

One of the 2 main approaches to generalization (how the system handles new instances).

The system learns the examples by heart, then generalizes to new cases by using a similarity measure to compare them to the learned examples (or a subset of them).
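A minimal sketch of instance-based learning as a nearest-neighbor lookup. The stored examples, the labels, and the use of absolute difference as the similarity measure are assumptions for illustration.

```python
def predict_nearest(examples, new_x):
    """Return the label of the stored example most similar to new_x."""
    # Similarity measure: the smaller the absolute difference, the more similar.
    _, nearest_label = min(examples, key=lambda ex: abs(ex[0] - new_x))
    return nearest_label

# The "learned by heart" examples: (feature value, label).
examples = [(1.0, "small"), (2.0, "small"), (8.0, "large"), (9.0, "large")]
print(predict_nearest(examples, 7.2))   # large
```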

What is the definition of “Model-based Learning”?

One of the 2 main approaches to generalization (how the system handles new instances).

The system builds a model (like a linear regression) from the examples and then uses that model to make predictions.
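A minimal sketch of model-based learning with a one-feature linear regression fitted by ordinary least squares. The function names and the toy data (which follow y = 2x + 1 exactly) are assumptions for illustration.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(x, slope, intercept):
    """Use the fitted parameters to predict a new instance."""
    return slope * x + intercept

xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
print(predict(5.0, slope, intercept))   # 11.0
```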

What is the “Performance Measure”?

The metric used to determine how well or how badly the model is performing.

What is the difference between a “Utility Function” and a “Cost Function”?

A “Utility Function” determines how GOOD the model is while a “Cost Function” determines how BAD it is.
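A minimal sketch of the two views of the same measurement. Mean squared error is used as the cost function and its negation as the utility function; both choices, and the toy predictions, are assumptions for illustration.

```python
def mean_squared_error(y_true, y_pred):
    """Cost function: average squared error (lower is better)."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def utility(y_true, y_pred):
    """Utility function: here simply the negated cost (higher is better)."""
    return -mean_squared_error(y_true, y_pred)

y_true = [3.0, 5.0, 7.0]
good_predictions = [3.0, 5.0, 8.0]
bad_predictions = [0.0, 0.0, 0.0]

print(mean_squared_error(y_true, good_predictions) <
      mean_squared_error(y_true, bad_predictions))    # True
print(utility(y_true, good_predictions) >
      utility(y_true, bad_predictions))               # True
```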

What type of algorithm would you use to allow a robot to walk in various
unknown terrains?

The best type of Machine Learning algorithm to allow a robot to walk in unknown terrain is Reinforcement Learning, where the robot learns from the terrain's responses to its actions and gradually improves how it walks.

What type of algorithm would you use to segment your customers into multiple
groups?

To segment customers into multiple groups, use either a supervised learning algorithm (classification, if the groups are known and labeled) or an unsupervised learning algorithm (clustering, if there are no group labels).

Would you frame the problem of spam detection as a supervised learning problem or an unsupervised learning problem?

Spam detection is typically a supervised learning problem because the labels are known (spam or not spam).

What is "Transfer Learning"?

The process of reusing knowledge learned on one task to boost performance on another, related task. Example: a model that already knows how to differentiate animal species in pictures can be reused as the starting point for a related image classification task.

What is a "Policy" in reinforcement learning?

The strategy that determines which action the agent should choose when it is in a given situation in order to get the most reward.

What is the main risk with "Batch Learning"?

The main risk is "Model Rot" or "Data Drift".

When is it difficult or impossible to use a "Batch Learning" model?

1. The amount of data is huge; 2. The data is fast-evolving; 3. The computing resources are limited.

What is the definition of "Out-of-Core Learning"?

A way to train a system on data that does not fit in the computer's main memory: an online learning algorithm is fed the data in small chunks.
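A minimal sketch of the chunked processing pattern. In practice the chunks would be read from disk or a network; here an in-memory list stands in for the large dataset, and the "model" is just a running mean. All names are assumptions for illustration.

```python
def read_in_chunks(values, chunk_size):
    """Yield the data one small chunk at a time (stand-in for file reads)."""
    for start in range(0, len(values), chunk_size):
        yield values[start:start + chunk_size]

def running_mean(chunks):
    """Update the mean incrementally, never holding more than one chunk."""
    count, mean = 0, 0.0
    for chunk in chunks:
        for x in chunk:
            count += 1
            mean += (x - mean) / count   # incremental mean update
    return mean

data = list(range(1, 101))               # pretend this is too big for memory
result = running_mean(read_in_chunks(data, chunk_size=10))
print(result)
```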

What type of learning algorithm relies on a similarity measure to make predictions?

An instance-based learning algorithm relies on a similarity measure to make predictions.

What are the main "bad data" issue categories?

1. Insufficient quantity of training data 2. Non-representative training data 3. Poor-quality data (errors, noise, outliers) 4. Irrelevant features/attributes

What are the main "bad model" issue categories?

1. Overfitting the training data 2. Underfitting the training data

What is "Sampling Bias"?

A bias where certain groups of individuals are more likely to be included in a sample than others, leading to an unrepresentative sample.

How to avoid using irrelevant features in a model?

Through using "Feature Engineering": 1. Feature selection (selecting the most useful features among existing ones); 2. Feature extraction (combining existing features to produce a more useful one); 3. Creating new features by gathering new data.

What is "Overfitting" and when does it happen?

It means that the model performs well on the training data, but it does not generalize well. Overfitting happens when the model is too complex relative to the amount and noisiness of the training data.

How to avoid overfitting?

1. Simplify the model by selecting one with fewer parameters, by reducing the number of attributes in the training data, or by constraining the model; 2. Gather more training data; 3. Reduce the noise in the training data (e.g., fix data errors and remove outliers).

What is "Regularization"?

Constraining a model to make it simpler and reduce the risk of overfitting.
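A minimal sketch of regularization on a one-parameter linear model (a line through the origin). An L2 penalty of strength `alpha` is added to the squared error, which has the closed-form solution used below; the penalty type, the names, and the toy data are assumptions for illustration.

```python
def fit_ridge(xs, ys, alpha):
    """Fit y = w * x, minimizing squared error plus alpha * w**2."""
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    # A larger alpha constrains the model more, shrinking w toward zero.
    return sum_xy / (sum_xx + alpha)

xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
print(fit_ridge(xs, ys, alpha=0.0))    # 2.0 (no regularization)
print(fit_ridge(xs, ys, alpha=14.0))   # 1.0 (strongly constrained)
```

Here `alpha` is set before training and stays fixed while `w` is learned.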

What is a "Hyper-Parameter"?

A hyperparameter is a parameter of the learning algorithm (not of the model). As such, it is not affected by the learning algorithm itself; it must be set prior to training and remains constant during training. Example: the amount of regularization to apply during learning is controlled by a hyperparameter.

What is "Underfitting" and when does it happen?

It means that the model is too simple to learn the underlying structure of the data. For example, a linear model of life satisfaction is prone to underfit; reality is just more complex than the model, so its predictions are bound to be inaccurate, even on the training examples.

How to avoid underfitting?

1. Select a more powerful model, with more parameters; 2. Feed better features to the learning algorithm (feature engineering); 3. Reduce the constraints on the model (for example by reducing the regularization hyperparameter).

What do model-based learning algorithms search for? What is the most common strategy they use to succeed? How do they make predictions?

Model-based learning algorithms search for the optimal values of the model parameters, so that the model generalizes well to new instances. The most common strategy is to define a cost function that measures how bad the model is on the training data and to find the parameter values that minimize it. To make a prediction, the model feeds the features of the new instance into its prediction function, using the parameter values found during training.

If your model performs great on the training data but generalizes poorly to new instances, what is happening?

If the model performs great on the training data but poorly on new instances, it has overfit the training data.
