Chapter 1 Flashcards
What is Supervised Learning?
Learning in which the training data fed to the algorithm includes labels, i.e., the desired solutions (the training set contains y).
Name some Supervised Learning Algorithms.
KNN
Linear Regression
Logistic Regression
Support Vector Machines
Decision Trees and Random Forests
Neural Networks (sometimes)
What is Unsupervised Learning?
The training data is unlabeled (no y); the system tries to learn without a teacher.
Name some Unsupervised Learning Algorithms.
Through clustering:
KMeans
DBSCAN
Hierarchical Cluster Analysis
Through Anomaly detection:
One-class SVM
Isolation Forest
Visualization and Dimensionality Reduction:
Principal Component Analysis
Kernel PCA
Locally Linear Embedding
t-Distributed Stochastic Neighbor Embedding (t-SNE)
Association rule learning:
Apriori
Eclat
What is classification?
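A supervised task in which the system is trained on examples along with their class (for instance, emails labeled spam or not spam) so that it can learn to classify new emails.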
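A minimal sketch of classification using k-nearest neighbors (KNN, one of the supervised algorithms listed earlier). The two features and the tiny training set are made up for illustration; a real spam filter would use far richer features.

```python
from collections import Counter
import math

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest labeled examples."""
    # Sort the labeled examples by Euclidean distance to the query point.
    nearest = sorted(train, key=lambda example: math.dist(example[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical features per email: (number of links, number of exclamation marks)
train = [
    ((8, 6), "spam"), ((7, 9), "spam"), ((9, 7), "spam"),
    ((1, 0), "ham"), ((0, 1), "ham"), ((2, 1), "ham"),
]

print(knn_predict(train, (6, 8)))  # a new email with many links
print(knn_predict(train, (1, 1)))  # a new email with few links
```

Each training example carries its class, which is what makes this supervised: the prediction for a new email is borrowed from the labeled examples closest to it.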
What is Regression?
Predicting a target numeric value given a set of features called predictors. Training a model requires both predictors and labels.
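As a rough illustration of regression, here is a one-predictor least-squares line fit in plain Python. The data (car mileage as the predictor, price as the label) is invented so that it lies exactly on a line; `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical training set: predictor = mileage (in 10,000 km), label = price (in $1,000)
mileage = [1, 2, 3, 4, 5]
price = [19, 17, 15, 13, 11]

slope, intercept = fit_line(mileage, price)
predicted = slope * 6 + intercept  # numeric prediction for an unseen car
print(slope, intercept, predicted)
```

Training needs both the predictors (`mileage`) and the labels (`price`); once fitted, the model outputs a numeric value for predictors it has never seen.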
What is a clustering algorithm?
An algorithm that detects groups of similar data points based on combinations of their features, without being told which group any point belongs to.
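A bare-bones sketch of one clustering approach, k-means, written out by hand. The points and the choice of starting centroids are assumptions made for the example; library implementations pick starting centroids more carefully.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and re-centering."""
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            idx = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[idx].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical unlabeled data with two visible groups
points = [(1, 1), (1.5, 2), (1, 1.5), (8, 8), (9, 8.5), (8.5, 9)]
centroids, clusters = kmeans(points, centroids=[points[0], points[1]])
print(centroids)
print(clusters)
```

No labels are involved at any point: the groups emerge only from how close the points are to one another.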
What is hierarchical clustering?
A clustering algorithm that may subdivide each group into smaller groups, producing a hierarchy of clusters.
What is a visualization algorithm?
An algorithm that outputs a 2D or 3D representation of data that can be plotted easily.
What is Dimensionality Reduction?
Simplifying data without losing too much information, for example by merging several correlated features into one.
What is feature extraction?
Merging several correlated features into one; it is a form of dimensionality reduction.
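A small sketch of merging two correlated features into one using Principal Component Analysis, computed here with NumPy's SVD. The car age and mileage values are invented, and `wear` is a hypothetical name for the merged feature.

```python
import numpy as np

# Hypothetical data: column 0 = car age (years), column 1 = mileage (in 10,000 km).
# The two features are strongly correlated.
X = np.array([
    [1, 1.2], [2, 1.9], [3, 3.1],
    [4, 4.0], [5, 5.1], [6, 5.9],
])

# Center the data, then take the first right-singular vector,
# which is the direction of greatest variance (the first principal component).
X_centered = X - X.mean(axis=0)
_, singular_values, Vt = np.linalg.svd(X_centered, full_matrices=False)

# Project both features onto that direction: one merged feature per car.
wear = X_centered @ Vt[0]

# Fraction of the total variance kept by each component.
explained = singular_values**2 / np.sum(singular_values**2)
print(wear)
print(explained)
```

Because the two columns move together, nearly all of the variance lies along one direction, so a single feature keeps almost all of the information.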
Should you reduce the dimensions of data before feeding it into a Supervised ML algorithm?
Often yes. It will run much faster and the data will take up less storage and memory; in some cases it may also perform better.
What is Anomaly detection?
A system that is trained mostly on normal instances and then flags new instances that look very different from them. It is also commonly used to remove outliers from a dataset.
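A deliberately simple sketch of the idea, using a z-score threshold rather than One-class SVM or Isolation Forest. The transaction amounts, the threshold of 3 standard deviations, and the helper name `fit_detector` are all assumptions for illustration.

```python
import statistics

def fit_detector(normal_values, threshold=3.0):
    """Learn what 'normal' looks like, then flag values far from it."""
    mean = statistics.fmean(normal_values)
    stdev = statistics.stdev(normal_values)

    def is_anomaly(value):
        # Distance from the mean, measured in standard deviations.
        return abs(value - mean) / stdev > threshold

    return is_anomaly

# Hypothetical training data: typical transaction amounts
normal = [20.0, 22.5, 19.0, 21.0, 23.0, 20.5, 18.5, 22.0]
is_anomaly = fit_detector(normal)

new_values = [21.5, 250.0, 19.5]
flagged = [v for v in new_values if is_anomaly(v)]
print(flagged)
```

The detector never sees an example labeled "anomaly"; it only learns the spread of the normal data and flags whatever falls far outside it.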
What is novelty detection?
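Similar to anomaly detection, but the algorithm is expected to see only normal data during training, with no outliers; it then flags new instances that look different from everything in the training set.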
What is association rule learning?
Digging into large amounts of data to discover interesting relations between attributes; such relations only become visible with enough data.
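To make "relations between attributes" concrete, the sketch below computes two common measures for a candidate rule, support and confidence, on a handful of invented shopping baskets. It only scores rules it is given; algorithms such as Apriori and Eclat also search for the frequent itemsets.

```python
# Hypothetical purchase logs: each set is one customer's basket.
transactions = [
    {"barbecue sauce", "potato chips", "steak"},
    {"barbecue sauce", "potato chips", "steak", "soda"},
    {"potato chips", "soda"},
    {"barbecue sauce", "steak"},
    {"bread", "milk"},
]

def support(itemset):
    """Fraction of baskets that contain every item in `itemset`."""
    return sum(itemset <= basket for basket in transactions) / len(transactions)

def confidence(antecedent, consequent):
    """Among baskets containing `antecedent`, the fraction that also contain `consequent`."""
    return support(antecedent | consequent) / support(antecedent)

rule_from = {"barbecue sauce", "potato chips"}
rule_to = {"steak"}
print(support(rule_from | rule_to))
print(confidence(rule_from, rule_to))
```

A rule with high confidence, such as "people who buy barbecue sauce and potato chips also buy steak" in this toy data, is the kind of relation association rule learning is meant to surface.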
What is Machine Learning?
The science and art of programming computers so they can learn from data
What is semisupervised learning?
Algorithms that work with a lot of unlabeled data and a little labeled data. Typically the unlabeled data is grouped first, then the few labels are used to classify the whole collection.
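A sketch of the second half of that process: spreading a few known labels across groups that were found without labels. The cluster assignments are assumed to come from an earlier clustering step, and the photo ids, names, and `propagate` helper are hypothetical.

```python
from collections import Counter

# Assumed output of an unsupervised clustering step: photo id -> cluster id
cluster_of = {"p1": 0, "p2": 0, "p3": 0, "p4": 1, "p5": 1, "p6": 1}

# Only a few photos carry a human-provided label.
known_labels = {"p1": "Alice", "p5": "Bob"}

def propagate(cluster_of, known_labels):
    """Give every item the most common known label within its cluster."""
    votes = {}
    for item, label in known_labels.items():
        votes.setdefault(cluster_of[item], Counter())[label] += 1
    cluster_label = {c: counts.most_common(1)[0][0] for c, counts in votes.items()}
    # Items in a cluster with no labeled member get None.
    return {item: cluster_label.get(c) for item, c in cluster_of.items()}

labels = propagate(cluster_of, known_labels)
print(labels)
```

Two labels are enough to name all six photos here, because the grouping itself was learned from the unlabeled data.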
Describe a Deep Belief Network (DBN)
Unsupervised components called restricted Boltzmann machines (RBMs) stacked on top of each other. The RBMs are trained sequentially in an unsupervised manner, and then the whole system is fine-tuned using supervised techniques.
What is an Agent in Reinforcement learning?
The learning system, which can observe the environment, select and perform actions, and get rewards or penalties in return. It must then learn a policy.
What is a policy in Reinforcement learning?
A policy defines what action the agent should choose when it is in a given situation.
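A toy sketch of an agent following a policy. The corridor environment, its rewards, and the function names are invented for illustration, and the policy here is written by hand rather than learned; a reinforcement learning algorithm would have to discover it from the rewards.

```python
# Hypothetical environment: a corridor of cells 0..4.
# Reaching cell 4 gives +10; every other move costs -1.
GOAL = 4

def step(state, action):
    """Return (next_state, reward) after moving 'left' or 'right'."""
    move = 1 if action == "right" else -1
    next_state = max(0, min(GOAL, state + move))
    reward = 10 if next_state == GOAL else -1
    return next_state, reward

def run_episode(policy, state=0, max_steps=20):
    """Let the agent act according to `policy` and total up its rewards."""
    total = 0
    for _ in range(max_steps):
        if state == GOAL:
            break
        action = policy[state]  # the policy picks the action for this situation
        state, reward = step(state, action)
        total += reward
    return total

# A policy maps each situation (state) to the action to take there.
good_policy = {0: "right", 1: "right", 2: "right", 3: "right"}
bad_policy = {0: "left", 1: "left", 2: "left", 3: "left"}

print(run_episode(good_policy))
print(run_episode(bad_policy))
```

The two policies face the same environment but collect very different total rewards, which is the signal an agent uses to prefer one policy over another.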
What is batch learning?
Batch learning is when a system cannot learn incrementally and must be trained on all available data.
Describe the process of offline learning.
The system is first trained offline with batch learning, then launched into production, where it runs without learning anymore.
For predicting stock prices, which would be better and why: offline learning or online learning?
Online learning. Because the system is trained incrementally on small amounts of new stock data, it can react quickly to changes in the data.
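A minimal sketch of incremental training: a one-feature linear model updated one observation at a time with stochastic gradient descent. The class name, learning rate, and the synthetic stream (which follows y = 2x + 1) are assumptions; this is not a stock price predictor.

```python
class OnlineLinearModel:
    """Model y = w * x + b, updated one observation at a time."""

    def __init__(self, learning_rate=0.05):
        self.w = 0.0
        self.b = 0.0
        self.learning_rate = learning_rate

    def predict(self, x):
        return self.w * x + self.b

    def learn_one(self, x, y):
        # One gradient step on the squared error of this single observation.
        error = self.predict(x) - y
        self.w -= self.learning_rate * error * x
        self.b -= self.learning_rate * error

model = OnlineLinearModel()

# Hypothetical data stream: observations arrive one by one and are never stored.
stream = [(x, 2 * x + 1) for x in [0.0, 1.0, 2.0, 3.0]] * 500
for x, y in stream:
    model.learn_one(x, y)

print(model.w, model.b)
```

Each observation nudges the parameters and can then be discarded, so the model keeps adapting as new data arrives instead of being retrained from scratch on the full dataset.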