# Which machine learning algorithm should I use? Flashcards

1
Q

Dimension reduction

A

Reducing the number of variables under consideration. In many applications, the raw data have very high dimensional features and some features are redundant or irrelevant to the task. Reducing the dimensionality helps to find the true, latent relationship.

2
Q

Supervised learning

A

Supervised learning algorithms make predictions based on a set of examples.

• Classification
• Regression
• Forecasting
3
Q

PCA

A

An unsupervised dimensionality reduction method which maps the original data space into a lower-dimensional space while preserving as much information as possible. PCA finds a subspace that best preserves the data variance, with the subspace defined by the dominant eigenvectors of the data's covariance matrix.
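
A minimal NumPy sketch of this idea (the toy data and dimensions are illustrative): the dominant eigenvectors of the covariance matrix define the projection.

```python
import numpy as np

# Toy data: 100 samples in 3-D, with most of the variance in the first two axes.
rng = np.random.default_rng(0)
x = rng.normal(size=(100, 3)) * np.array([3.0, 2.0, 0.1])

# Center the data and eigendecompose its covariance matrix.
x_centered = x - x.mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(x_centered, rowvar=False))

# np.linalg.eigh returns ascending eigenvalues; take the two dominant directions.
top2 = eigvecs[:, ::-1][:, :2]

# Project into the 2-D subspace that preserves the most variance.
x_reduced = x_centered @ top2
print(x_reduced.shape)  # (100, 2)
```

In practice a library routine such as scikit-learn's `PCA` would be used instead of this hand-rolled version.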

4
Q

CheatSheet

A
5
Q

Linear SVM and kernel SVM

A

When the classes are not linearly separable, a kernel trick can be used to map the non-linearly separable space into a higher-dimensional space where the classes become linearly separable.

When most of the independent variables (features) are numeric, logistic regression and SVM should be the first try for classification.
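
A minimal NumPy illustration of the idea behind the kernel trick (the data and the feature map phi(x) = (x, x²) are made up for the example): a 1-D set that no single threshold can separate becomes linearly separable after mapping it into a higher dimension.

```python
import numpy as np

# 1-D points: class 0 sits between the class-1 points, so no single
# threshold on x separates the classes (not linearly separable in 1-D).
x = np.array([-2.0, -1.5, -0.2, 0.0, 0.3, 1.4, 2.1])
y = np.array([1, 1, 0, 0, 0, 1, 1])

# Explicit feature map phi(x) = (x, x^2); a kernel SVM applies the same
# kind of lifting implicitly via inner products in the lifted space.
phi = np.column_stack([x, x ** 2])

# In the lifted 2-D space the line x^2 = 1 separates the classes perfectly.
pred = (phi[:, 1] > 1.0).astype(int)
print((pred == y).all())  # True
```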

6
Q

Unsupervised: Clustering

A
7
Q

Factors to consider in ML algorithm

A
• The size, quality, and nature of data.
• The available computational time.
• The urgency of the task.
• What you want to do with the data.
8
Q

Supervised: Classification

A
9
Q

SVD

A
• SVD is also widely used as a topic modeling tool, known as latent semantic analysis, in natural language processing (NLP).
• SVD of a user-versus-movie matrix is able to extract the user profiles and movie profiles, which can be used in a recommendation system.
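
A small NumPy sketch of the recommendation-system use (the rating matrix is made up): a truncated SVD factors a user-versus-movie matrix into user profiles and movie profiles.

```python
import numpy as np

# Hypothetical user-versus-movie rating matrix (rows: users, cols: movies).
# Users 0-1 like the first two movies; users 2-3 like the last two.
ratings = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
], dtype=float)

# Truncated SVD: keep only the top-2 singular directions as latent "tastes".
u, s, vt = np.linalg.svd(ratings, full_matrices=False)
k = 2
user_profiles = u[:, :k] * s[:k]   # each row: a user's weight on the 2 tastes
movie_profiles = vt[:k, :]         # each column: a movie's weight on the tastes

# The rank-2 product already approximates the full rating matrix closely.
approx = user_profiles @ movie_profiles
print(np.abs(approx - ratings).max())  # 0.5
```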
10
Q

Classification

A

When the data are being used to predict a categorical variable.

11
Q

DBSCAN

A

When the number of clusters k is not given, DBSCAN (density-based spatial clustering) can be used by connecting samples through density diffusion.
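
A minimal from-scratch sketch of the density-diffusion idea (the parameter values and data are illustrative; in practice a library implementation such as scikit-learn's `DBSCAN` would be used):

```python
import numpy as np
from collections import deque

def dbscan(points, eps=0.5, min_samples=3):
    """Minimal DBSCAN sketch: grow clusters by density diffusion from core points."""
    n = len(points)
    dist = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
    neighbors = [np.flatnonzero(dist[i] <= eps) for i in range(n)]
    core = [len(nb) >= min_samples for nb in neighbors]   # dense points
    labels = np.full(n, -1)                               # -1 marks noise
    cluster = 0
    for i in range(n):
        if labels[i] != -1 or not core[i]:
            continue
        labels[i] = cluster
        queue = deque([i])            # breadth-first density expansion
        while queue:
            j = queue.popleft()
            for k in neighbors[j]:
                if labels[k] == -1:
                    labels[k] = cluster
                    if core[k]:       # only core points keep diffusing;
                        queue.append(k)  # border points join but don't expand
        cluster += 1
    return labels

# Two tight blobs plus one isolated outlier: k is never specified.
rng = np.random.default_rng(0)
blob_a = rng.normal(0.0, 0.1, size=(20, 2))
blob_b = rng.normal(5.0, 0.1, size=(20, 2))
outlier = np.array([[2.5, 2.5]])
labels = dbscan(np.vstack([blob_a, blob_b, outlier]), eps=0.5, min_samples=3)
print(len(set(labels[:40])), labels[-1])  # 2 clusters found; outlier is noise (-1)
```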

12
Q

Regression

A

When predicting continuous values.

13
Q

Hierarchical result

A

Use hierarchical clustering.

14
Q

Semi-supervised learning

A

Use unlabeled examples with a small amount of labeled data to improve the learning accuracy.

15
Q

When trying to solve a new ML problem what are the three steps?

A

1. Define the problem. What problems do you want to solve?
2. Start simple. Be familiar with the data and the baseline results.
3. Then try something more complicated.
16
Q

Why we need PCA, SVD and LDA

A

We generally do not want to feed a large number of features directly into a machine learning algorithm, since some features may be irrelevant or the “intrinsic” dimensionality may be smaller than the number of features.

17
Q

Supervised: Regression

A
18
Q

Neural networks and deep learning

A
• A neural network consists of three parts: an input layer, hidden layers, and an output layer.
• The number of hidden layers defines the model complexity and modeling capacity.
• If the output layer is a categorical variable, the network addresses classification problems.
• If the output layer is a continuous variable, the network can be used for regression.
• If the output layer is the same as the input layer, the network can be used to extract intrinsic features (an autoencoder).
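
A minimal NumPy forward pass illustrating these points (the weights are random and untrained, and biases are omitted for brevity): the same hidden layer serves classification or regression depending on how the output layer is interpreted.

```python
import numpy as np

rng = np.random.default_rng(0)

def forward(x, w_hidden, w_out):
    """One hidden layer with a ReLU, then a linear output layer."""
    h = np.maximum(0.0, x @ w_hidden)   # hidden layer: the modeling capacity
    return h @ w_out                    # raw outputs (logits or regression values)

x = rng.normal(size=(4, 3))             # 4 samples, 3 input features
w_hidden = rng.normal(size=(3, 8))      # input layer -> 8 hidden units
w_out = rng.normal(size=(8, 2))         # hidden layer -> 2 outputs

logits = forward(x, w_hidden, w_out)

# Classification reading: a softmax over the output layer gives class
# probabilities; a regression reading would use the raw outputs directly.
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
print(probs.shape)  # (4, 2); each row sums to 1
```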
19
Q

What are [1], [2], [3], and [4]?

A

[1] Unsupervised: Dimensionality Reduction

[2] Unsupervised: Clustering

[3] Supervised: Regression

[4] Supervised: Classification

20
Q

Considerations when choosing an algorithm

A
• Accuracy (Phase III)
• Training time (Phase II)
• Ease of use (Phase I)
21
Q

What are PCA, SVD and LDA

A

Principal component analysis (PCA)

Singular value decomposition (SVD)

Latent Dirichlet allocation (LDA)

22
Q

Hierarchical clustering

A

Hierarchical partitions can be visualized using a tree structure (a dendrogram). Hierarchical clustering does not need the number of clusters as an input, and the partitions can be viewed at different levels of granularity by cutting the dendrogram at different heights (i.e., clusters can be refined or coarsened).
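
A small sketch of cutting the same dendrogram at two granularities, assuming SciPy is available (the points are made up):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Six points forming two well-separated groups.
points = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                   [5.0, 5.0], [5.1, 5.2], [5.2, 5.1]])

# Build the merge tree (the dendrogram) once.
tree = linkage(points, method="average")

# Cut the same tree at different granularities: no need to fix k up front.
coarse = fcluster(tree, t=2, criterion="maxclust")   # 2 coarse clusters
fine = fcluster(tree, t=4, criterion="maxclust")     # refined into 4
print(len(set(coarse)), len(set(fine)))  # 2 4
```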

23
Q

Perform dimension reduction​

A

Principal component analysis

24
Q

k-means, k-modes, and GMM (Gaussian mixture model) clustering

A
• Clustering aims to partition n observations into k clusters.
• K-means defines a hard assignment: each sample is associated with one and only one cluster.
• GMM defines a soft assignment: each sample has a probability of being associated with each cluster.
• Both algorithms are simple and fast enough for clustering when the number of clusters k is given.
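
A from-scratch NumPy sketch of k-means' hard assignment (data and parameters are illustrative; a GMM would replace the `argmin` with per-cluster posterior probabilities):

```python
import numpy as np

def kmeans(x, k, iters=20, seed=0):
    """Minimal k-means sketch: hard assignment, then centroid update."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)]  # random init
    for _ in range(iters):
        # Hard assignment: each sample belongs to exactly one cluster.
        dist = np.linalg.norm(x[:, None] - centers[None, :], axis=-1)
        labels = dist.argmin(axis=1)
        # Move each centroid to the mean of its assigned samples.
        for j in range(k):
            if (labels == j).any():
                centers[j] = x[labels == j].mean(axis=0)
    return labels, centers

# Two well-separated blobs; k = 2 is given, as the card notes.
rng = np.random.default_rng(1)
x = np.vstack([rng.normal(0, 0.2, (30, 2)), rng.normal(4, 0.2, (30, 2))])
labels, centers = kmeans(x, k=2)
print(len(set(labels)))  # 2
```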
25
Q

Linear regression

and

Logistic regression

A
26
Q

LDA, GMM, and NLP

A
• LDA is a probabilistic topic model: it decomposes documents into topics in a similar way as a Gaussian mixture model (GMM) decomposes continuous data into Gaussian densities.
• Unlike the GMM, LDA models discrete data (words in documents), and it constrains the topics to be a priori distributed according to a Dirichlet distribution.
27
Q

Forecasting

A

Making predictions about the future based on the past and present data.

28
Q

Reinforcement learning

A

Reinforcement learning analyzes and optimizes the behavior of an agent based on the feedback from the environment. Machines try different scenarios to discover which actions yield the greatest reward, rather than being told which actions to take. Trial-and-error and delayed reward distinguish reinforcement learning from other techniques.

29
Q

Unsupervised learning

A

The machine is presented with totally unlabeled data and asked to discover the intrinsic patterns that underlie the data, such as:

• a clustering structure,
• a low-dimensional manifold, or
• a sparse tree or graph.
30
Q

Clustering

A

Grouping a set of data examples so that examples in one group (or one cluster) are more similar to one another (according to some criteria) than to those in other groups.

31
Q

Unsupervised: Dimensionality reduction

A
32
Q

Trees and ensemble trees

A
• Subdivide the feature space into regions with mostly the same label.
• Random forest and gradient boosting are two popular ways to use tree algorithms to achieve good accuracy and to overcome the over-fitting problem.
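
A minimal sketch of the region-splitting idea behind a single tree node (a depth-1 "decision stump" on one feature; the data are made up): it searches for the threshold whose two regions are purest.

```python
import numpy as np

def best_split(x, y):
    """Find the threshold on one feature that best separates the labels,
    i.e. whose two regions have the fewest misclassified samples under
    majority-label prediction."""
    best_t, best_err = None, np.inf
    for t in np.unique(x):
        left, right = y[x <= t], y[x > t]
        # Each region predicts its majority label; count the minority.
        err = min((left == 0).sum(), (left == 1).sum()) \
            + min((right == 0).sum(), (right == 1).sum())
        if err < best_err:
            best_t, best_err = t, err
    return best_t, best_err

x = np.array([0.5, 1.0, 1.5, 4.0, 4.5, 5.0])
y = np.array([0, 0, 0, 1, 1, 1])
t, err = best_split(x, y)
print(t, err)  # 1.5 0
```

A full tree applies this recursively to each region; random forest and gradient boosting then combine many such trees.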
33
Q

Numeric prediction quickly​

A

• Decision trees
• Linear regression