Intro ML Flashcards
(6 cards)
1
Q
Machine Learning, definition
A
A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E. Machine learning searches for mappings which generalize.
2
Q
Machine Learning, formulation
A
- Learn: input data x^{(i)} from dataset
- Predict (sometimes called inference): output y^{(i)}, here binary classification
Depends on w (parameter to learn/fit) and dataset D = {(x^{(i)}, y^{(i)})}^N_{i=1}.
3
Q
Machine Learning algorithms characterisation
A
- Available annotated data (supervised vs. unsupervised)
- Complexity of model (linear vs. non-linear)
- Structure of output (independent vs. structured)
- Modeling of data: input and label (x^{(i)}, y^{(i)}) vs. label only y^{(i)} (generative vs. discriminative); generative models learn from both input and output
4
Q
Nearest neighbor, definition
A
Dataset: D = {(x^{(i)}, y^{(i)})}^N_{i=1}
y = y^{(k)} where
k = arg min_{i∈{1,…,N}} ∥x^{(i)} − x∥_2^2 = arg min_{i∈{1,…,N}} d(x^{(i)}, x)
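A minimal sketch of the rule above, assuming the dataset is a plain list of (x, y) pairs and d is the squared Euclidean distance. The names squared_distance, nearest_neighbor_predict and the toy data are illustrative only, not part of the original card.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def nearest_neighbor_predict(dataset, x):
    """Return y^(k), where k indexes the training point closest to x.

    dataset: list of (x_i, y_i) pairs (assumed layout for this sketch).
    """
    _, label = min(dataset, key=lambda pair: squared_distance(pair[0], x))
    return label


# Toy binary-classification data (made up for illustration).
toy_data = [((0.0, 0.0), 0), ((1.0, 1.0), 0), ((5.0, 5.0), 1)]
prediction = nearest_neighbor_predict(toy_data, (4.0, 4.5))
```

Minimizing the squared distance gives the same k as minimizing the distance itself, so the square root can be skipped.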
5
Q
Nearest neighbor, shortcomings
A
- Sensitive to outliers → improve by using multiple neighbors (k-nearest neighbors) with a (weighted) majority vote
- Defining distance (categorical / text data)
- Can be slow at prediction (scales with # of datapoints & dimensions)
- Storage can be costly (the full dataset must be kept)
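A rough sketch of the k-nearest-neighbor fix for outlier sensitivity, assuming an unweighted majority vote and squared Euclidean distance. The function name knn_predict, the tie-breaking behavior, and the toy data (including the deliberately mislabeled point) are assumptions made for illustration.

```python
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def knn_predict(dataset, x, k=3):
    """Majority vote over the labels of the k training points closest to x.

    dataset: list of (x_i, y_i) pairs (assumed layout for this sketch).
    Ties go to the label seen first among the sorted neighbors, which is
    a side effect of Counter ordering rather than a principled choice.
    """
    # Sorting all N points is the prediction-time cost noted in the card.
    neighbors = sorted(dataset, key=lambda pair: squared_distance(pair[0], x))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Toy data: a class-0 cluster containing one outlier labeled 1.
toy_data = [
    ((0.0, 0.0), 0), ((0.0, 1.0), 0), ((1.0, 0.0), 0),
    ((0.4, 0.4), 1),  # outlier
    ((5.0, 5.0), 1), ((5.0, 6.0), 1),
]
query = (0.5, 0.5)
with_one_neighbor = knn_predict(toy_data, query, k=1)
with_three_neighbors = knn_predict(toy_data, query, k=3)
```

With k=1 the outlier alone decides the label; with k=3 the two surrounding class-0 points outvote it.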
6
Q
Nearest neighbor, applications
A
- recommendation systems
- cluster analysis
- spell check