Intro ML Flashcards

Question 1

Q

Machine Learning, definition

Answer

A

A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E. Machine learning searches for mappings which generalize.

Question 2

Q

Machine Learning, formulation

Answer

A

Learn: from the input data x^{(i)} in the dataset
Predict: output y^{(i)}, here binary classification (prediction is sometimes called inference)
Depends on w (the parameters to learn/fit) and the dataset D = {(x^{(i)}, y^{(i)})}_{i=1}^N.
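
A minimal sketch of the learn/predict split for binary classification. The card does not name a model, so the linear score w . x, the labels in {-1, +1}, the perceptron update rule, and the names `learn` and `predict` are all assumptions chosen for illustration.

```python
def predict(w, x):
    """Inference: label in {-1, +1} from the sign of the linear score w . x."""
    score = sum(wj * xj for wj, xj in zip(w, x))
    return 1 if score >= 0 else -1


def learn(dataset, epochs=10):
    """Learning: fit w on D = {(x_i, y_i)} with the perceptron update (assumed model)."""
    dim = len(dataset[0][0])
    w = [0.0] * dim
    for _ in range(epochs):
        for x, y in dataset:
            if predict(w, x) != y:
                # Misclassified example: move w in the direction of y * x.
                w = [wj + y * xj for wj, xj in zip(w, x)]
    return w


# Toy dataset (made up); the last feature is a constant 1 acting as a bias term.
D = [([2.0, 1.0, 1.0], 1), ([1.5, 2.0, 1.0], 1),
     ([-1.0, -1.5, 1.0], -1), ([-2.0, -0.5, 1.0], -1)]

w = learn(D)                                  # learning uses the whole dataset
predictions = [predict(w, x) for x, _ in D]   # prediction uses only w and x
```

Once w is fitted, prediction no longer needs the dataset. This contrasts with nearest neighbor, which keeps D around at prediction time.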

Question 3

Q

Machine Learning algorithms characterisation

Answer

A

Availability of annotated data (supervised vs. unsupervised)
Complexity of model (linear vs. non-linear)
Structure of output (independent vs. structured)
Modeling of data: joint (x^{(i)}, y^{(i)}) vs. label only y^{(i)} (generative vs. discriminative); generative models learn from both input and output

Question 4

Q

Nearest neighbor, definition

Answer

A

Dataset: D = {(x^{(i)}, y^{(i)})}_{i=1}^N
Prediction for a query x: y = y^{(k)}, where
k = arg min_{i∈{1,…,N}} ∥x^{(i)} − x∥_2^2 = arg min_{i∈{1,…,N}} d(x^{(i)}, x)
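
The rule above can be sketched in a few lines: compute the squared Euclidean distance from the query to every stored point and return the label of the closest one. The function names and the toy data are illustrative assumptions.

```python
def squared_distance(a, b):
    """Squared Euclidean distance ||a - b||_2^2 between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def nearest_neighbor(dataset, x):
    """Return y^(k) where k is the index of the stored point closest to x."""
    k = min(range(len(dataset)),
            key=lambda i: squared_distance(dataset[i][0], x))
    return dataset[k][1]


# Toy dataset (made up) of (x, y) pairs.
D = [([0.0, 0.0], "red"), ([1.0, 1.0], "red"), ([5.0, 5.0], "blue")]
label = nearest_neighbor(D, [4.0, 4.5])
```

There is no training step: all the work happens at prediction time, with one distance computation per stored point.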

Question 5

Q

Nearest neighbor, shortcomings

Answer

A

Sensitive to outliers → improve by using multiple neighbors (k-nearest neighbors) with a (weighted) majority vote
Defining a distance is not obvious for categorical / text data
Can be slow at prediction (cost grows with the number of data points and dimensions)
Storage can be demanding (the whole dataset must be kept)
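
A hedged sketch of the k-nearest-neighbor fix for outlier sensitivity: take the k closest points and let them vote, optionally weighting each vote by inverse squared distance. The weighting scheme, the function names, and the toy data (including the deliberately mislabeled point) are assumptions for illustration.

```python
from collections import defaultdict


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def knn_predict(dataset, x, k=3, weighted=False):
    """Majority vote among the k nearest points; optionally distance-weighted."""
    neighbors = sorted(dataset,
                       key=lambda pair: squared_distance(pair[0], x))[:k]
    votes = defaultdict(float)
    for xi, yi in neighbors:
        d = squared_distance(xi, x)
        # Inverse-distance weight (assumed scheme); small constant avoids 1/0.
        votes[yi] += 1.0 / (d + 1e-9) if weighted else 1.0
    return max(votes, key=votes.get)


# Toy dataset (made up); the last point is a mislabeled outlier among the reds.
D = [([0.0, 0.0], "red"), ([0.5, 0.0], "red"), ([0.0, 0.5], "red"),
     ([5.0, 5.0], "blue"), ([5.5, 5.0], "blue"),
     ([0.2, 0.2], "blue")]

query = [0.25, 0.25]
one_nn = knn_predict(D, query, k=1)    # follows the single closest point
three_nn = knn_predict(D, query, k=3)  # outvoted by the surrounding reds
```

With k=1 the outlier decides the label, while with k=3 its vote is outnumbered. The sort over all points also makes the prediction cost visible: it scales with both the number of stored points and their dimension.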

Question 6

Q

Nearest neighbor, applications

Answer

A

recommendation systems
cluster analysis
spell check
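
As one illustration of the spell-check application, a nearest-neighbor lookup can suggest the vocabulary word closest to a misspelled input. The sketch below assumes Levenshtein edit distance as the distance d and uses a tiny made-up vocabulary; real spell checkers may use other distances and data structures.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def suggest(word, vocabulary):
    """Nearest neighbor of `word` in the vocabulary under edit distance."""
    return min(vocabulary, key=lambda candidate: edit_distance(word, candidate))


# Hypothetical vocabulary for illustration only.
vocabulary = ["learning", "machine", "neighbor", "nearest"]
suggestion = suggest("neihgbor", vocabulary)
```

This also shows one way to define a distance on text data, which plain Euclidean distance does not cover.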

(6 cards)