Intro ML Flashcards

(6 cards)

1
Q

Machine Learning, definition

A

A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E. Machine learning searches for mappings from inputs to outputs that generalize to unseen data.

2
Q

Machine Learning, formulation

A
  • Learn: from input data x^{(i)} in the dataset
  • Predict: output y^{(i)}, here binary classification (prediction is sometimes called inference)
    Depends on w (parameters to learn/fit) and the dataset D = {(x^{(i)}, y^{(i)})}^N_{i=1}.
3
Q

Machine Learning algorithms characterisation

A
  • Available annotated data (supervised vs. unsupervised)
  • Complexity of model (linear vs. non-linear)
  • Structure of output (independent vs. structured)
  • Modeling of data: pairs (x^{(i)}, y^{(i)}) vs. label only y^{(i)} (generative vs. discriminative); generative models learn from both input and output
4
Q

Nearest neighbor, definition

A

Dataset: D = {(x^{(i)}, y^{(i)})}^N_{i=1}
y = y^{(k)} where
k = arg min_{i∈{1,…,N}} ∥x^{(i)} − x∥_2^2 = arg min_{i∈{1,…,N}} d(x^{(i)}, x)
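
The rule above can be sketched in a few lines of plain Python. This is a minimal illustration rather than a reference implementation: the function names, the toy dataset, and the query point are all assumptions made for the example, and the distance d is taken to be the squared Euclidean distance from the formula.

```python
from typing import Sequence, Tuple

Point = Sequence[float]


def squared_euclidean(a: Point, b: Point) -> float:
    """Squared Euclidean distance ||a - b||_2^2."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def nearest_neighbor_predict(dataset: Sequence[Tuple[Point, int]], x: Point) -> int:
    """Return y^{(k)}, where k = arg min_i ||x^{(i)} - x||_2^2."""
    if not dataset:
        raise ValueError("dataset must contain at least one (x, y) pair")
    # min() scans all N training points and keeps the closest one.
    _, label = min(dataset, key=lambda pair: squared_euclidean(pair[0], x))
    return label


# Hypothetical toy dataset D = {(x^{(i)}, y^{(i)})} with binary labels.
toy_dataset = [((0.0, 0.0), 0), ((1.0, 1.0), 1), ((0.9, 0.2), 0)]
prediction = nearest_neighbor_predict(toy_dataset, (0.8, 0.9))
print(prediction)  # closest training point is (1.0, 1.0), so the label is 1
```

There is no training step: the "model" is the stored dataset, and every prediction scans all N points.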

5
Q

Nearest neighbor, shortcomings

A
  • Sensitive to outliers → improve by using multiple neighbors (k-nearest neighbors) with a (weighted) majority vote
  • Defining a distance can be hard (categorical / text data)
  • Can be slow at prediction (cost grows with the number of data points and dimensions)
  • Storage can be difficult (the whole training set must be kept)
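
The first shortcoming and its fix can be illustrated with a small k-nearest-neighbor sketch. Everything here is an assumption made for the example: the function name `knn_predict`, the inverse-distance weighting scheme, and the toy data containing one mislabeled point.

```python
import math
from collections import defaultdict


def knn_predict(dataset, x, k=3, weighted=False):
    """Predict by (optionally distance-weighted) majority vote over the k nearest points."""
    if k < 1 or k > len(dataset):
        raise ValueError("k must be between 1 and the dataset size")
    # Sort by Euclidean distance only, so labels are never compared.
    neighbors = sorted(
        ((math.dist(xi, x), yi) for xi, yi in dataset),
        key=lambda pair: pair[0],
    )[:k]
    votes = defaultdict(float)
    for distance, label in neighbors:
        # Assumed weighting: closer neighbors count more (inverse distance).
        votes[label] += 1.0 / (distance + 1e-12) if weighted else 1.0
    # On a tie, the label seen first (the nearer neighbor's) wins.
    return max(votes, key=votes.get)


# Hypothetical data: class "a" near the origin, class "b" near (1, 1),
# plus one outlier labeled "b" sitting inside the "a" cluster.
noisy_dataset = [
    ((0.0, 0.0), "a"), ((0.2, 0.0), "a"), ((0.0, 0.2), "a"),
    ((1.0, 1.0), "b"), ((1.2, 1.0), "b"), ((1.0, 1.2), "b"),
    ((0.1, 0.1), "b"),  # outlier
]
query = (0.12, 0.1)
one_nn = knn_predict(noisy_dataset, query, k=1)
three_nn = knn_predict(noisy_dataset, query, k=3)
print(one_nn, three_nn)
```

With k=1 the query inherits the outlier's label, while with k=3 the two surrounding "a" points outvote it. The sketch still scans and stores every data point, so the speed and storage shortcomings remain.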
6
Q

Nearest neighbor, applications

A
  • recommendation systems
  • cluster analysis
  • spell check
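
As one possible reading of the spell-check application, the nearest neighbor of a misspelled word can be searched for in a vocabulary under an edit distance. This is only a sketch under stated assumptions: the choice of Levenshtein distance, the function names, and the tiny vocabulary are all hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of insertions, deletions, substitutions."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                         # delete char_a
                current[j - 1] + 1,                      # insert char_b
                previous[j - 1] + (char_a != char_b),    # substitute or match
            ))
        previous = current
    return previous[-1]


def suggest(word: str, vocabulary: list) -> str:
    """Return the vocabulary entry nearest to `word` under edit distance."""
    return min(vocabulary, key=lambda candidate: edit_distance(candidate, word))


# Hypothetical vocabulary for illustration.
vocabulary = ["input", "output", "outlier"]
suggestion = suggest("ouput", vocabulary)
print(suggestion)  # "output" is one insertion away
```

The same nearest-neighbor rule applies as for numeric data; only the distance d changes, which is one way to handle text inputs.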