Machine Learning Flashcards

1
Q

What is a feature vector?

A

An ordered list of numeric features describing one example, where the features are chosen from the data for the particular task at hand.
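As a minimal sketch of the idea, the snippet below turns one raw record into a feature vector. The record, its field names, and the helper `to_feature_vector` are all hypothetical and only there for illustration.

```python
# Hypothetical raw record; the field names are made up for illustration.
house = {"area_sqft": 1200.0, "bedrooms": 3, "age_years": 15, "owner_name": "n/a"}

# Features chosen for a price-prediction task (an assumed task);
# owner_name is left out because it is not useful for that task.
FEATURES = ["area_sqft", "bedrooms", "age_years"]


def to_feature_vector(record, features=FEATURES):
    """Return the chosen features as a fixed-order list of floats."""
    return [float(record[name]) for name in features]


vector = to_feature_vector(house)
print(vector)  # [1200.0, 3.0, 15.0]
```

The fixed order matters: position 0 always means the same feature for every example, which is what lets algorithms compare examples as points in space.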

2
Q

What are 3 areas of ML?

A

Classification: Assign items to named (predefined) groups
Clustering: Group similar items together without predefined names
Regression: Fit a line or curve to predict a continuous value

The first two look similar, but the techniques behind them are quite different.

3
Q

What are some of the classifiers?

A

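k-nearest neighbor: Labels a point by a vote among the k training points closest to it in the feature space. (N-dimensional if N features)

Support vector machine (SVM): Draws a separating boundary (a line in 2-D, a hyperplane in general) that splits the classes with the widest possible margin.

Decision tree: Classifies by asking a sequence of questions about the features, e.g. handwriting recognition based on the hollows and circles in a character. (Boosted decision trees add new trees one at a time, each trained to correct the errors of the trees before it.)

Logistic regression: Combines the features in a weighted sum, e.g. a(Feature one) + b(Feature two) + c, and passes it through the sigmoid function to get a class probability. A reasonable choice when there are few features and high precision is not critical.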
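
To make the first classifier in this list concrete, here is a minimal k-nearest-neighbor sketch in plain Python. The function name `knn_classify`, the toy points, and the labels "small" and "large" are assumptions for illustration; it uses Euclidean distance and a simple majority vote, which is one common choice rather than the only one.

```python
import math
from collections import Counter


def knn_classify(train_points, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    # Pair every training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    # Count the labels of the k closest points and return the most common one.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Toy 2-feature data set (made up): two well-separated groups.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (7.0, 6.5), (6.5, 7.0)]
labels = ["small", "small", "small", "large", "large", "large"]

print(knn_classify(points, labels, (1.8, 1.5)))  # small
print(knn_classify(points, labels, (6.2, 6.4)))  # large
```

There is no training step here: the "model" is just the stored data, and all the work happens at query time.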

4
Q

What are supervised and unsupervised learning?

A

Teaching the computer how to do something vs. letting it figure out structure from the data on its own.

In supervised learning, the right answers (labels) are given; classification and regression belong here. In unsupervised learning, no answers are given and the algorithm tries to make sense of the data; clustering belongs here (e.g. social network analysis, market segmentation, news clustering).

5
Q

What is the cost function in linear regression?

A

In linear regression we want to fit a straight line to the data, say y = Theta0 + Theta1 * x.
The cost function is the function that is minimized to find the best estimates of Theta0 and Theta1, and therefore the best-fitting line.

The squared-error function is popular: J(Theta0, Theta1) = 1/(2m) * Sum((estimate - actual)^2), summed over the m training examples. Plotted against the Theta values it forms a bowl-shaped curve, and the best fit sits at its minimum.

The gradient descent algorithm for linear regression repeatedly moves the Thetas a small step in the direction that reduces the cost, and thereby reaches the minimum.
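
The cost function and gradient descent described in this answer can be sketched in plain Python as follows. The function names, the learning rate `alpha`, the step count, and the toy data are assumptions chosen for illustration; the update rule is the standard batch gradient step for the squared-error cost.

```python
def predict(theta0, theta1, x):
    """The line being fitted: y = Theta0 + Theta1 * x."""
    return theta0 + theta1 * x


def cost(theta0, theta1, xs, ys):
    """Squared-error cost: 1/(2m) * Sum((estimate - actual)^2)."""
    m = len(xs)
    return sum((predict(theta0, theta1, x) - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


def gradient_descent(xs, ys, alpha=0.05, steps=5000):
    """Repeatedly step both Thetas against the gradient of the cost."""
    theta0, theta1 = 0.0, 0.0
    m = len(xs)
    for _ in range(steps):
        errors = [predict(theta0, theta1, x) - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m                             # dJ/dTheta0
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m  # dJ/dTheta1
        # Update both parameters simultaneously.
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1


# Toy data generated from y = 1 + 2x, so the fit should recover those values.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

theta0, theta1 = gradient_descent(xs, ys)
print(cost(0.0, 0.0, xs, ys))                # 20.5 (cost at the starting point)
print(round(theta0, 3), round(theta1, 3))    # 1.0 2.0
```

If `alpha` is set too large the steps overshoot and the cost grows instead of shrinking; too small and many more steps are needed to reach the minimum.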
