Classifiers Flashcards
(32 cards)
What is the difference between classification and regression?
Classification predicts which discrete category an input belongs to, whereas regression predicts a continuous value.
What type of learning is classification?
Supervised
What technique was used in the first classifiers?
Logistic regression, with x as continuous data and y as binary data
How do you fit a logistic regression using sklearn?
log_reg.fit(x_train, y_train)
How do you find the score of a logistic regression using sklearn?
log_reg.score(x_test, y_test)
How is score calculated for logistic regression?
The mean accuracy on the given test data and labels
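The fit and score cards above can be tried end to end with a minimal sketch like the one below. The synthetic dataset from make_classification, the train/test split and the variable names accuracy and manual_accuracy are assumptions made for illustration; only log_reg.fit and log_reg.score come from the cards.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic data (an assumption for illustration): continuous x, binary y
x, y = make_classification(n_samples=200, n_features=4, random_state=0)
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.25, random_state=0
)

log_reg = LogisticRegression()
log_reg.fit(x_train, y_train)

# score returns the mean accuracy on the test data and labels
accuracy = log_reg.score(x_test, y_test)

# The same number computed by hand: fraction of predictions matching y_test
manual_accuracy = (log_reg.predict(x_test) == y_test).mean()

print(accuracy, manual_accuracy)
```

Both printed values should agree, since score for a classifier is simply the share of correctly labelled test samples.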
What is a perceptron?
A perceptron is a function that aims to draw a line or plane to separate two categories of data.
What does a perceptron aim to learn?
It aims to learn the weights that allow it to best classify the data. This is the same as learning the coefficients of the separating line.
How is a perceptron trained?
Weights initialised as 0
Cycle through the data
For each x, try classifying it: ŷ = f(wx + b)
Update w: w = w + α(y - ŷ)x, where y is the true label, ŷ is the prediction and α is the learning rate
If the prediction is correct, w is not updated
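The training steps above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: the step activation, the bias update, the learning rate of 1.0, the fixed number of epochs and the AND-gate data are all assumptions chosen to keep the example small.

```python
import numpy as np

def step(z):
    # Assumed activation f: output 1 when z >= 0, otherwise 0
    return 1 if z >= 0 else 0

def train_perceptron(X, y, alpha=1.0, epochs=20):
    w = np.zeros(X.shape[1])  # weights initialised as 0
    b = 0.0                   # bias also starts at 0 (assumption)
    for _ in range(epochs):            # cycle through the data
        for x_i, y_i in zip(X, y):
            y_hat = step(np.dot(w, x_i) + b)  # try classifying x_i
            error = y_i - y_hat               # 0 when the prediction is correct
            w += alpha * error * x_i          # w = w + alpha * (y - y_hat) * x
            b += alpha * error                # bias updated the same way (assumption)
    return w, b

# AND gate: linearly separable, so a perceptron can learn it
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 0, 0, 1])

w, b = train_perceptron(X, y)
predictions = [step(np.dot(w, x_i) + b) for x_i in X]
print(predictions)
```

Because the error term is zero for a correct prediction, the update line leaves w unchanged in that case without needing a separate check.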
What is a perceptron’s main weakness?
It relies on the data being linearly separable. If a straight line (or plane) can’t be drawn to separate the data, then a perceptron won’t work.
What is a Multi-Layer Perceptron?
The simplest type of Neural Network that contains hidden layers made up of a number of perceptrons.
What is the advantage of an MLP?
MLPs don’t require the data to be linearly separable, so they are more powerful.
What technique is used to fit an ML model?
Gradient Descent
What are the two types of parameter fitting method?
Deterministic and Stochastic
How is error calculated?
Error is calculated using either the L1 or the L2 norm.
What is L1 Norm?
The sum of the absolute errors
What is L2 Norm?
The root of the sum of squared errors
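As a small worked example of the two norms, the sketch below computes both for a made-up set of predictions. The arrays y_true and y_pred are illustrative values, not data from the deck.

```python
import numpy as np

# Illustrative values (assumed): true targets and model predictions
y_true = np.array([3.0, 5.0, 2.0])
y_pred = np.array([2.0, 5.0, 4.0])

errors = y_true - y_pred  # [1.0, 0.0, -2.0]

# L1 norm: the sum of the absolute errors
l1 = np.sum(np.abs(errors))

# L2 norm: the root of the sum of squared errors
l2 = np.sqrt(np.sum(errors ** 2))

print(l1, l2)
```

Here the L1 norm is 1 + 0 + 2 = 3 and the L2 norm is the square root of 1 + 0 + 4 = 5, so the larger error contributes proportionally more to L2 than to L1.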
What are other names for the error?
The cost function or loss function
How does gradient descent work?
Start at random point p1
calculate loss at p1
take a step in the direction in which the loss decreases most steeply (the negative of the gradient)
calculate loss at new point
repeat until the loss gradient is less than a threshold or until N steps
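The loop described above might look like the following sketch for a one-dimensional problem. The quadratic loss, the step size, the threshold and the fixed starting point (used instead of a random one so the run is reproducible) are all assumptions for illustration.

```python
def loss(p):
    # Assumed toy loss with its minimum at p = 3
    return (p - 3.0) ** 2

def grad(p):
    # Derivative of the toy loss above
    return 2.0 * (p - 3.0)

def gradient_descent(p_start, step_size=0.1, threshold=1e-6, max_steps=1000):
    p = p_start
    for _ in range(max_steps):        # stop after at most N steps
        g = grad(p)
        if abs(g) < threshold:        # or once the gradient is small enough
            break
        p = p - step_size * g         # step against the gradient (downhill)
    return p

p_final = gradient_descent(p_start=10.0)
print(p_final, loss(p_final))
```

The minus sign in the update is the important detail: the gradient points uphill, so moving against it is what reduces the loss.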
What is a confusion matrix?
A table of predicted values against actual values, showing the counts of true/false positives and true/false negatives.
How is precision calculated?
True Positive / (True Positive + False Positive)
How does high precision present?
An example labelled as positive is likely to be truly positive (small number of false positives)
How is recall calculated?
TP / (TP + FN)
How does high recall present?
Most actual members of a class are correctly recognised, so there is a small number of false negatives
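To make the confusion matrix, precision and recall cards concrete, here is a minimal sketch that counts the four outcomes by hand. The label lists y_true and y_pred are made-up values for illustration, with 1 as the positive class.

```python
# Illustrative labels (assumed): 1 = positive, 0 = negative
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1, 0, 1]

pairs = list(zip(y_true, y_pred))

# The four cells of a binary confusion matrix
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

precision = tp / (tp + fp)  # how many predicted positives are really positive
recall = tp / (tp + fn)     # how many real positives were found

print(tp, fp, fn, tn, precision, recall)
```

With these labels there are 3 true positives, 2 false positives and 1 false negative, giving a precision of 3/5 and a recall of 3/4. The parentheses in the denominators matter: without them, TP/TP+FN would evaluate to 1 + FN.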