Classifier Models Flashcards

Question 1

Q

What are classification models?

Answer

A

Classification models assign observations to discrete labels (e.g. gender, revenue bounds).

Question 2

Q

How do classification models create predictions?

Answer

A

By separating observations with a decision boundary (such as a plane) or by grouping them into clusters, so that each observation can be assigned a label

Question 3

Q

Is classification an unsupervised or supervised learning task?

Answer

A

Supervised

Question 4

Q

Logistic regression is useful when we have [continuous/discrete/binary] observations and [continuous/discrete/binary] labels.

Answer

A

Continuous, binary

Question 5

Q

Logistic regression maps continuous observations onto a curve with only […] label values.

Answer

A

Two/binary

Question 6

Q

By adjusting the parameters of a logistic regression, we adjust…

Answer

A

The steepness and position of the ‘middle curve’ in our sigmoid function
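
A minimal sketch of this idea for a single feature, assuming the usual logistic form sigmoid(w * x + b). The names `sigmoid`, `predict_proba`, `w` and `b` are illustrative and not taken from the cards.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function: squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(x: float, w: float, b: float) -> float:
    """Estimated probability of label 1 for a one-feature logistic regression."""
    return sigmoid(w * x + b)


# Same input, different weights: a larger weight makes the curve steeper,
# so the output moves away from 0.5 more quickly.
shallow = predict_proba(1.0, w=0.5, b=0.0)
steep = predict_proba(1.0, w=5.0, b=0.0)

# The bias b shifts where the curve crosses 0.5 (here, at x = 2 when w = 1).
midpoint = predict_proba(2.0, w=1.0, b=-2.0)
```

To turn the probability into one of the two labels, a threshold (commonly 0.5) is applied.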

Question 7

Q

The accuracy of a classifier is calculated as…

Answer

A

The number of true positives plus true negatives, divided by the total number of predictions
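
As a rough sketch, the calculation can be written as below. The function name `accuracy` and the toy labels are made up for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels.

    For binary labels, the matches are exactly the true positives
    plus the true negatives.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Toy example: 3 of the 5 predictions match the true labels.
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 1]
score = accuracy(y_true, y_pred)
```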

Question 8

Q

How do perceptrons classify data?

Answer

A

By drawing a hyperplane between the binary data, separating them

Question 9

Q

What is a hyperplane?

Answer

A

A hyperplane is a flat boundary (a line in two dimensions, a plane in three) that attempts to separate two classes of data

Question 10

Q

How do perceptrons learn how to properly classify data?

Answer

A

By taking input signals and passing them through a layer that applies a weight to each feature of the input, finishing with a function that converts the weighted sum into a binary value, such as a step (threshold) function. The weights are adjusted during training

Question 11

Q

What does a perceptron’s training phase look like?

Answer

A

Initialise each weight at 0, and cycle through the data. For each x, try classifying it, then update each weight according to some update rule ONLY if the prediction is incorrect
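
The training loop can be sketched as follows, assuming the classic perceptron update rule (add the error times the input to each weight) and a step activation. The function names, the learning rate `lr`, the epoch count and the AND-gate data are illustrative assumptions.

```python
def predict(weights, bias, x):
    """Step activation: output 1 if the weighted sum is non-negative, else 0."""
    activation = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if activation >= 0 else 0


def train_perceptron(samples, labels, epochs=10, lr=1.0):
    """Start from zero weights and update only on misclassified points."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            error = y - predict(weights, bias, x)  # 0 when the guess is right
            if error != 0:
                weights = [w + lr * error * xi for w, xi in zip(weights, x)]
                bias += lr * error
    return weights, bias


# Logical AND is linearly separable, so the perceptron can learn it.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 0, 0, 1]
weights, bias = train_perceptron(samples, labels)
predictions = [predict(weights, bias, x) for x in samples]
```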

Question 12

Q

What is the problem with a basic perceptron?

Answer

A

They require linearly separable data: they can only separate the classes with a single line (hyperplane), so they cannot learn more complex, non-linear boundaries such as XOR

Question 13

Q

What is a multilayer perceptron?

Answer

A

An extension of the basic perceptron that stacks multiple weighted layers of neurons; the layers between the input and output are called hidden layers
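
To illustrate why a hidden layer helps, here is a small sketch of a two-layer network that computes XOR, which a single perceptron cannot. The weights are hand-picked for illustration rather than learned, and the names `step`, `neuron` and `xor_mlp` are hypothetical.

```python
def step(z):
    """Threshold activation: 1 if z is non-negative, else 0."""
    return 1 if z >= 0 else 0


def neuron(weights, bias, inputs):
    """A single perceptron unit: weighted sum followed by a step function."""
    return step(bias + sum(w * i for w, i in zip(weights, inputs)))


def xor_mlp(x1, x2):
    """Two hidden units (OR and NAND) feed one output unit (AND)."""
    h_or = neuron([1, 1], -1, [x1, x2])        # fires if either input is 1
    h_nand = neuron([-1, -1], 1.5, [x1, x2])   # fires unless both inputs are 1
    return neuron([1, 1], -2, [h_or, h_nand])  # fires only if both hidden units fire


outputs = [xor_mlp(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]]
```

Each hidden unit draws its own line through the input space; the output unit combines them into a region that no single line could describe.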

Question 14

Q

What are the layers between the input and output of a multilayer perceptron called?

Answer

A

Hidden layers

Question 15

Q

What is K-Nearest Neighbours, and what type of problem is it used for?

Answer

A

A supervised learning algorithm that models similarity via distance. It is used for classification problems (and can also be adapted to regression)

Question 16

Q

How does K-Nearest Neighbours facilitate predictions?

Answer

A

By storing the labelled training points, which tend to form a cluster for each class, and predicting the class of a new value from the training points it is closest to

Question 17

Q

How does K-Nearest Neighbours make a prediction given one new data point?

Answer

A

Calculate the distance to every other point, then sort that list and select the K nearest points. Find the majority class among those K neighbours to find the new point’s class
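
These steps can be sketched in a few lines, assuming Euclidean distance and a simple majority vote. The function name `knn_predict` and the toy points are illustrative only.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # 1. Distance from the query to every training point.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # 2. Sort by distance and keep the k nearest.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # 3. Majority class among those k neighbours.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


train_points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
train_labels = ["a", "a", "a", "b", "b", "b"]

near_a = knn_predict(train_points, train_labels, (2, 2), k=3)
near_b = knn_predict(train_points, train_labels, (6, 5), k=3)
```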

Question 18

Q

How can outliers affect K-Nearest Neighbours?

Answer

A

Since we predict using a distance metric, an outlier sitting among points of another class can skew which class we predict for new points that fall near it

Question 19

Q

How can class imbalance affect K-Nearest Neighbours?

Answer

A

Our predictions are based on a number of nearest neighbours, so if one class simply has more points than the other, it is more likely to dominate the K nearest neighbours and be predicted, even when the single nearest neighbour belongs to the other class

Question 20

Q

How can we select the optimal parameters for K-Nearest Neighbours?

Answer

A

By starting with k = 1 and increasing k by 1 each time, comparing how well each value predicts, or, as a rule of thumb, by setting k to the square root of the number of data points in the training dataset
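
One way to compare candidate values of k is sketched below, assuming leave-one-out accuracy as the score; that choice, the helper names and the toy data are assumptions made for illustration, not something the cards prescribe.

```python
import math
from collections import Counter


def knn_predict(points, labels, query, k):
    """Majority vote among the k training points nearest to `query`."""
    order = sorted(range(len(points)), key=lambda i: math.dist(points[i], query))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(points, labels, k):
    """Predict each point from all the others and report the hit rate."""
    hits = 0
    for i in range(len(points)):
        rest_points = points[:i] + points[i + 1:]
        rest_labels = labels[:i] + labels[i + 1:]
        if knn_predict(rest_points, rest_labels, points[i], k) == labels[i]:
            hits += 1
    return hits / len(points)


points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
labels = ["a", "a", "a", "b", "b", "b"]

# Try several values of k and keep the best-scoring one.
scores = {k: leave_one_out_accuracy(points, labels, k) for k in (1, 3, 5)}
best_k = max(scores, key=scores.get)

# Rule-of-thumb alternative: the square root of the training set size.
sqrt_k = round(math.sqrt(len(points)))
```

On this tiny dataset k = 5 fails because each held-out point is outvoted by the three points of the other class, which shows how a k that is too large can hurt.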

Question 21

Q

What is Weighted KNN?

Answer

A

A variant of KNN that assumes nearer neighbours should have a greater impact than farther ones, weighting each neighbour's vote by its distance
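
A minimal sketch, assuming inverse-distance weights (1 / distance), which is one common choice among several. The function name, the `eps` guard against division by zero and the toy data are illustrative.

```python
import math
from collections import defaultdict


def weighted_knn_predict(points, labels, query, k=3, eps=1e-9):
    """Each of the k nearest neighbours votes with weight 1 / distance."""
    nearest = sorted(
        ((math.dist(point, query), label) for point, label in zip(points, labels)),
        key=lambda pair: pair[0],
    )[:k]
    scores = defaultdict(float)
    for distance, label in nearest:
        scores[label] += 1.0 / (distance + eps)  # closer points count for more
    return max(scores, key=scores.get)


points = [(0, 0), (3, 0), (3, 1)]
labels = ["a", "b", "b"]

# A plain majority vote with k = 3 would pick "b" (two votes to one),
# but the single "a" point is much closer to the query, so it wins here.
prediction = weighted_knn_predict(points, labels, (0.5, 0), k=3)
```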

Question 22

Q

Why does KNN's complexity increase with the size of the training data?

Answer

A

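KNN is a form of instance-based learning, meaning it constructs hypotheses directly from the training instances rather than fitting a model up front. Each prediction must therefore compare the new point against every stored training point, so prediction gets slower as the data grows: O(n) per query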

Question 23

Q

KNN is most suited to [higher/lower] dimensional data.

Answer

A

Lower

(23 cards)