Supervised Learning Flashcards

(24 cards)

1
Q

What is a Support Vector Machine (SVM)?

A

A supervised machine learning model that finds a line (more generally, a hyperplane) that separates the data points of different classes by a margin

2
Q

What types of problems are Support Vector Machines (SVMs) used for?

A

Classification, regression and clustering

3
Q

What is a margin in an SVM?

A

The distance separating the closest pair of data points belonging to opposite classes

4
Q

How can we train a Support Vector Machine (SVM)?

A

We want to optimise the model such that the margin is maximised
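
A minimal sketch of this, assuming scikit-learn and NumPy (the cards don't name a library): SVC with a linear kernel solves exactly this maximum-margin optimisation, and a large C value approximates a hard margin.

```python
# Minimal sketch, assuming scikit-learn and NumPy are available.
import numpy as np
from sklearn.svm import SVC

# Two tiny, linearly separable classes (made-up data).
X = np.array([[1, 1], [2, 1], [1, 2],      # class 0
              [5, 5], [6, 5], [5, 6]])     # class 1
y = np.array([0, 0, 0, 1, 1, 1])

# Fitting solves the maximum-margin optimisation; a large C approximates a hard margin.
model = SVC(kernel="linear", C=1e6).fit(X, y)

print(model.support_vectors_)   # the points that end up defining the margin
print(model.predict([[3, 3]]))  # predict the class of a new point
```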

5
Q

The data points belonging to opposite classes that lie closest to each other are called the support vectors, as they define the…

A

Decision boundary

6
Q

How do we deal with outliers in Support Vector Machines (SVMs)?

A

By intentionally allowing some misclassifications (a soft margin), letting outliers be classified incorrectly
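
In scikit-learn (assumed here), how much misclassification the SVM tolerates is controlled by the regularisation parameter C; a sketch of the two extremes:

```python
# Sketch: sklearn's C parameter trades margin width against misclassifications.
from sklearn.svm import SVC

soft_margin_svm = SVC(kernel="linear", C=0.1)    # small C: tolerates outliers on the wrong side
hard_margin_svm = SVC(kernel="linear", C=1000)   # large C: tries to classify every point correctly
```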

7
Q

If the training data contains outliers, the maximum margin classifier would be…

A

Closer to the green observations than to the red

8
Q

If the data is not linearly separable for a Support Vector Machine (SVM), then we may…

A

Increase the dimensionality of the data via some transformation to make the classes linearly separable
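
A sketch of the idea with made-up 1-D data: the classes can't be split by a single threshold, but adding a squared feature makes them linearly separable in 2-D.

```python
import numpy as np

x = np.array([-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
y = np.array([0, 0, 1, 1, 1, 0, 0])   # class 1 sits between the class 0 points:
                                      # no single threshold on x separates them

# Transform to 2-D by adding x^2. In the new space the rule "x^2 < 2.5"
# separates the classes, and that rule is a straight line.
X_transformed = np.column_stack([x, x ** 2])
print(X_transformed)
```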

9
Q

What is the kernel trick in a Support Vector Machine (SVM)?

A

We represent the dataset as an n×n kernel matrix of pairwise similarity comparisons, where each entry is the dot product of the corresponding pair of vectors in the (higher-dimensional) feature space
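
A sketch with a degree-2 polynomial kernel, K(a, b) = (a · b)², whose explicit feature map is known: each kernel entry equals the dot product of the transformed vectors, even though we never compute the transformation.

```python
import numpy as np

def phi(v):
    # Explicit degree-2 feature map for 2-D input (shown only to verify the trick).
    x1, x2 = v
    return np.array([x1**2, np.sqrt(2) * x1 * x2, x2**2])

a = np.array([1.0, 2.0])
b = np.array([3.0, 0.5])

explicit = phi(a) @ phi(b)   # dot product in the transformed feature space
kernel   = (a @ b) ** 2      # same value, computed entirely in the original space
print(explicit, kernel)      # both 16.0

# The n x n kernel matrix of pairwise similarities, without transforming anything:
X = np.array([a, b])
K = (X @ X.T) ** 2
print(K)
```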

10
Q

Why do we need the kernel trick in Support Vector Machines (SVMs)?

A

Transforming the data to higher dimensions has a high computational cost; the kernel trick lets us calculate the relationships in the data as if it had been transformed, without actually transforming it

11
Q

What is a decision tree?

A

A supervised learning algorithm that uses a sequential model of decisions and their possible consequences to produce predictions

12
Q

How do decision trees produce predictions?

A

We follow the decision tree’s logic, rather like a flow chart, until we reach a leaf, which tells us the predicted value

13
Q

What is a classification tree?

A

A variant of a decision tree that classifies observations into categories based on multiple input values

14
Q

What is a regression tree?

A

A variant of a decision tree that is used to predict numeric values - though this is still, in effect, a classification, as we can only predict which range (bounds) the value falls into

15
Q

How do classification trees produce predictions?

A

Start at the top, and work your way down until you get to one of the tree’s leaves - typically, if a statement is true, you go left, and if it’s false, you go right
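
A sketch with a hypothetical toy tree (the features here are made up): the prediction is just a walk from the root to a leaf, going left when a statement is true and right when it is false.

```python
def predict(sample):
    # Hypothetical tree: the root asks about age, the left subtree about popcorn.
    if sample["age"] < 30:                # true -> go left
        if sample["loves_popcorn"]:       # true -> go left
            return "likes the film"
        return "dislikes the film"
    return "dislikes the film"            # right subtree is already a leaf

print(predict({"age": 25, "loves_popcorn": True}))   # -> likes the film
```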

16
Q

How can we build a classification tree?

A

Choose which question (which column and split) best separates the outcomes to ask at the top of the tree, then repeat this process for each branch using the remaining columns
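
In practice a library does this greedy question-choosing for us; a minimal sketch assuming scikit-learn, with made-up columns:

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical columns: age and loves_popcorn; target: likes the film?
X = [[25, 1], [40, 0], [35, 1], [20, 0]]
y = ["yes", "no", "yes", "no"]

# DecisionTreeClassifier picks the best question at each node (Gini by default),
# then repeats for each branch -- the process described above.
tree = DecisionTreeClassifier(criterion="gini").fit(X, y)
print(export_text(tree, feature_names=["age", "loves_popcorn"]))
```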

17
Q

What is impurity in a decision tree?

A

A measure of how mixed the data is regarding its outcome

18
Q

What is Gini impurity?

A

A metric that quantifies impurity as 1 minus the sum of the squared probabilities of each outcome
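
A sketch of the calculation, with made-up class counts:

```python
def gini(class_counts):
    # Gini impurity = 1 - sum of squared class probabilities.
    total = sum(class_counts)
    return 1 - sum((count / total) ** 2 for count in class_counts)

print(gini([5, 5]))    # 0.5 -> maximally mixed two-class node
print(gini([10, 0]))   # 0.0 -> pure node, a single outcome
print(gini([7, 3]))    # 1 - (0.7^2 + 0.3^2) = 0.42
```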

19
Q

How do we adapt the Gini impurity calculation for non-categorical (numeric) data?

A

Sort the observations by that column, then calculate the average of each pair of adjacent values (e.g. adjacent ages) and treat it as a candidate split, calculating the Gini impurity for each candidate
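
A sketch using the card's age example (the numbers are made up): each average of adjacent sorted values becomes a candidate split, scored by the weighted Gini impurity of the two groups it creates.

```python
def gini(labels):
    total = len(labels)
    return 1 - sum((labels.count(c) / total) ** 2 for c in set(labels))

ages   = [7, 12, 18, 35, 38, 50]                     # sorted numeric column
labels = ["no", "no", "yes", "yes", "yes", "no"]     # outcomes

for left, right in zip(ages, ages[1:]):
    threshold = (left + right) / 2                   # average of adjacent rows
    below = [lab for a, lab in zip(ages, labels) if a < threshold]
    above = [lab for a, lab in zip(ages, labels) if a >= threshold]
    weighted = (len(below) * gini(below) + len(above) * gini(above)) / len(ages)
    print(f"age < {threshold}: weighted Gini = {weighted:.3f}")
```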

20
Q

What issues can we face when dealing with very small amounts of data, or class imbalance?

A

Bias or overfitting

21
Q

What is the pruning of a classification tree?

A

A data compression technique that reduces the size of a decision tree by removing non-critical or redundant sections, reducing complexity and thereby improving generalisation (accuracy on unseen data)

22
Q

What is pre-pruning of a classification tree?

A

Pruning while training. One method is to specify the minimum number of samples that must be present in each node; another is to limit the maximum depth of the tree
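
Both pre-pruning methods mentioned here are just constructor arguments in scikit-learn (assumed); a sketch:

```python
from sklearn.tree import DecisionTreeClassifier

tree = DecisionTreeClassifier(
    max_depth=3,          # stop growing the tree below this depth
    min_samples_leaf=20,  # every leaf must contain at least 20 samples
)
```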

23
Q

What is post-pruning of a classification tree?

A

Pruning after training. We start at the lowest branch and compare the error of the whole tree with the error of the tree minus that branch; if removing the branch does not increase the error, we prune it. Repeat up the tree until the error of the whole tree is less than the error of the pruned tree
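
scikit-learn's built-in post-pruning is cost-complexity pruning (the ccp_alpha parameter), a related but different criterion from the branch-by-branch error comparison described here; a sketch:

```python
from sklearn.tree import DecisionTreeClassifier

X = [[0, 0], [1, 1], [2, 0], [3, 1], [4, 0], [5, 1]]   # made-up data
y = [0, 0, 1, 1, 0, 1]

full_tree   = DecisionTreeClassifier(random_state=0).fit(X, y)
pruned_tree = DecisionTreeClassifier(random_state=0, ccp_alpha=0.1).fit(X, y)

# The pruned tree should be no deeper (usually shallower) than the full tree.
print(full_tree.get_depth(), pruned_tree.get_depth())
```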