Classification Flashcards

1
Q

classification

A

Unlike regression, which predicts a continuous variable, classification predicts a categorical variable. Common methods (see the sketch below):
Logistic regression
k-NN
Tree-based: decision trees, bagging (random forests), and boosting (XGBoost)

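A minimal sketch, assuming scikit-learn is installed, fitting one classifier from each family named above on a synthetic dataset (a random forest stands in for the tree-based family):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=200, random_state=0)  # toy binary data

for model in (LogisticRegression(max_iter=1000),
              KNeighborsClassifier(n_neighbors=5),
              RandomForestClassifier(n_estimators=100, random_state=0)):
    model.fit(X, y)
    print(type(model).__name__, model.score(X, y))  # training accuracy
```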
2
Q

confusion matrix

A
Used to evaluate a classification method:
a table where each column represents the number of samples in each predicted class and each row represents the number of samples in each actual class.
From it we can compute counts such as true positives (TP) and false positives (FP); see the sketch below.
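A minimal sketch, assuming scikit-learn and hypothetical labels, of reading the counts off a binary confusion matrix:

```python
from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 1, 0, 1, 0, 0]  # actual labels (hypothetical)
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]  # predicted labels (hypothetical)

# rows = actual class, columns = predicted class (matching this card)
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
```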
3
Q

Logistic regression

A

Y can be 0 or 1; the model fits the log odds as a line: log(p / (1 − p)) = 𝛽0 + 𝛽1x.
𝛽0 shifts the S-shaped curve right or left and 𝛽1 controls how steep it is.
𝛽1 is the change in the log odds per unit increase in x.

It is a linear classifier, so it performs best if the decision boundary between classes is a line. If the boundary is not a line, consider e.g. k-NN.

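A minimal sketch, assuming scikit-learn and numpy, fitting a one-predictor logistic regression to synthetic data and reading off 𝛽0 and 𝛽1:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
x = rng.normal(size=(200, 1))                  # one predictor
p = 1 / (1 + np.exp(-(-0.5 + 2.0 * x[:, 0])))  # true curve: b0=-0.5, b1=2.0
y = rng.binomial(1, p)                         # Y is 0 or 1

model = LogisticRegression().fit(x, y)
print("b0 ~", model.intercept_[0])  # shifts the S-curve left/right
print("b1 ~", model.coef_[0, 0])    # steepness; change in log odds per unit x
```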
4
Q

KNN

A

A non-linear classifier (unlike logistic regression).

In k-NN classification, the output is a class membership. An object is classified by a plurality vote of its neighbors, with the object being assigned to the class most common among its k nearest neighbors (k is a positive integer, typically small). If k = 1, then the object is simply assigned to the class of that single nearest neighbor.

Small k: low bias but high variance (more flexible).
Large k: higher bias but lower variance (smoother). See the sketch below.

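A minimal sketch, assuming scikit-learn, showing the effect of k through train vs. test accuracy on a noisy synthetic dataset:

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = make_moons(n_samples=400, noise=0.3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for k in (1, 5, 25):
    # small k: fits training data closely (low bias, high variance);
    # large k: smoother boundary (higher bias, lower variance)
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_tr, y_tr)
    print(f"k={k}: train={knn.score(X_tr, y_tr):.2f} "
          f"test={knn.score(X_te, y_te):.2f}")
```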
5
Q

ROC curve

A

The best point is the upper-left corner (high true positive rate, low false positive rate); each point on the curve is one threshold we try. A straight diagonal line (TPR = FPR) means the classifier is no better than random guessing, the worst case.

Shows how performance changes as the threshold varies.

(e.g., the threshold we apply to logistic regression probabilities to choose one class or the other; see the sketch below)

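A minimal sketch, assuming scikit-learn, tracing the ROC curve from logistic regression probabilities on synthetic data:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

proba = LogisticRegression().fit(X_tr, y_tr).predict_proba(X_te)[:, 1]
fpr, tpr, thresholds = roc_curve(y_te, proba)  # one (FPR, TPR) per threshold
print("AUC =", roc_auc_score(y_te, proba))     # 1.0 = perfect, 0.5 = diagonal
```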
6
Q

Decision trees

A

Decision trees are non-linear classifiers that partition the space into regions.
At each step, find the predictor and threshold that most reduce the error (the error can be entropy; information gain tells us how useful splitting on a feature is), then loop on each reduced subset.
They can overfit, but we can limit the depth of the tree and the minimum number of samples a node must have before it can be split (so the subgroups the tree creates aren't too specific); see the sketch below.
Advantage: very interpretable.

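A minimal sketch, assuming scikit-learn, of an entropy-based tree with both overfitting controls mentioned above:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier(
    criterion="entropy",   # splits chosen to maximize information gain
    max_depth=3,           # limit the depth of the tree
    min_samples_split=10,  # a node needs >= 10 samples to be split
).fit(X, y)
print(export_text(tree))   # interpretable: the tree prints as if/else rules
```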
7
Q

Bagging tree

A

Train multiple trees, each on a (bootstrap) subset of the data.
To predict, use all trees and output the label chosen by the most trees (majority vote).
Trees can have a large depth (low bias, high variance) because averaging them reduces the variance. If there is one strong predictor, though, this may not be sufficient, as all the trees will split on that same predictor and stay correlated; see the sketch below.

Disadvantage: because of the averaging, the output is no longer interpretable.

Feature importance: see how much a predictor reduces the error (e.g., RSS) in each tree and average over the trees.

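A minimal sketch, assuming scikit-learn, bagging deep decision trees and predicting by majority vote:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)
bag = BaggingClassifier(
    DecisionTreeClassifier(),  # deep trees: low bias, high variance
    n_estimators=100,          # averaging 100 trees reduces the variance
    random_state=0,
).fit(X, y)
print(bag.predict(X[:5]))      # majority-vote labels
```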
8
Q

Random Forest

A

A modification of bagged trees.
More randomness, and thus lowers variance more effectively.

Train n trees, where each split considers only a random subset of the predictors, so the trees can't all choose the same strong predictor (see the sketch below).

Feature importance: see how much a predictor reduces the error (e.g., RSS) in each tree and average over the trees.

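A minimal sketch, assuming scikit-learn, where each split draws a random subset of predictors (max_features) and importances are averaged over the trees:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
forest = RandomForestClassifier(
    n_estimators=200,
    max_features="sqrt",  # random subset of predictors at each split
    random_state=0,
).fit(X, y)
print(forest.feature_importances_)  # mean impurity decrease per predictor
```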
9
Q

Boosted tree

A

Fit a simple tree (e.g., using only one predictor), see what it doesn't predict well, fit another simple tree to predict that, and loop until satisfied; then combine all the trees to make the overall prediction.
Usually works better than random forests.
Don't fit too many trees, otherwise it can overfit. See the sketch below.

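A minimal sketch, assuming scikit-learn, using gradient boosting with stumps (XGBoost is the heavier-duty alternative named on card 1):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

boost = GradientBoostingClassifier(
    n_estimators=100,  # too many trees can overfit; tune this
    max_depth=1,       # stumps: each tree is deliberately simple
).fit(X_tr, y_tr)
print("test accuracy:", boost.score(X_te, y_te))
```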
10
Q

Entropy

A

Measures uncertainty (if there is a 50/50 chance of an outcome, the situation is as uncertain as it can be). Used to score the error in classification: H = −Σ p · log2(p), summed over the classes.
For two classes, entropy goes from 0 to 1.
1 is the most uncertain (so we want the value to be as low as possible).

Information gain tells us how useful adding one feature can be; the higher the better (see the sketch below).

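A minimal sketch, assuming only numpy, computing entropy and the information gain of a hypothetical split:

```python
import numpy as np

def entropy(y):
    """Entropy in bits; 0 = pure node, 1 = a 50/50 two-class split."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

y = np.array([0, 0, 1, 1, 1, 1])  # parent node labels (hypothetical)
left, right = y[:2], y[2:]        # a candidate split
gain = entropy(y) - (len(left) / len(y) * entropy(left)
                     + len(right) / len(y) * entropy(right))
print(entropy(y), gain)           # higher gain = more useful split
```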