Ensemble Methods - Other Modeling Aspects Flashcards

1
Q

Binary Target

A

When one class of a binary target has a disproportionate number of observations, a classification model will likely struggle to make proper predictions for the minority class. As a result, such models tend to have a high value for one of sensitivity and specificity and a low value for the other.
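
A minimal sketch of that trade-off, assuming a made-up dataset of 95 negatives and 5 positives and a hypothetical model that always predicts the majority class. The helper name `sensitivity_specificity` is illustrative only.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for a binary target."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # Sensitivity: share of actual positives that were predicted positive.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: share of actual negatives that were predicted negative.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Made-up imbalanced data: the positive class is the minority.
y_true = [0] * 95 + [1] * 5
# Hypothetical model that ignores the minority class entirely.
y_pred = [0] * 100

sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, spec)  # 0.0 1.0
```

Accuracy here would be 95%, yet no minority observation is predicted correctly, which is why sensitivity and specificity are checked together.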

2
Q

Oversampling

A

Sampling technique for duplicating observations in the minority class to reduce the imbalance.

Apply only to the training set.

Can overfit the minority class, since the duplicated observations add no new information.
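
A rough sketch of random oversampling using only the standard library. The function name `oversample_minority`, the choice to balance the classes exactly, and the toy data are all assumptions made for illustration.

```python
import random


def oversample_minority(rows, labels, minority_label, seed=0):
    """Duplicate minority rows at random until both classes are the same size."""
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == minority_label]
    majority = [i for i, y in enumerate(labels) if y != minority_label]
    if not minority:
        raise ValueError("no minority observations to duplicate")
    n_extra = max(len(majority) - len(minority), 0)
    # Sampling with replacement: the extra rows are exact copies.
    extra = [rng.choice(minority) for _ in range(n_extra)]
    idx = majority + minority + extra
    rng.shuffle(idx)
    return [rows[i] for i in idx], [labels[i] for i in idx]


# Toy training set only; the test set would be left untouched.
train_X = [[i] for i in range(10)]
train_y = [0] * 8 + [1] * 2

new_X, new_y = oversample_minority(train_X, train_y, minority_label=1)
print(new_y.count(0), new_y.count(1))  # 8 8
```

Because the added rows are copies of the same two minority observations, a flexible model can memorize them, which is the overfitting risk noted above.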

3
Q

Undersampling

A

Sampling technique for removing observations from the majority class to reduce the imbalance.

Apply only to the training set.

Can underfit the majority class, since removing observations shrinks the dataset and can discard useful information.
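
A rough sketch of random undersampling using only the standard library. The function name `undersample_majority`, the choice to shrink the majority class all the way down to the minority size, and the toy data are assumptions for illustration.

```python
import random


def undersample_majority(rows, labels, minority_label, seed=0):
    """Keep a random subset of majority rows equal in size to the minority class."""
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == minority_label]
    majority = [i for i, y in enumerate(labels) if y != minority_label]
    # Sampling without replacement: the unselected majority rows are dropped.
    kept = rng.sample(majority, min(len(minority), len(majority)))
    idx = minority + kept
    rng.shuffle(idx)
    return [rows[i] for i in idx], [labels[i] for i in idx]


# Toy training set only; the test set would be left untouched.
train_X = [[i] for i in range(10)]
train_y = [0] * 8 + [1] * 2

new_X, new_y = undersample_majority(train_X, train_y, minority_label=1)
print(len(new_X), new_y.count(0), new_y.count(1))  # 4 2 2
```

Here six of the eight majority observations are discarded, which illustrates how much information can be lost when the minority class is small.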

4
Q

Oversampling and Undersampling for cv

A

k-fold cv -> divide the training set into k groups (folds).

Perform over/undersampling on each of the k training sets (the k-1 folds used for fitting) before fitting the model; leave the held-out fold as is.
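
One way to picture this, sketched with the standard library only. The generator name `resampled_folds`, the use of oversampling rather than undersampling, and the toy labels are assumptions; model fitting is left as a comment.

```python
import random


def resampled_folds(labels, k, minority_label, seed=0):
    """Yield (train_idx, val_idx) per fold, oversampling only the training part."""
    rng = random.Random(seed)
    indices = list(range(len(labels)))
    rng.shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        val_idx = folds[i]  # held-out fold, never resampled
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        minority = [j for j in train_idx if labels[j] == minority_label]
        # Number of copies needed so minority count equals majority count.
        n_extra = len(train_idx) - 2 * len(minority)
        if minority and n_extra > 0:
            train_idx = train_idx + [rng.choice(minority) for _ in range(n_extra)]
        yield train_idx, val_idx


# Toy labels: 16 majority (0) and 4 minority (1) observations.
labels = [0] * 16 + [1] * 4

for train_idx, val_idx in resampled_folds(labels, k=4, minority_label=1):
    train_y = [labels[j] for j in train_idx]
    # A model would be fit on train_idx here and scored on val_idx.
    print(train_y.count(0), train_y.count(1), len(val_idx))
```

Resampling inside the loop keeps copies of a minority observation out of the held-out fold, so each validation score still reflects the original class balance.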
