Week 9: Ensemble Methods Flashcards

1
Q

Ensemble Methods

A

A class of methods that generates a collection of weak classifiers for the dataset, each weak classifier being a simple classifier. The weak classifiers are combined into the final classifier by averaging (possibly weighted) or by voting.

The structure of an ensemble classifier can be parallel, serial, or hierarchical.
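
The two combination rules mentioned above can be sketched in a few lines. This is an illustrative sketch, not any particular library's API; the function names and the example weights are made up:

```python
from collections import Counter

def majority_vote(predictions):
    """Combine class labels from several weak classifiers by voting."""
    return Counter(predictions).most_common(1)[0][0]

def weighted_average(scores, weights):
    """Combine real-valued outputs by a weighted average."""
    total = sum(weights)
    return sum(s * w for s, w in zip(scores, weights)) / total

# Three weak classifiers vote on one sample:
print(majority_vote(["cat", "dog", "cat"]))           # "cat"
print(weighted_average([0.9, 0.4, 0.7], [2, 1, 1]))   # (1.8 + 0.4 + 0.7) / 4 = 0.725
```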

2
Q

Bagging

A

It’s short for Bootstrap Aggregating. This method trains each weak classifier on a different version of the training set, drawn from the original data by sampling with replacement (a bootstrap sample). The final classification decision is made by voting across the weak classifiers.
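
A minimal sketch of the bagging loop, assuming a deliberately trivial weak learner (it just predicts the majority label of its bootstrap sample); all names here are illustrative:

```python
import random
from collections import Counter

def bootstrap_sample(data, rng):
    """Draw a training-set-sized sample with replacement."""
    return [rng.choice(data) for _ in data]

def train_weak(sample):
    # Trivial weak learner for illustration: predict the majority label.
    labels = [label for _, label in sample]
    return Counter(labels).most_common(1)[0][0]

def bagging_predict(data, n_classifiers=5, seed=0):
    """Train n weak classifiers on bootstrap samples; combine by voting."""
    rng = random.Random(seed)
    votes = [train_weak(bootstrap_sample(data, rng))
             for _ in range(n_classifiers)]
    return Counter(votes).most_common(1)[0][0]

data = [([1.0], "a"), ([2.0], "a"), ([3.0], "b")]
print(bagging_predict(data))
```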

3
Q

Boosting

A

The method begins by creating a classifier with greater-than-average accuracy, then adds new component classifiers to form an ensemble whose joint decision rule has arbitrarily high accuracy on the training set.

4
Q

Adaptive Boost (AdaBoost)

A

This boosting method adds weak classifiers to the final classifier until a desirable classification accuracy is reached. Each added weak classifier is assigned a weight.

Pros:
- Simple to implement
- Good classification accuracy and generalisation
- Can be applied to different classifiers

Cons:
- Solution is suboptimal
- Sensitive to noisy data and outliers
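
The weighting scheme can be made concrete with a minimal AdaBoost sketch on 1-D data with labels in {-1, +1}, using decision stumps as the weak classifiers. This is an illustrative toy implementation, not a library routine:

```python
import math

def stump_predict(threshold, sign, x):
    """Decision stump: predict sign if x >= threshold, else -sign."""
    return sign if x >= threshold else -sign

def train_stump(xs, ys, weights):
    """Pick the threshold/sign pair with the lowest weighted error."""
    best = None
    for threshold in xs:
        for sign in (1, -1):
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if stump_predict(threshold, sign, x) != y)
            if best is None or err < best[0]:
                best = (err, threshold, sign)
    return best

def adaboost(xs, ys, rounds=5):
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []  # (alpha, threshold, sign) per weak classifier
    for _ in range(rounds):
        err, threshold, sign = train_stump(xs, ys, weights)
        err = max(err, 1e-10)  # avoid division by zero on a perfect stump
        alpha = 0.5 * math.log((1 - err) / err)  # weight of this classifier
        ensemble.append((alpha, threshold, sign))
        # Re-weight samples: misclassified ones get more attention next round.
        weights = [w * math.exp(-alpha * y * stump_predict(threshold, sign, x))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    """Final classifier: weighted vote of all stumps."""
    score = sum(a * stump_predict(t, s, x) for a, t, s in ensemble)
    return 1 if score >= 0 else -1

xs = [1.0, 2.0, 3.0, 4.0]
ys = [-1, -1, 1, 1]
model = adaboost(xs, ys)
print([predict(model, x) for x in xs])  # separable data: matches ys
```

Note how `alpha` realises the card's point: each added weak classifier is assigned a weight, and that same `alpha` drives the sample re-weighting.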

5
Q

Bootstrapping

A

Repeatedly sampling the dataset with replacement to generate new training sets, simulating the model seeing unseen data.
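
A single bootstrap draw is one line of stdlib Python; the data values here are arbitrary:

```python
import random

rng = random.Random(42)
data = [3, 7, 7, 1, 9]

# One bootstrap sample: same size as the original, drawn with replacement.
indices = [rng.randrange(len(data)) for _ in range(len(data))]
sample = [data[i] for i in indices]

# Points never drawn are "out-of-bag" and can serve as unseen test data.
out_of_bag = [data[i] for i in range(len(data)) if i not in set(indices)]
print(sample, out_of_bag)
```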

6
Q

Stacked Generalisation

A

In this method, overall classification is done in 2 levels. The input is processed by a group of classifiers at the 1st level. Outputs from the 1st level then go through the 2nd-level classifier, which makes the final decision.
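
The two-level flow can be sketched as follows. The level-1 classifiers and the meta-classifier's weights here are fixed by hand purely for illustration; in real stacked generalisation the 2nd-level classifier is trained on the 1st-level outputs:

```python
# Level 1: two hypothetical weak classifiers, each emitting a score.
def clf_a(x):
    return 1.0 if x > 0 else 0.0

def clf_b(x):
    return 1.0 if x > 5 else 0.0

# Level 2: meta-classifier over the level-1 outputs (weights hand-picked
# for illustration, not learned).
def meta(features):
    w = [0.4, 0.6]
    return 1 if sum(f * wi for f, wi in zip(features, w)) >= 0.5 else 0

def stacked_predict(x):
    level1 = [clf_a(x), clf_b(x)]  # outputs of the 1st-level classifiers
    return meta(level1)            # 2nd-level classifier makes the final call

print(stacked_predict(7))   # both level-1 classifiers fire -> 1
print(stacked_predict(-1))  # neither fires -> 0
```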

7
Q

Random Forest

A

This method uses an ensemble of decision trees, with bagging applied to the training data of each tree. The final decision is made by combining the decisions of all the trees.
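
A toy sketch of the idea, assuming a stand-in "tree" that is just a depth-1 threshold rule (a real random forest grows full decision trees, often with random feature subsets as well); all names and data are illustrative:

```python
import random
from collections import Counter

def train_tree(sample):
    """Stand-in for tree training: a depth-1 threshold rule on x."""
    threshold = sum(x for x, _ in sample) / len(sample)
    above = [y for x, y in sample if x >= threshold]
    below = [y for x, y in sample if x < threshold]
    hi = Counter(above).most_common(1)[0][0]          # max of sample >= mean
    lo = Counter(below).most_common(1)[0][0] if below else hi
    return lambda x, t=threshold, h=hi, l=lo: h if x >= t else l

def random_forest(data, n_trees=7, seed=1):
    """Train each tree on its own bootstrap sample (bagging)."""
    rng = random.Random(seed)
    return [train_tree([rng.choice(data) for _ in data])
            for _ in range(n_trees)]

def forest_predict(trees, x):
    """Combine the trees' decisions by majority vote."""
    votes = [tree(x) for tree in trees]
    return Counter(votes).most_common(1)[0][0]

data = [(1.0, "a"), (2.0, "a"), (8.0, "b"), (9.0, "b")]
trees = random_forest(data)
print(forest_predict(trees, 1.5), forest_predict(trees, 8.5))
```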
