Chapter 30 Boosting and AdaBoost Flashcards

1
Q

HOW DOES BOOSTING ENSEMBLE WORK? P146

A

It creates a strong classifier from a number of weak classifiers. This is done by:
1- Building a model from the training data
2- Creating a second model that attempts to correct the errors from the first model
3- Models are added until the training set is predicted perfectly or a maximum number of models has been reached (see the sketch below).
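
A minimal sketch of this error-correcting idea, using residual fitting on a toy regression task. The dataset, tree depth, and number of rounds are illustrative choices, and AdaBoost itself uses instance re-weighting rather than residuals, as the later cards describe:

    import numpy as np
    from sklearn.tree import DecisionTreeRegressor

    rng = np.random.default_rng(0)
    X = rng.uniform(-3, 3, size=(200, 1))
    y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=200)

    models, prediction = [], np.zeros_like(y)
    for _ in range(10):                      # stop after a fixed number of models
        residual = y - prediction            # the errors left by the ensemble so far
        model = DecisionTreeRegressor(max_depth=1).fit(X, residual)
        models.append(model)
        prediction += model.predict(X)       # each new model corrects those errors

    print("training MSE:", np.mean((y - prediction) ** 2))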

2
Q

WHAT IS ADABOOST BEST USED FOR? P146

A

It’s best used to boost the performance of decision trees on binary classification problems.
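
A hedged usage sketch with scikit-learn's AdaBoostClassifier boosting decision stumps on a synthetic binary problem. The estimator parameter name assumes a recent scikit-learn release; older versions call it base_estimator:

    from sklearn.datasets import make_classification
    from sklearn.ensemble import AdaBoostClassifier
    from sklearn.model_selection import cross_val_score
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=500, random_state=1)   # binary by default
    model = AdaBoostClassifier(
        estimator=DecisionTreeClassifier(max_depth=1),  # one-split stumps
        n_estimators=50,
    )
    print(cross_val_score(model, X, y, cv=5).mean())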

3
Q

HOW DEEP ARE THE DECISION TREES USED IN ADABOOST? P147

A

They are short and contain only a single decision for classification; they are called decision stumps.
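
A minimal sketch of what a stump amounts to, one test on one feature; the feature index and threshold below are illustrative placeholders:

    import numpy as np

    def stump_predict(X, feature=0, threshold=0.5):
        """Classify each row of X with a single decision: -1 at or below the threshold, +1 above."""
        return np.where(X[:, feature] <= threshold, -1, 1)

    X = np.array([[0.2], [0.7], [0.9]])
    print(stump_predict(X))   # -> [-1  1  1]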

4
Q

HOW DOES ADABOOST INITIALLY SET WEIGHT FOR THE TRAINING INSTANCES? P147

A

Initially each instance is weighted 1/n, where n is the number of training instances.
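
As a one-line sketch (n here is an example value):

    import numpy as np

    n = 10                          # number of training instances
    weights = np.full(n, 1.0 / n)   # every instance starts with the same weight
    print(weights.sum())            # -> 1.0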

5
Q

HOW DOES ADABOOST ENSEMBLE WORK? P147

A

Weak models are added sequentially, trained using the weighted training data. The process continues until a pre-set number of weak learners have been created (a user parameter) or no further improvement can be made on the training dataset.

Ref

  • The amount of say (stage value) of each stump depends directly on the weighted error it makes when classifying the training set
  • Updating the instance weights reflects the collective performance of the stumps built so far on the whole training set (see the sketch below)
    • For example, with 2 stumps built so far, the instance weights are updated according to how those two stumps together perform on the whole training set. They may still misclassify the previously misclassified instances, in which case those instances' weights grow further; or they may now classify some of them correctly while misclassifying new ones, in which case the newly misclassified instances gain weight and the correctly classified ones lose weight.
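
A sketch of the core training loop described above, assuming binary labels encoded as -1/+1 and the classical stage-value formula alpha = 0.5 * ln((1 - error) / error); some texts drop the 0.5, but the behaviour is the same: low error gives a large say, and misclassified instances gain weight. Because the update is multiplicative, an instance's weight after several rounds reflects how the stumps built so far collectively classify it:

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    def adaboost_fit(X, y, n_rounds=20):
        n = len(y)
        weights = np.full(n, 1.0 / n)               # start with equal weights
        stumps, alphas = [], []
        for _ in range(n_rounds):
            stump = DecisionTreeClassifier(max_depth=1)
            stump.fit(X, y, sample_weight=weights)
            pred = stump.predict(X)
            error = weights[pred != y].sum()        # weighted training error
            error = np.clip(error, 1e-10, 1 - 1e-10)
            alpha = 0.5 * np.log((1 - error) / error)   # the stump's "amount of say"
            weights *= np.exp(-alpha * y * pred)    # misclassified up, correct down
            weights /= weights.sum()                # renormalise to sum to 1
            stumps.append(stump)
            alphas.append(alpha)
        return stumps, alphas
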
6
Q

HOW DOES ADABOOST PREDICT A NEW INPUT’S TARGET VALUE? P148

A

For a new input instance, each weak learner calculates a predicted value as either +1.0 or -1.0. The predicted values are weighted by each weak learner’s stage value, and the ensemble prediction is the sum of these weighted predictions. If the sum is positive, the first class is predicted; if negative, the second class is predicted.
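
A sketch of this decision rule, reusing the stumps and alphas produced by the training sketch in the previous card (an assumption, since the chapter does not give code):

    import numpy as np

    def adaboost_predict(stumps, alphas, X):
        # each stump votes -1/+1; votes are scaled by that stump's stage value
        weighted_votes = sum(a * s.predict(X) for s, a in zip(stumps, alphas))
        return np.where(weighted_votes >= 0, 1, -1)   # sign of the sum picks the class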

7
Q

HOW CAN WE PREPARE DATA FOR ADABOOST? P148

A

Quality Data: Because the ensemble method keeps trying to correct misclassifications in the training data, you need to be careful that the training data is of high quality.
Outliers: Outliers will force the ensemble down the rabbit hole of working hard to correct cases that are unrealistic. These could be removed from the training dataset.
Noisy Data: Noisy data, specifically noise in the output variable, can be problematic. If possible, attempt to isolate and clean it from your training dataset.
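
One simple, hedged way to act on the outlier advice, dropping rows whose features lie far from the mean; the 3-sigma cut-off is an illustrative choice, not something the chapter prescribes:

    import numpy as np

    def drop_feature_outliers(X, y, n_sigma=3.0):
        z = np.abs((X - X.mean(axis=0)) / X.std(axis=0))
        keep = (z < n_sigma).all(axis=1)      # keep rows with no extreme feature value
        return X[keep], y[keep]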
