Chapter 28 Bagging and Random Forest Flashcards

1
Q

WHAT IS THE BOOTSTRAP METHOD?

P136

A

The bootstrap is a powerful statistical method for estimating a quantity from a data sample, e.g. the mean or standard deviation, and even quantities used in ML algorithms, such as learned coefficients.

_________

Bootstrapping is a method of inferring results for a population from results found on a collection of smaller random samples (instances) of that population, sampling with replacement: each sampled instance is returned to the dataset before the next draw, so the same instance can be picked multiple times.
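For illustration, a minimal sketch of the bootstrap estimating a mean and its uncertainty, assuming NumPy (the data here is synthetic, not from the book):

```python
import numpy as np

rng = np.random.default_rng(42)
data = rng.normal(loc=50, scale=5, size=200)  # a small synthetic data sample

# Draw many bootstrap samples (with replacement) and record each one's mean
boot_means = [
    rng.choice(data, size=len(data), replace=True).mean()
    for _ in range(1000)
]

# The mean of the bootstrap means estimates the population mean;
# their spread estimates the uncertainty of that estimate
print(np.mean(boot_means), np.std(boot_means))
```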

2
Q

WHAT IS BAGGING? P137

A

Bagging (Bootstrap Aggregation) is the application of the bootstrap procedure to a high-variance ML algorithm, typically decision trees: multiple models are trained on bootstrap samples and their predictions are combined.
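A sketch of this, assuming scikit-learn (whose BaggingClassifier uses a decision tree as its default base estimator):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, random_state=0)  # toy data

# Bag 100 decision trees (the default base estimator), each fit
# on a bootstrap sample of the training data
model = BaggingClassifier(n_estimators=100, random_state=0)
print(cross_val_score(model, X, y, cv=5).mean())
```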

3
Q

WHAT ARE THE STEPS OF BAGGING? P137

A

1- Create many (e.g. 100) random subsamples of our dataset, with replacement.
2- Train a CART model on each subsample.
3- Given a new dataset, calculate the average of the predictions from all the models (see the sketch below).
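A minimal sketch of these three steps for regression, assuming scikit-learn and NumPy arrays (for classification, a majority vote would replace the average):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def bagged_predict(X_train, y_train, X_new, n_models=100, seed=0):
    rng = np.random.default_rng(seed)
    n = len(X_train)
    all_preds = []
    for _ in range(n_models):
        # 1- Random subsample of the dataset, with replacement
        idx = rng.integers(0, n, size=n)
        # 2- Train a CART model on the subsample
        tree = DecisionTreeRegressor().fit(X_train[idx], y_train[idx])
        all_preds.append(tree.predict(X_new))
    # 3- Average the predictions from all the models
    return np.mean(all_preds, axis=0)
```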

4
Q

ARE THE TREES IN BAGGING PRUNED? WHY? P137

A

When bagging with decision trees, we are less concerned about individual trees overfitting the training data. For this reason, and for efficiency, the individual decision trees are grown deep (e.g. few training samples at each leaf node) and the trees are NOT pruned. These trees have high variance, which is an important characteristic of sub-models when combining predictions using bagging.
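In scikit-learn terms (an assumption for illustration, not from the book), this corresponds to leaving the tree's growth unconstrained:

```python
from sklearn.tree import DecisionTreeClassifier

# scikit-learn's defaults already grow the tree deep and unpruned:
#   max_depth=None     -> nodes are expanded until leaves are pure
#   min_samples_leaf=1 -> few training samples at each leaf
#   ccp_alpha=0.0      -> no cost-complexity pruning
deep_tree = DecisionTreeClassifier(max_depth=None, min_samples_leaf=1, ccp_alpha=0.0)
```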

5
Q

WHAT PROBLEM CAN THE GREEDY ALGORITHM OF DECISION TREES CAUSE IN BAGGED DECISION TREES? P138

A

A problem with decision trees is that they are greedy: they choose which variable to split on using a greedy algorithm that minimizes error. As such, even with bagging, the decision trees can have a lot of structural similarities and, in turn, highly correlated predictions. Combining the predictions from multiple models in an ensemble works better if the predictions from the sub-models are uncorrelated, or at best weakly correlated.
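One way to see this (a sketch assuming scikit-learn; the dataset is synthetic) is to measure the average pairwise correlation between the per-tree predictions of a bagged ensemble:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import BaggingRegressor

X, y = make_regression(n_samples=300, n_features=10, noise=10, random_state=0)
bag = BaggingRegressor(n_estimators=25, random_state=0).fit(X, y)

# Each tree's predictions over the same data
per_tree = np.array([tree.predict(X) for tree in bag.estimators_])

# Average pairwise correlation between the trees' predictions;
# greedy splitting tends to make this high
corr = np.corrcoef(per_tree)
print(corr[np.triu_indices_from(corr, k=1)].mean())
```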

6
Q

WHAT DOES RANDOM FOREST DO TO AVOID GREEDINESS? P138

A

Where bagged decision trees search through all features to find the best split point, random forest restricts each split to a randomly chosen sample of the features.
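In scikit-learn this corresponds to the max_features hyperparameter (a minimal sketch):

```python
from sklearn.ensemble import RandomForestClassifier

# At each split point, every tree considers only a random subset of
# max_features features instead of greedily searching all of them
forest = RandomForestClassifier(n_estimators=100, max_features="sqrt", random_state=0)
```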

7
Q

WHAT IS A GOOD DEFAULT FOR THE NUMBER OF FEATURES HYPERPARAMETER IN RANDOM FOREST? P138

A

For classification: √p
For regression: p/3
(p = the total number of features)
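For example, with a hypothetical p = 16 features:

```python
import math

p = 16                                  # total number of features (example value)
m_classification = round(math.sqrt(p))  # sqrt(p) -> 4 features per split
m_regression = p // 3                   # p/3    -> 5 features per split
```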

8
Q

WHEN USING THE BOOTSTRAP METHOD, WHAT ARE THE SAMPLES THAT ARE NOT PICKED CALLED? P138

A

Out-Of-Bag samples (OOB)
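A small sketch of which instances end up out-of-bag for a single bootstrap sample, assuming NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10
boot_idx = rng.integers(0, n, size=n)           # bootstrap sample (with replacement)
oob_idx = np.setdiff1d(np.arange(n), boot_idx)  # instances never picked: out-of-bag
print(boot_idx, oob_idx)
```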

9
Q

HOW IS THE PERFORMANCE OF EACH MODEL IN BAGGING AND THE WHOLE ENSEMBLE MEASURED? WHAT IS THIS ESTIMATE OF PERFORMANCE CALLED? P138

A

The performance of each model on its left-out samples, when averaged across all models, can provide an estimated accuracy of the bagged models. This is often called the OOB estimate of performance.
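In scikit-learn this is exposed via oob_score (a sketch on synthetic data):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, random_state=0)

forest = RandomForestClassifier(n_estimators=100, oob_score=True, random_state=0)
forest.fit(X, y)
print(forest.oob_score_)  # OOB estimate of accuracy
```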

10
Q

HOW CAN WE PREPARE DATA FOR BAGGED CART? P139

A

Bagged CART does not require any special data preparation other than a good representation of the problem.
