Chapter 7 Flashcards
(18 cards)
Rationale behind Random Forests
“Wisdom of the crowd”
* Aggregate the predictions of a group of predictors (an ensemble) to get a better prediction
* Ensembles can be applied to different classifiers and regressors
* Random Forest uses many Decision Trees
– One of the most powerful Machine Learning algorithms
Voting Classifiers
Train different classifiers (algorithms) on the same data
Then predict by majority vote
An ensemble of weak learners that are each only slightly better than random (e.g. 51% accurate) can still reach around 75% accuracy, provided there are enough of them and they are sufficiently independent
How do Voting Classifiers compare against individual classifiers?
The voting classifier often outperforms every individual classifier in the ensemble
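A minimal sketch of such a hard-voting ensemble, assuming scikit-learn and a toy make_moons dataset (classifiers and hyperparameters are illustrative, not from the flashcards):

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Three different algorithms trained on the same data
voting_clf = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(random_state=42)),
        ("rf", RandomForestClassifier(random_state=42)),
        ("svc", SVC(random_state=42)),
    ],
    voting="hard",  # predict the class that receives the most votes
)
voting_clf.fit(X_train, y_train)
print(voting_clf.score(X_test, y_test))  # often higher than any single classifier's score
```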
When would you use Bagging and Pasting?
When the same algorithm is used for all the predictors in the ensemble (rather than different algorithms, as in a voting classifier)
How do Bagging and Pasting work?
Use the same algorithm, but train each predictor on different training data
– Each predictor gets a random subset of the training set
Bagging
Sampling with replacement
Pasting
Sampling without replacement
How do Bagging and Pasting models predict?
Aggregation of each predictor’s output:
* Statistical mode (most frequent prediction) for classification
* Average for regression
– The ensemble has roughly the same bias but lower variance than a single predictor trained on the full training set
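A minimal sketch of bagging vs. pasting, assuming scikit-learn and the same toy make_moons data (hyperparameters are illustrative):

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)

bag_clf = BaggingClassifier(
    DecisionTreeClassifier(),  # the same base algorithm for every predictor
    n_estimators=500,
    max_samples=100,           # each tree sees a random subset of 100 instances
    bootstrap=True,            # True = bagging (with replacement); False = pasting
    random_state=42,
)
bag_clf.fit(X, y)
y_pred = bag_clf.predict(X)    # aggregates the 500 trees' predictions per instance
```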
Why do Bagging and Pasting scale well?
Training and predictions can be done in parallel on different CPU cores
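A minimal sketch of that parallelism, assuming scikit-learn (n_jobs=-1 uses all available cores):

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)

bag_clf = BaggingClassifier(
    DecisionTreeClassifier(), n_estimators=500, n_jobs=-1, random_state=42
)
bag_clf.fit(X, y)            # the 500 trees are trained in parallel
y_pred = bag_clf.predict(X)  # predictions are computed in parallel as well
```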
What is Out-of-Bag (OOB) Evaluation?
A bagging classifier samples m instances from a training set of size m
– Sampling with replacement: some training instances (about 37% on average) are never picked
These out-of-bag (OOB) instances can be used for evaluation
– No need for a separate validation set or cross-validation
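A minimal sketch of OOB evaluation, assuming scikit-learn and toy data (hyperparameters are illustrative):

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)

bag_clf = BaggingClassifier(
    DecisionTreeClassifier(),
    n_estimators=500,
    bootstrap=True,   # sampling with replacement, so some instances are left out
    oob_score=True,   # score each tree on the instances it never saw
    random_state=42,
)
bag_clf.fit(X, y)
print(bag_clf.oob_score_)  # estimate of test accuracy without a validation set
```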
Do Random Forests use bagging or pasting?
Bagging
How can Random Forests measure the relative importance of each feature?
By measuring how much the tree nodes that use a feature reduce impurity
– Weighted average across all trees, with each node weighted by the number of training samples it applies to
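A minimal sketch of reading those importances, assuming scikit-learn and the iris dataset:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

iris = load_iris()
rnd_clf = RandomForestClassifier(n_estimators=500, random_state=42)
rnd_clf.fit(iris.data, iris.target)

# Impurity-based importances, averaged over all trees; they sum to 1
for name, score in zip(iris.feature_names, rnd_clf.feature_importances_):
    print(name, round(score, 3))
```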
Explain Boosting
“Boosting” (originally “hypothesis boosting”) combines weak learners into a strong learner
– Learners are trained sequentially
– Each learner tries to correct its predecessor
Most popular boosting methods
– AdaBoost (“Adaptive Boosting”)
– Gradient Boosting
How does AdaBoost work?
Each new predictor corrects its predecessor by paying more attention to the training instances the predecessor underfitted
– The relative weights of misclassified instances are increased for the next iteration
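A minimal sketch of AdaBoost with shallow Decision Trees (“stumps”) as weak learners, assuming scikit-learn and toy data (hyperparameters are illustrative; older scikit-learn versions name the first parameter base_estimator):

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)

ada_clf = AdaBoostClassifier(
    DecisionTreeClassifier(max_depth=1),  # decision stumps as weak learners
    n_estimators=200,
    learning_rate=0.5,
    random_state=42,
)
# Each new stump pays more attention to the instances its predecessors misclassified
ada_clf.fit(X, y)
```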
How does Gradient Boosting work?
– Also uses a sequence of predictors
– Instead of tweaking instance weights as in AdaBoost, each new predictor is fit to the residual errors of its predecessor
– With Decision Trees as base estimators, it is called “Gradient Tree Boosting” or “Gradient Boosted Regression Trees” (GBRT)
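A minimal hand-rolled sketch of fitting to residuals, assuming scikit-learn and toy 1-D data; GradientBoostingRegressor packages the same idea:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.RandomState(42)
X = rng.rand(100, 1) - 0.5
y = 3 * X[:, 0] ** 2 + 0.05 * rng.randn(100)  # noisy quadratic toy data

tree1 = DecisionTreeRegressor(max_depth=2, random_state=42).fit(X, y)
y2 = y - tree1.predict(X)                      # residuals of the first tree
tree2 = DecisionTreeRegressor(max_depth=2, random_state=42).fit(X, y2)
y3 = y2 - tree2.predict(X)                     # residuals of the first two trees
tree3 = DecisionTreeRegressor(max_depth=2, random_state=42).fit(X, y3)

# The ensemble predicts by summing the trees' predictions
X_new = np.array([[0.2]])
y_pred = sum(tree.predict(X_new) for tree in (tree1, tree2, tree3))
```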
Explain Stacking
“Stacking” or “Stacked
Generalization” trains
aggregation function
– Final predictor that aggregates
predictors is called “blender” or
“meta learner”
Training of blender
based on “hold-out set”
– Reserve some training
instances
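A minimal sketch of stacking, assuming scikit-learn (whose StackingClassifier uses cross-validated predictions instead of a single hold-out set) and toy data:

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)

stacking_clf = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(random_state=42)),
        ("svc", SVC(random_state=42)),
    ],
    final_estimator=LogisticRegression(),  # the "blender" / meta learner
    cv=5,  # the blender is trained on out-of-fold predictions of the base models
)
stacking_clf.fit(X, y)
```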