Gradient Boosting Trees Flashcards

(12 cards)

1
Q

What is Gradient Boosting?

A

An ensemble technique where trees are built sequentially. Each new tree corrects the mistakes of the previous trees.

2
Q

How does Gradient Boosting work?

A

Gradient boosting optimizes a loss function; at each iteration, a new tree is added to reduce the residual errors.
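As a quick illustration, a minimal sketch using scikit-learn's GradientBoostingRegressor (the synthetic data here is invented purely for the example):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Synthetic regression data invented for illustration.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=200)

# Each of the 100 trees is fit to the residual errors of the ensemble so far.
model = GradientBoostingRegressor(n_estimators=100, learning_rate=0.1, max_depth=3)
model.fit(X, y)
print(model.predict(X[:3]))
```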

3
Q

What are the steps in gradient boosting?

A

Initial prediction
Residual Calculation
Tree Construction
Update
Repeat
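The five steps above can be sketched from scratch for squared-error regression. This is a simplified illustration (it assumes scikit-learn's DecisionTreeRegressor as the base learner; real libraries add regularization, subsampling, and other refinements):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def gradient_boost(X, y, n_trees=50, learning_rate=0.1, max_depth=2):
    # 1. Initial prediction: the mean value for regression.
    pred = np.full(len(y), y.mean())
    trees = []
    for _ in range(n_trees):                      # 5. Repeat
        residuals = y - pred                      # 2. Residual calculation
        tree = DecisionTreeRegressor(max_depth=max_depth)
        tree.fit(X, residuals)                    # 3. Tree construction
        pred += learning_rate * tree.predict(X)   # 4. Update, scaled by the learning rate
        trees.append(tree)
    return y.mean(), trees

def boosted_predict(init, trees, X, learning_rate=0.1):
    pred = np.full(X.shape[0], init)
    for tree in trees:
        pred += learning_rate * tree.predict(X)
    return pred
```

Function names here (`gradient_boost`, `boosted_predict`) are made up for the sketch.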

4
Q

What is the initial prediction in gradient boosting?

A

Starting with a simple prediction: usually the most frequent class for classification, or the mean value for regression.

5
Q

What is the residual calculation in gradient boosting?

A

Computing the residuals (errors) between the predicted and actual values.
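For squared-error loss, the residuals are simply the differences between actual and predicted values, which are also the negative gradient of the loss (hence the name "gradient" boosting). A tiny illustration with made-up numbers:

```python
import numpy as np

y = np.array([3.0, 5.0, 7.0])      # actual values
pred = np.array([4.0, 4.0, 4.0])   # current model's predictions

residuals = y - pred               # errors the next tree will be fit to
print(residuals)                   # [-1.  1.  3.]
```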

6
Q

What is tree construction in gradient boosting?

A

A new tree is trained to predict the residuals of the previous models

7
Q

What is the update in gradient boosting?

A

The predictions are updated by adding the predictions of the new tree, multiplied by a learning rate to control the contribution of each tree
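Continuing the made-up numbers from the residual card, the update step might look like this (the learning rate of 0.1 is chosen arbitrarily for the sketch):

```python
import numpy as np

pred = np.array([4.0, 4.0, 4.0])         # current ensemble predictions
tree_pred = np.array([-1.0, 1.0, 3.0])   # new tree's predictions of the residuals
learning_rate = 0.1                      # shrinks each tree's contribution

# Update: add the new tree's output, scaled by the learning rate.
pred = pred + learning_rate * tree_pred
print(pred)
```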

8
Q

What is repeat in gradient boosting?

A

Continuing the process and adding trees until the model reaches a specified number of trees or achieves the desired level of accuracy

9
Q

What are the advantages of gradient boosting?

A

High predictive power: it often outperforms random forests
Flexibility: it supports various loss functions
Handles imbalanced data by focusing on the harder-to-predict instances

10
Q

What are the limitations of gradient boosting?

A

Prone to overfitting
Longer training time
Sensitive to hyperparameters (careful tuning of number of trees, learning rate, max depth)
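A sketch of how that tuning is commonly done with a cross-validated grid search (the parameter values below are illustrative, not recommendations):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic data invented for illustration.
rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(150, 2))
y = X[:, 0] * X[:, 1] + rng.normal(scale=0.1, size=150)

# The three hyperparameters the card warns about.
grid = {
    "n_estimators": [50, 100],
    "learning_rate": [0.05, 0.1],
    "max_depth": [2, 3],
}
search = GridSearchCV(GradientBoostingRegressor(), grid, cv=3)
search.fit(X, y)
print(search.best_params_)
```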

11
Q

When should you use Gradient Boosting?

A

When accuracy is critical
When you have imbalanced data
When you have time for hyperparameter tuning
When you have complex problem spaces

Examples - healthcare, apparently DoD

12
Q

When should you use random forest?

A

You need a fast and robust model
You have a large data set
Interpretability is important
You need to avoid overfitting

Examples - banking
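For contrast with the boosting example, a minimal random forest sketch (scikit-learn, with synthetic data invented for illustration):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic binary classification data.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Trees are trained independently on bootstrap samples, so training
# parallelizes well -- one reason random forests are fast and robust.
clf = RandomForestClassifier(n_estimators=100, n_jobs=-1)
clf.fit(X, y)
print(clf.score(X, y))
```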
