Gradient Boosting Trees Flashcards
(12 cards)
What is Gradient Boosting?
An ensemble technique where trees are built sequentially. Each new tree corrects the mistakes of the previous trees.
How does Gradient Boosting work?
Gradient boosting optimizes a loss function; at each iteration, a new tree is added to reduce the residual errors of the current ensemble.
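A minimal sketch of this loop using scikit-learn's GradientBoostingRegressor (the dataset here is synthetic, purely for illustration):

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

# Synthetic regression data, just for illustration
X, y = make_regression(n_samples=200, n_features=5, random_state=0)

# Each of the 100 trees is fit to the residual errors of the ensemble
# built so far; learning_rate scales each new tree's contribution
model = GradientBoostingRegressor(n_estimators=100, learning_rate=0.1)
model.fit(X, y)
print(model.predict(X[:3]))
```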
What are the steps in gradient boosting?
Initial prediction
Residual Calculation
Tree Construction
Update
Repeat
What is the initial prediction in gradient boosting?
Starting with a simple prediction, usually the most frequent class for classification or the mean value for regression
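A one-line sketch of this step for regression, assuming a small NumPy array of targets:

```python
import numpy as np

y = np.array([3.0, 5.0, 7.0, 9.0])   # toy target values

# Initial prediction: every sample starts at the mean of the targets
F0 = np.full_like(y, y.mean())
print(F0)   # [6. 6. 6. 6.]
```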
What is the residual calculation in gradient boosting?
Computing the residuals (errors), the differences between the actual and predicted values
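Continuing the toy example from the previous card (the arrays are illustrative):

```python
import numpy as np

y = np.array([3.0, 5.0, 7.0, 9.0])   # actual values
pred = np.full_like(y, y.mean())     # current predictions (all 6.0)

# Residuals: the part of the signal the next tree will try to explain
residuals = y - pred
print(residuals)   # [-3. -1.  1.  3.]
```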
What is tree construction in gradient boosting?
A new tree is trained to predict the residuals of the current ensemble (the previous trees combined)
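A sketch of this step with scikit-learn's DecisionTreeRegressor, reusing the toy residuals from above:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

X = np.array([[1.0], [2.0], [3.0], [4.0]])     # toy feature matrix
residuals = np.array([-3.0, -1.0, 1.0, 3.0])   # from the previous step

# The tree is fit to the residuals, not to the raw targets
tree = DecisionTreeRegressor(max_depth=2)
tree.fit(X, residuals)
```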
What is the update in gradient boosting?
The predictions are updated by adding the predictions of the new tree, multiplied by a learning rate to control the contribution of each tree
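In code the update is a single line; the 0.1 learning rate below is a common default, not a rule:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([3.0, 5.0, 7.0, 9.0])

pred = np.full_like(y, y.mean())    # current predictions
tree = DecisionTreeRegressor(max_depth=2).fit(X, y - pred)

# Update: add the new tree's predictions, shrunk by the learning rate
learning_rate = 0.1
pred = pred + learning_rate * tree.predict(X)
```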
What is repeat in gradient boosting?
Continuing the process and adding trees until the model reaches a specified number of trees or achieves the desired level of accuracy
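Putting the five steps together, a minimal from-scratch sketch for squared-error regression (the function name and defaults are illustrative assumptions):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def gradient_boost(X, y, n_trees=100, learning_rate=0.1, max_depth=2):
    pred = np.full(len(y), y.mean())             # 1. initial prediction
    trees = []
    for _ in range(n_trees):                     # 5. repeat
        residuals = y - pred                     # 2. residual calculation
        tree = DecisionTreeRegressor(max_depth=max_depth)
        tree.fit(X, residuals)                   # 3. tree construction
        pred += learning_rate * tree.predict(X)  # 4. update
        trees.append(tree)
    return trees, pred
```

To predict on new data, you would start from y.mean() and add learning_rate times each stored tree's prediction.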
What are the advantages of gradient boosting?
High predictive power; it often outperforms random forests
Flexibility; it supports a variety of differentiable loss functions
Handles imbalanced data by focusing on difficult-to-predict instances
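The flexibility point in practice: scikit-learn exposes the loss as a single parameter (option names per recent scikit-learn versions):

```python
from sklearn.ensemble import GradientBoostingRegressor

# Swap the loss function without changing anything else about the model;
# "huber" is more robust to outliers than squared error
model = GradientBoostingRegressor(loss="huber")
# Other built-in options include "squared_error", "absolute_error",
# and "quantile"
```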
What are the limitations of gradient boosting?
Prone to overfitting
Longer training time (trees must be built sequentially)
Sensitive to hyperparameters; requires careful tuning of the number of trees, learning rate, and max depth
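A common mitigation for the overfitting and training-time issues is early stopping; a sketch using scikit-learn's built-in support (the parameter values are illustrative):

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

X, y = make_regression(n_samples=500, n_features=10, random_state=0)

# Hold out 10% of the training data and stop adding trees once the
# validation score has not improved for 10 consecutive iterations
model = GradientBoostingRegressor(
    n_estimators=1000,         # upper bound, rarely reached
    learning_rate=0.05,
    max_depth=3,
    validation_fraction=0.1,
    n_iter_no_change=10,
)
model.fit(X, y)
print(model.n_estimators_)     # trees actually fit before stopping
```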
When should you use Gradient Boosting?
When accuracy is critical
When you have imbalanced data
When you have time for hyperparameter tuning
When you have complex problem spaces
Examples - healthcare; reportedly also defense (DoD)
When should you use Random Forest?
When you need a fast and robust model
When you have a large data set
When interpretability is important
When you need to avoid overfitting
Examples - banking
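For contrast, a random forest builds its trees independently, so training parallelizes easily; a minimal scikit-learn sketch on synthetic data:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=200, n_features=5, random_state=0)

# Trees are trained independently on bootstrap samples, so n_jobs=-1
# fits them in parallel; defaults give a solid baseline with little tuning
model = RandomForestRegressor(n_estimators=100, n_jobs=-1)
model.fit(X, y)
```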