XGBoost MLM Flashcards

1
Q

XGBoost, short for eXtreme Gradient Boosting

A

XGBoost, short for eXtreme Gradient Boosting, is an optimized distributed gradient boosting library that provides a highly efficient, flexible, and portable solution for machine learning tasks.

2
Q
Introduction
A

XGBoost is a decision-tree-based ensemble machine learning algorithm that uses a gradient boosting framework. It is renowned for its execution speed and model performance, and it has been a go-to choice for many winning teams of machine learning competitions.
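
As a quick illustration, here is a minimal sketch of training an XGBoost classifier via its scikit-learn wrapper; the synthetic data and parameter values are illustrative, not recommendations:

```python
import numpy as np
import xgboost as xgb

# Synthetic binary classification data, purely for illustration.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# 100 boosting rounds of depth-3 trees with a 0.1 learning rate.
model = xgb.XGBClassifier(n_estimators=100, max_depth=3, learning_rate=0.1)
model.fit(X, y)
print(model.predict(X[:5]))
```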

3
Q
Gradient Boosting
A

Gradient boosting is a technique that produces a prediction model in the form of an ensemble of weak prediction models, typically decision trees. Like other boosting methods, it builds the model in a stage-wise fashion, but it generalizes them by allowing optimization of an arbitrary differentiable loss function.
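
The idea can be sketched in a few lines for squared-error loss, where the negative gradient is simply the residual. This toy loop (using scikit-learn trees, not XGBoost itself) shows the stage-wise construction:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Toy 1-D regression data, purely for illustration.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=200)

prediction = np.full_like(y, y.mean())   # stage 0: a constant model
for _ in range(100):                     # each stage adds one small tree
    residuals = y - prediction           # negative gradient of squared loss
    tree = DecisionTreeRegressor(max_depth=2).fit(X, residuals)
    prediction += 0.1 * tree.predict(X)  # shrunken update (learning rate 0.1)

print("training MSE:", np.mean((y - prediction) ** 2))
```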

4
Q
Speed and Performance
A

XGBoost is recognized for its speed and performance. The core algorithm is parallelizable, so it can harness all of the processing power of modern multi-core computers, and it can also be distributed across a cluster of machines to handle larger datasets.
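
As a sketch (the value below is illustrative), thread-level parallelism is exposed through n_jobs in the scikit-learn wrapper, or nthread in the native parameter dict:

```python
import xgboost as xgb

# Use four CPU threads for tree construction; omit the setting to let
# XGBoost use all available cores by default.
model = xgb.XGBRegressor(n_estimators=200, n_jobs=4)
# Native API equivalent: params = {"nthread": 4, ...}
```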

5
Q
Regularization
A

XGBoost has built-in regularization that helps reduce overfitting: its training objective adds L1 (alpha) and L2 (lambda) penalty terms on the leaf weights, which is why XGBoost is sometimes described as a ‘regularized boosting’ technique.
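
A sketch of where those penalties appear in the API (values are illustrative): reg_lambda is the L2 penalty on leaf weights, reg_alpha the L1 penalty, and gamma the minimum loss reduction required to keep a split:

```python
import xgboost as xgb

model = xgb.XGBClassifier(
    n_estimators=200,
    reg_lambda=1.0,  # L2 penalty on leaf weights
    reg_alpha=0.1,   # L1 penalty on leaf weights
    gamma=0.5,       # minimum split gain (per-split complexity penalty)
)
```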

6
Q
Handling Missing Values
A

XGBoost has a built-in routine for handling missing values (sparsity-aware split finding). At each split it learns a default direction for instances whose value is missing, choosing whichever branch yields the greater reduction in the loss function, so no explicit imputation is required.
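
In practice this means NaNs can be passed straight in; a minimal sketch with synthetic data:

```python
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
X[rng.random(X.shape) < 0.1] = np.nan        # knock out ~10% of entries
y = (np.nan_to_num(X[:, 0]) > 0).astype(int)

# NaN marks missing cells; XGBoost learns a default branch for them
# at each split instead of imputing values beforehand.
dtrain = xgb.DMatrix(X, label=y, missing=np.nan)
booster = xgb.train({"objective": "binary:logistic"}, dtrain,
                    num_boost_round=50)
```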

7
Q
Tree Pruning
A

Traditional gradient boosting implementations prune greedily, stopping a branch as soon as a candidate split shows no improvement in the loss. XGBoost uses a more principled approach to controlling model complexity: it grows each tree depth-first to a specified maximum depth and then prunes backwards, removing splits whose loss reduction falls below a threshold (gamma). A split that looks unprofitable on its own can therefore survive if it enables profitable splits deeper in the tree.
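
The two knobs that control this in practice are max_depth and gamma (the values below are illustrative):

```python
import xgboost as xgb

params = {
    "objective": "reg:squarederror",
    "max_depth": 6,  # grow each tree depth-first to this depth...
    "gamma": 1.0,    # ...then prune back splits whose gain is below gamma
}
# booster = xgb.train(params, dtrain)  # dtrain: your xgb.DMatrix
```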

8
Q
Built-in Cross-Validation
A

XGBoost allows a user to run cross-validation at each iteration of the boosting process, making it easy to obtain the optimal number of boosting rounds in a single run.
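
A sketch using xgb.cv with early stopping on synthetic data: the run halts once the held-out metric stops improving, and the length of the returned results frame is the chosen number of rounds:

```python
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 8))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
dtrain = xgb.DMatrix(X, label=y)

cv_results = xgb.cv(
    {"objective": "binary:logistic", "eval_metric": "logloss"},
    dtrain,
    num_boost_round=500,       # upper bound on rounds
    nfold=5,                   # 5-fold cross-validation at each round
    early_stopping_rounds=20,  # stop once test logloss stalls for 20 rounds
)
print("optimal number of rounds:", len(cv_results))
```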

9
Q
Flexibility
A

XGBoost allows users to define custom optimization objectives and evaluation metrics, which makes it adaptable to a wide range of problems beyond the built-in regression, classification, and ranking objectives.
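
With the native API, a custom objective is a function returning the per-row gradient and hessian of the loss with respect to the raw prediction. A sketch using plain squared error (gradient pred - y, hessian 1), on synthetic data:

```python
import numpy as np
import xgboost as xgb

def squared_error_obj(preds, dtrain):
    labels = dtrain.get_label()
    grad = preds - labels       # dL/dpred for L = 0.5 * (pred - y)^2
    hess = np.ones_like(preds)  # d^2L/dpred^2
    return grad, hess

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 2.0 * X[:, 0] + rng.normal(scale=0.1, size=200)
dtrain = xgb.DMatrix(X, label=y)

booster = xgb.train({"max_depth": 3}, dtrain, num_boost_round=50,
                    obj=squared_error_obj)
```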

10
Q
Strengths and Limitations
A

XGBoost performs well in many predictive tasks and is often a key component of winning entries in machine learning competitions. However, due to its complexity and flexibility, it requires careful tuning of the hyperparameters for best performance.
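
Because the scikit-learn wrapper is a standard estimator, that tuning can be done with the usual tools; a sketch with a deliberately small, illustrative grid:

```python
import xgboost as xgb
from sklearn.model_selection import GridSearchCV

param_grid = {
    "max_depth": [3, 5, 7],
    "learning_rate": [0.01, 0.1],
    "n_estimators": [100, 300],
}
search = GridSearchCV(xgb.XGBClassifier(), param_grid, cv=3)
# search.fit(X, y)  # X, y: your training data
# print(search.best_params_)
```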

11
Q
Applications
A

XGBoost has been used successfully in many machine learning and data science competitions and has a wide range of applications in industries such as banking, e-commerce, and healthcare.
