IFN580 Week 4: DT Modelling (11%) Flashcards
(21 cards)
Which of the following is not an advantage of a decision tree:
a) Decision making is explainable.
b) Fast inference time.
c) Reasonable training time.
d) Can learn non-linear decision boundaries.
e) Can only handle a small number of features
e) Can only handle a small number of features
True or false? Random forests can reduce variance (overfitting) through the use of
bagging (bootstrap aggregating).
TRUE
How does a boosting model (e.g. XGBoost) achieve greater performance over a
single tree?
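It fits each new tree to the residual errors of the model built so far and adds that tree's
predictions to the ensemble. This process is repeated many times, so each round corrects some of the remaining error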
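A minimal sketch of the residual-fitting idea, assuming squared-error loss, one numeric feature, depth-1 trees (stumps) and a made-up learning rate of 0.5. The data and the helper names `fit_stump`, `boost` and `mse` are hypothetical; XGBoost itself adds regularisation and other refinements that are not shown here.

```python
def fit_stump(xs, ys):
    """Fit a one-split regression tree (stump) by minimising squared error."""
    best = None
    for t in sorted(set(xs))[:-1]:  # skip the largest value so the right side is never empty
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, t, left_mean, right_mean)
    _, t, left_mean, right_mean = best
    return lambda x: left_mean if x <= t else right_mean


def boost(xs, ys, n_rounds=20, learning_rate=0.5):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    base = sum(ys) / len(ys)
    stumps = []
    preds = [base] * len(xs)
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]  # errors of the model so far
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump(x) for p, x in zip(preds, xs)]
    return lambda x: base + learning_rate * sum(s(x) for s in stumps)


def mse(model, xs, ys):
    """Mean squared error of a model on the given data."""
    return sum((y - model(x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical data with three rough levels, which a single stump cannot capture.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.0, 1.2, 0.9, 3.1, 3.0, 2.9, 5.2, 5.0]

single_tree_error = mse(fit_stump(xs, ys), xs, ys)
boosted_error = mse(boost(xs, ys), xs, ys)
```

On this toy data the boosted model has a lower training error than the single stump, because later stumps fit what earlier ones left unexplained. This is training error only; it says nothing about generalisation.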
What approach do Decision Trees belong to?
Supervised Learning
What is the purpose of a Decision Tree?
To split the data based on ATTRIBUTES in order to classify/predict
What is entropy?
The measure of impurity
$E(S) = -\sum_{i} p_i \log_2 p_i$, where $p_i$ is the proportion of samples in $S$ that belong to class $i$
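A short sketch of how entropy can be computed from class labels, assuming the standard Shannon form (the negative sum of p_i times log2 of p_i over the classes present). The `information_gain` helper and the example labels are illustrative assumptions, showing one common way entropy is used to score a split.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    n = len(labels)
    counts = Counter(labels)
    # Only classes that actually occur are counted, so log2(0) never arises.
    return -sum((c / n) * log2(c / n) for c in counts.values())


def information_gain(parent, left, right):
    """Entropy of the parent minus the size-weighted entropy of the two children."""
    n = len(parent)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted


parent = ["yes", "yes", "no", "no"]

# A split that separates the classes completely removes all impurity.
perfect_gain = information_gain(parent, ["yes", "yes"], ["no", "no"])

# A split whose children are as mixed as the parent gains nothing.
useless_gain = information_gain(parent, ["yes", "no"], ["yes", "no"])
```

An evenly mixed two-class node has entropy 1 bit and a pure node has entropy 0, so the first split gains the full 1 bit and the second gains none.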
What is the root node?
The starting point
What is a decision node?
Internal nodes where the data splits
What is a branch?
The path from one node to another
What are leaf nodes?
Terminal nodes that contain the prediction
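The four terms above (root node, decision node, branch, leaf node) can be pictured as a small data structure. This is only a sketch: the class names, the features `age` and `income`, and the thresholds are all hypothetical.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    """Terminal node: holds the prediction."""
    prediction: str


@dataclass
class DecisionNode:
    """Internal node: tests one feature and sends the sample down a branch."""
    feature: str
    threshold: float
    left: "Node"   # branch followed when the feature value <= threshold
    right: "Node"  # branch followed when the feature value > threshold


Node = Union[Leaf, DecisionNode]


def predict(node, sample):
    """Walk from the root along branches until a leaf is reached."""
    while isinstance(node, DecisionNode):
        if sample[node.feature] <= node.threshold:
            node = node.left
        else:
            node = node.right
    return node.prediction


# The root is simply the first decision node; everything hangs off it.
root = DecisionNode(
    feature="age", threshold=30.0,
    left=Leaf("no"),
    right=DecisionNode(
        feature="income", threshold=50000.0,
        left=Leaf("no"),
        right=Leaf("yes"),
    ),
)

young = predict(root, {"age": 25, "income": 80000})
older_high_income = predict(root, {"age": 40, "income": 80000})
older_low_income = predict(root, {"age": 40, "income": 20000})
```

Because every prediction is a path of feature tests from the root to a leaf, the decision can be read off directly, which is where the explainability of a decision tree comes from.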
What is the Gini formula?
$Gini(S) = 1 - \sum_{i} p_i^2$, where $p_i$ is the proportion of samples in $S$ that belong to class $i$
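A sketch of Gini impurity in code, assuming the usual form (1 minus the sum of squared class proportions), followed by one possible way to use it: trying every threshold on a numeric feature and keeping the one with the lowest size-weighted impurity. The `best_threshold` helper and the data are illustrative assumptions.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_threshold(values, labels):
    """Try each candidate threshold and keep the lowest weighted Gini."""
    best_t, best_score = None, float("inf")
    for t in sorted(set(values))[:-1]:  # skip the largest value so the right side is never empty
        left = [lab for v, lab in zip(values, labels) if v <= t]
        right = [lab for v, lab in zip(values, labels) if v > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best_score:
            best_t, best_score = t, score
    return best_t, best_score


gini_pure = gini(["a", "a", "a", "a"])    # 0.0: only one class present
gini_mixed = gini(["a", "a", "b", "b"])   # 0.5: two classes evenly mixed

values = [1, 2, 3, 10, 11, 12]
labels = ["a", "a", "a", "b", "b", "b"]
threshold, score = best_threshold(values, labels)  # (3, 0.0): both children are pure
```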
What is Pruning?
A technique that removes sections of the tree that contribute little to predictive performance. It reduces overfitting and helps the model generalise to unseen data
A large tree depth results in:
Low bias and high variance, which leads to overfitting
What is a decision boundary?
The boundary (a line in two dimensions, a surface in higher dimensions) that separates different classes or predicted values in the feature space
What is a maximal tree?
A tree grown to its maximum depth, until no further splits are possible. It has low bias and high variance, so it tends to overfit
What is an optimal tree?
A tree whose size balances bias and variance, so that it generalises well to unseen data
What is ensemble modelling?
When multiple models (for example, several trees) are combined to improve prediction
What is random forest?
Bagging applied to decision trees, with one addition: each node is split using only a random subset of the features
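A sketch of the feature-subsetting step only, not a full random forest. The subset size (roughly the square root of the number of features) is an assumed, commonly used heuristic, and the feature names and the `candidate_features` helper are hypothetical.

```python
import math
import random


def candidate_features(all_features, rng):
    """Pick the random subset of features one node may consider for its split."""
    # Assumed subset size: about sqrt(number of features), at least 1.
    k = max(1, math.isqrt(len(all_features)))
    return rng.sample(all_features, k)


rng = random.Random(0)  # fixed seed only so the sketch is repeatable

features = ["age", "income", "tenure", "balance", "region",
            "plan", "usage", "calls", "visits"]

# Each node draws its own subset, so different nodes and different trees
# end up splitting on different features.
node_a = candidate_features(features, rng)
node_b = candidate_features(features, rng)
```

Restricting each split to a random subset makes the individual trees less alike, so averaging them removes more variance than bagging identical-looking trees would.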
What is Bagging?
An ensemble method that trains multiple models on bootstrap samples (random samples of the data drawn with replacement) and aggregates their predictions
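A minimal sketch of bagging for classification, assuming bootstrap samples drawn with replacement and a majority vote as the aggregation step. The base model here is a one-split stump on a single numeric feature; the data and the helper names are hypothetical.

```python
import random
from collections import Counter


def majority(labels):
    """Most frequent label in a list."""
    return Counter(labels).most_common(1)[0][0]


def bootstrap(data, rng):
    """Sample len(data) rows with replacement."""
    return [rng.choice(data) for _ in data]


def fit_stump(data):
    """One-split classifier on (x, label) pairs, chosen to minimise training errors."""
    xs = sorted({x for x, _ in data})
    if len(xs) < 2:  # degenerate sample: fall back to a constant prediction
        label = majority([y for _, y in data])
        return lambda x: label
    best = None
    for t in xs[:-1]:
        left = majority([y for x, y in data if x <= t])
        right = majority([y for x, y in data if x > t])
        errors = sum((left if x <= t else right) != y for x, y in data)
        if best is None or errors < best[0]:
            best = (errors, t, left, right)
    _, t, left, right = best
    return lambda x: left if x <= t else right


def bagging(data, n_models, rng):
    """Train one stump per bootstrap sample and combine them by majority vote."""
    models = [fit_stump(bootstrap(data, rng)) for _ in range(n_models)]
    return lambda x: majority([m(x) for m in models])


rng = random.Random(0)  # fixed seed only so the sketch is repeatable
data = [(1, "a"), (2, "a"), (3, "a"), (4, "a"),
        (7, "b"), (8, "b"), (9, "b"), (10, "b")]

sample = bootstrap(data, rng)
ensemble = bagging(data, n_models=25, rng=rng)
low_prediction = ensemble(0)
high_prediction = ensemble(11)
```

Each stump sees a slightly different sample and so places its threshold differently; the vote smooths out those differences, which is how bagging reduces variance.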
What is boosting?
When multiple trees are built sequentially, each one attempting to fix the errors of the previous ones, e.g. XGBoost
What is early stopping?
A pre-pruning technique that stops the tree from growing further once a condition is met, for example by setting a minimum size for the tree leaves
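A sketch of what a pre-pruning check might look like while a tree is being grown. The minimum leaf size comes from the card above; the depth limit is an added assumption, included because it is another commonly used stopping rule. The function name and the default values are hypothetical.

```python
def split_allowed(left_size, right_size, depth, min_samples_leaf=5, max_depth=3):
    """Return True only if a proposed split passes the pre-pruning checks."""
    # Assumed extra rule: never grow beyond a fixed depth.
    if depth >= max_depth:
        return False
    # Rule from the card: both resulting leaves must be large enough.
    return left_size >= min_samples_leaf and right_size >= min_samples_leaf


balanced_split = split_allowed(left_size=10, right_size=10, depth=1)
tiny_leaf_split = split_allowed(left_size=2, right_size=18, depth=1)
too_deep_split = split_allowed(left_size=10, right_size=10, depth=3)
```

Here only the first split is accepted: the second would create a leaf with 2 samples, and the third is already at the depth limit. A rejected split means the node stays a leaf, so the tree stops growing there.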