Chapter 6 Flashcards
(19 cards)
What tasks can decision trees accomplish?
– Classification
– Regression
– Multioutput tasks
Do decision trees require feature scaling or centering?
Decision Tree training does not require feature scaling or centering
Define "samples" for a decision tree node
The number of training instances the node applies to
Define "value" for a decision tree node
The number of training instances of each class that the node applies to
Define "class" for a decision tree node
The class the node predicts – the majority class among its training instances
What type of trees does Scikit-Learn's CART algorithm produce?
Binary Trees
What value of the Gini impurity makes a node "pure"?
gini = 0
In the Gini impurity formula G_i = 1 − Σ_k (p_i,k)², what does p_i,k stand for?
p_i,k is the ratio of class k instances among the training instances of the i-th node
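The formula can be checked with a few lines of plain Python (the helper name `gini` is just for illustration):

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a node: G = 1 - sum_k (p_k)^2, where p_k is the
    ratio of class-k instances among the instances in the node."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

print(gini(["a", "a", "a"]))       # a pure node: 0.0
print(gini(["a", "a", "b", "b"]))  # a 50/50 node: 0.5
```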
How can a Decision Tree estimate class probabilities?
It returns the ratio of each class among the training instances in the leaf node the instance falls into
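A minimal Scikit-Learn sketch of this (assuming Scikit-Learn is installed; the petal-only feature choice and `max_depth=2` are illustrative):

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X = iris.data[:, 2:]  # petal length and width only (illustrative choice)
y = iris.target

tree_clf = DecisionTreeClassifier(max_depth=2, random_state=42)
tree_clf.fit(X, y)

# predict_proba returns, for each class, the ratio of training
# instances of that class in the leaf node the sample falls into.
print(tree_clf.predict_proba([[5.0, 1.5]]))
```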
What does CART stand for?
Classification and Regression Tree
How does the CART algorithm work?
Repeated splitting of the training set:
– Find the split that produces the purest subsets (weighted by their size)
– Stop when the maximum depth is reached or no split can be found that reduces impurity
– It is a greedy algorithm
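The greedy split search can be sketched in plain Python (stdlib only; `best_split` and the toy data are illustrative, not Scikit-Learn's actual implementation):

```python
from collections import Counter

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y):
    """Greedy CART-style search: try every (feature, threshold) pair and
    return the one minimizing the size-weighted Gini impurity of the
    resulting subsets."""
    n = len(y)
    best, best_cost = None, float("inf")
    for j in range(len(X[0])):                   # each feature
        for t in sorted({row[j] for row in X}):  # each candidate threshold
            left = [y[i] for i in range(n) if X[i][j] <= t]
            right = [y[i] for i in range(n) if X[i][j] > t]
            if not left or not right:
                continue
            cost = len(left) / n * gini(left) + len(right) / n * gini(right)
            if cost < best_cost:
                best, best_cost = (j, t), cost
    return best, best_cost

# One feature, perfectly separable: the best split is x <= 2.
print(best_split([[1], [2], [3], [4]], [0, 0, 1, 1]))  # ((0, 2), 0.0)
```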
Which is faster to compute, entropy or Gini impurity?
Gini
What is the advantage of Entropy over Gini?
Entropy provides slightly more balanced trees
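For comparison, entropy in plain Python (the helper name is illustrative); the log calls are the reason it is slightly slower to compute than Gini:

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Entropy of a node: H = -sum_k p_k * log2(p_k), summed over the
    classes present in the node."""
    n = len(labels)
    return sum(-(c / n) * log2(c / n) for c in Counter(labels).values())

# A pure node has entropy 0; a 50/50 node has entropy 1 bit.
print(entropy(["a", "a", "a"]))
print(entropy(["a", "a", "b", "b"]))
```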
Decision Trees are a nonparametric model; what does that mean?
– The number of parameters is not fixed before training, so the model structure adapts freely to the data
– Left unconstrained, this typically leads to overfitting
Hyperparameters to constrain Decision Trees to avoid overfitting
– Maximum tree depth: max_depth
– Minimum samples in node before splitting: min_samples_split
– Minimum samples in leaf: min_samples_leaf or min_weight_fraction_leaf
– Maximum leaf nodes: max_leaf_nodes
– Maximum features evaluated for splitting: max_features
How can the max_* and min_* hyperparameters be tuned to regularize a model?
Increasing min_* hyperparameters and decreasing max_* hyperparameters will regularize the model
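A small Scikit-Learn sketch of this effect (assuming Scikit-Learn is installed; the moons dataset and `min_samples_leaf=4` are illustrative values):

```python
from sklearn.datasets import make_moons
from sklearn.tree import DecisionTreeClassifier

X, y = make_moons(n_samples=100, noise=0.25, random_state=53)

# Unconstrained tree: grows until every leaf is pure, so it fits
# the training set perfectly and likely overfits.
deep_tree = DecisionTreeClassifier(random_state=42).fit(X, y)

# Increasing a min_* hyperparameter constrains growth and
# regularizes the model.
reg_tree = DecisionTreeClassifier(min_samples_leaf=4,
                                  random_state=42).fit(X, y)

print(deep_tree.get_depth(), reg_tree.get_depth())
print(deep_tree.score(X, y), reg_tree.score(X, y))
```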
Benefits of decision trees
– Easy to use
– Easy to understand and interpret
– Versatile and powerful
Shortcomings of decision trees
– Orthogonal decision boundaries
– Sensitive to small variations in training data
* A very different model may result if just one training instance is removed
* The stochastic training algorithm may produce different models on the same data
(unless the random_state hyperparameter is set)
How do Random Forests address this instability?
By averaging predictions over many trees
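A minimal sketch with Scikit-Learn's RandomForestClassifier (assuming Scikit-Learn is installed; the dataset and `n_estimators=100` are illustrative):

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# A Random Forest averages the predictions of many randomized trees,
# which reduces the variance (instability) of a single decision tree.
forest = RandomForestClassifier(n_estimators=100, random_state=42)
forest.fit(X_train, y_train)
print(forest.score(X_test, y_test))
```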