Lesson 2 Flashcards
(9 cards)
What is exhaustive search?
Decision trees
Trying every possible way to split the data set and constructing every possible tree
The number of possible trees grows exponentially with every additional feature
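A rough sketch of how quickly the number of trees explodes. It rests on a simplified counting model that is an assumption, not something from the lesson: every feature is binary, a feature is asked at most once on any path, and leaves are counted by shape only. The helper name `count_trees` is hypothetical.

```python
def count_trees(n_features):
    """Count tree shapes over n_features binary features (simplified model)."""
    # With no features left, the only possible tree is a single leaf.
    if n_features == 0:
        return 1
    # Otherwise the tree is a leaf, or a root asking one of n_features
    # questions with two subtrees built from the remaining features.
    smaller = count_trees(n_features - 1)
    return 1 + n_features * smaller * smaller

for n in range(1, 5):
    print(n, count_trees(n))
# 1 2
# 2 9
# 3 244
# 4 238145
```

Even under these restrictive assumptions, four features already give hundreds of thousands of candidate trees, which is why trying every tree is impractical.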
What does a high impurity mean?
A node with high impurity contains a more even mix of classes; impurity is highest when all classes are equally represented
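A minimal sketch of one common impurity measure, Gini impurity. The card does not say which measure the lesson uses, so the choice of Gini and the name `gini_impurity` are assumptions made for illustration.

```python
from collections import Counter

def gini_impurity(labels):
    """Gini impurity of a node: 1 minus the sum of squared class proportions."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((n / total) ** 2 for n in counts.values())

pure_node = ["yes"] * 10                 # only one class
mixed_node = ["yes"] * 5 + ["no"] * 5    # classes equally represented

print(gini_impurity(pure_node))   # 0.0
print(gini_impurity(mixed_node))  # 0.5
```

For two classes, 0.5 is the largest value Gini impurity can take, which matches the "equal distribution" case on the card.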
How can you calculate the depth of a decision tree?
It is the maximum number of questions you can ask before reaching an answer, i.e. the longest path from the root to a leaf
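A small sketch of that calculation, assuming a tree is stored as nested dictionaries with hypothetical keys `"question"`, `"yes"` and `"no"`, and that anything that is not a dictionary is a leaf (an answer).

```python
def tree_depth(node):
    """Number of questions on the longest path from this node to a leaf."""
    # A leaf holds an answer and asks no further questions.
    if not isinstance(node, dict):
        return 0
    # One question here, plus the deeper of the two branches.
    return 1 + max(tree_depth(node["yes"]), tree_depth(node["no"]))

tree = {
    "question": "age > 30?",
    "yes": {"question": "income > 50k?", "yes": "buy", "no": "skip"},
    "no": "skip",
}

print(tree_depth(tree))  # 2
```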
What are 7 advantages of decision trees?
- easy to understand
- can be visualised
- handles both numerical and categorical data
- requires little data preparation
- captures non-linear relationships
- provides feature importance
- fast to train and predict
What are 4 drawbacks of decision trees?
- prone to overfitting
- sensitive to minor details (noise) in the training data
- unstable: small changes in the data can produce a very different tree
- biased toward features with more levels
What is ensemble learning?
When ‘weak learner’ models are combined, either sequentially or in parallel.
“The wisdom of the crowd”
What is Bagging?
Decision trees
Training models on different training sets (bootstrap samples drawn with replacement) and combining their predictions to reduce variance
A parallel ensemble learning technique
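A minimal sketch of bagging in plain Python. The weak learner here is a one-question threshold rule ("predict 1 if x > t") rather than a full decision tree, and the toy data and all function names are hypothetical choices made to keep the example short.

```python
import random
from collections import Counter

def bootstrap_sample(data, rng):
    """Draw len(data) points with replacement to form a new training set."""
    return [rng.choice(data) for _ in data]

def train_stump(sample):
    """Weak learner: pick the threshold t with the fewest training errors."""
    xs = [x for x, _ in sample]
    candidates = [min(xs) - 1] + sorted(set(xs))

    def errors(t):
        # The rule predicts class 1 when x > t, otherwise class 0.
        return sum((x > t) != bool(y) for x, y in sample)

    return min(candidates, key=errors)

def bagged_predict(thresholds, x):
    """Combine the weak learners by majority vote."""
    votes = Counter(int(x > t) for t in thresholds)
    return votes.most_common(1)[0][0]

data = [(1, 0), (2, 0), (3, 0), (4, 0), (6, 1), (7, 1), (8, 1), (9, 1)]
rng = random.Random(0)

# Each model sees its own bootstrap sample; the models do not depend on
# one another, so they could be trained in parallel.
thresholds = [train_stump(bootstrap_sample(data, rng)) for _ in range(25)]

print(bagged_predict(thresholds, 0), bagged_predict(thresholds, 10))  # 0 1
```

Individual thresholds vary from sample to sample, but the vote averages that variation out, which is the variance reduction the card refers to.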
What is Random Forest?
An ensemble of decision trees that uses different training sets and different subsets of features to reduce variance
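A sketch of the extra randomness Random Forest adds on top of different training sets: each tree also sees only some of the features. For simplicity this draws one feature subset per tree, which is an assumption of the example; common implementations draw a fresh subset at every split instead. The data values and the name `draw_tree_inputs` are hypothetical, and labels are left out (they would be sampled with the same row indices).

```python
import random

def draw_tree_inputs(rows, n_sub_features, rng):
    """Return a bootstrap sample of rows restricted to a random feature subset."""
    n_features = len(rows[0])
    # Different training set: sample row indices with replacement.
    row_idx = [rng.randrange(len(rows)) for _ in rows]
    # Different features: pick a subset of columns without replacement.
    feat_idx = sorted(rng.sample(range(n_features), n_sub_features))
    sample = [[rows[i][j] for j in feat_idx] for i in row_idx]
    return sample, feat_idx

rows = [
    [5.1, 3.5, 1.4, 0.2],
    [4.9, 3.0, 1.4, 0.2],
    [6.2, 3.4, 5.4, 2.3],
    [5.9, 3.0, 5.1, 1.8],
]
rng = random.Random(42)

for tree_id in range(3):
    sample, features = draw_tree_inputs(rows, 2, rng)
    print(tree_id, features, len(sample))
```

Each tree would then be trained on its own `sample`, and the trees' predictions combined by voting, as in bagging.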