Decision Trees Flashcards

1
Q

What are decision trees? 👶

A

A decision tree is a supervised learning algorithm used mostly for classification, although it also handles regression: it works for both categorical and continuous dependent variables.

The algorithm splits the population into two or more homogeneous sets, choosing the most significant attributes (independent variables) so that the resulting groups are as distinct as possible.

A decision tree is a flowchart-like tree structure, where each internal node (non-leaf node) denotes a test on an attribute, each branch represents an outcome of the test, and each leaf node (or terminal node) holds a value for the target variable.

Common splitting criteria include Gini impurity, information gain (based on entropy), and chi-square.
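To make the impurity measures concrete, here is a minimal sketch of Gini impurity, entropy, and information gain for a list of class labels. The function names are illustrative and not taken from any particular library.

```python
from collections import Counter
from math import log2

def gini(labels):
    """Gini impurity: 1 - sum(p_k^2) over the class proportions p_k."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    """Shannon entropy in bits: sum(-p_k * log2(p_k))."""
    n = len(labels)
    return sum(-(c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(parent, left, right):
    """Parent entropy minus the size-weighted entropy of the two children."""
    n = len(parent)
    weighted = len(left) / n * entropy(left) + len(right) / n * entropy(right)
    return entropy(parent) - weighted

mixed = ["a", "a", "b", "b"]
print(gini(mixed))     # 0.5: a 50/50 node is maximally impure for two classes
print(entropy(mixed))  # 1.0 bit
print(information_gain(mixed, ["a", "a"], ["b", "b"]))  # 1.0: a perfect split
```

A pure node scores 0 under both measures, which is why lower impurity after a split means more homogeneous groups.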

2
Q

How do we train Decision Trees?

A

1. Start at the root node.
2. For each variable X, find the set S that minimizes the sum of the node impurities in the two child nodes, then choose the split {X, S} that gives the minimum over all X and S.
3. If a stopping criterion is reached, exit. Otherwise, apply step 2 to each child node in turn.
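The split search in the steps above can be sketched as an exhaustive scan. This sketch assumes numeric features, takes S to be the set {x <= t} for a threshold t, and weights each child's Gini impurity by its share of the samples (a common variant of "sum of the node impurities"). The function name `best_split` is hypothetical.

```python
from collections import Counter

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_impurity) of the best binary split."""
    best = None
    n = len(rows)
    for j in range(len(rows[0])):                   # each variable X
        for t in sorted({row[j] for row in rows}):  # each candidate set S = {x <= t}
            left = [y for row, y in zip(rows, labels) if row[j] <= t]
            right = [y for row, y in zip(rows, labels) if row[j] > t]
            if not left or not right:
                continue                            # skip splits with an empty child
            score = len(left) / n * gini(left) + len(right) / n * gini(right)
            if best is None or score < best[2]:
                best = (j, t, score)
    return best

rows = [[2.0, 7.0], [3.0, 1.0], [6.0, 8.0], [7.0, 2.0]]
labels = ["no", "no", "yes", "yes"]
print(best_split(rows, labels))  # (0, 3.0, 0.0): feature 0 at x <= 3.0 separates the classes
```

Training then repeats this search on each child node until a stopping criterion is met.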

3
Q

What are the main parameters of the decision tree model? 👶

A

maximum tree depth
minimum samples per leaf node
impurity criterion
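As a rough illustration of how these three parameters shape a tree, the sketch below grows a tiny tree as nested dicts. The parameter names `max_depth`, `min_samples_leaf`, and `criterion` follow the naming used in scikit-learn. The builder itself is a hypothetical, simplified implementation that assumes numeric features and `min_samples_leaf >= 1`.

```python
from collections import Counter
from math import log2

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    n = len(labels)
    return sum(-(c / n) * log2(c / n) for c in Counter(labels).values())

def build_tree(rows, labels, max_depth=3, min_samples_leaf=1, criterion=gini, depth=0):
    """Grow a tree as nested dicts; a leaf is {'predict': majority_class}."""
    leaf = {"predict": Counter(labels).most_common(1)[0][0]}
    # Stop when the depth budget is spent or the node is already pure.
    if depth >= max_depth or criterion(labels) == 0.0:
        return leaf
    best = None
    for j in range(len(rows[0])):
        for t in sorted({r[j] for r in rows}):
            left = [i for i, r in enumerate(rows) if r[j] <= t]
            right = [i for i, r in enumerate(rows) if r[j] > t]
            # Reject splits that would create a leaf smaller than allowed.
            if len(left) < min_samples_leaf or len(right) < min_samples_leaf:
                continue
            score = sum(len(s) * criterion([labels[i] for i in s]) for s in (left, right))
            if best is None or score < best[0]:
                best = (score, j, t, left, right)
    if best is None:
        return leaf
    _, j, t, left, right = best
    children = [
        build_tree([rows[i] for i in s], [labels[i] for i in s],
                   max_depth, min_samples_leaf, criterion, depth + 1)
        for s in (left, right)
    ]
    return {"feature": j, "threshold": t, "left": children[0], "right": children[1]}

def tree_depth(node):
    """Number of splits on the longest root-to-leaf path."""
    if "predict" in node:
        return 0
    return 1 + max(tree_depth(node["left"]), tree_depth(node["right"]))

rows = [[1.0], [2.0], [3.0], [4.0]]
labels = ["a", "b", "a", "b"]
print(tree_depth(build_tree(rows, labels, max_depth=1)))                      # 1
print(tree_depth(build_tree(rows, labels, max_depth=3)))                      # 3
print(tree_depth(build_tree(rows, labels, max_depth=3, min_samples_leaf=2)))  # 1
```

Lowering `max_depth` or raising `min_samples_leaf` both yield a smaller tree, which is the usual way to limit overfitting. Passing `criterion=entropy` swaps the impurity measure without changing anything else.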

4
Q

How do we handle categorical variables in decision trees? ⭐️

A

Some decision tree algorithms handle categorical variables out of the box; others do not. For those that do not, we can transform the categorical variables first, e.g. with a binary or a one-hot encoder.
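For illustration, a one-hot encoder can be sketched in a few lines of plain Python. The helper name `one_hot_encode` is hypothetical; in practice a library encoder would normally be used.

```python
def one_hot_encode(values):
    """Map each category to a 0/1 indicator vector, one column per distinct category."""
    categories = sorted(set(values))
    index = {cat: i for i, cat in enumerate(categories)}
    encoded = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1  # switch on the column for this value's category
        encoded.append(row)
    return categories, encoded

categories, encoded = one_hot_encode(["red", "green", "red", "blue"])
print(categories)  # ['blue', 'green', 'red']
print(encoded)     # [[0, 0, 1], [0, 1, 0], [0, 0, 1], [1, 0, 0]]
```

Each resulting column is a binary feature, so a tree that only supports numeric thresholds can split on it directly.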

5
Q

What are the benefits of a single decision tree compared to more complex models? ⭐️

A

easy to implement
fast training
fast inference
good explainability

6
Q

How can we know which features are more important for the decision tree model? ⭐️

A

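At each node we look for the split that minimizes the sum of the node impurities. The impurity criterion is a parameter of the decision tree; popular choices are the Gini impurity and the entropy, which underlies information gain. A feature's importance can then be measured by the total impurity reduction achieved by the splits that use it, with each split weighted by the number of samples reaching its node.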
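One common way to turn the impurity criterion into a feature ranking is to credit each feature with the sample-weighted impurity decrease of the splits that use it, then normalize the totals. The sketch below assumes the internal nodes of an already fitted tree are given as dicts. The dict keys, the feature names, and the function name are all hypothetical.

```python
from collections import defaultdict

def impurity_importances(splits, n_total):
    """Sum the weighted impurity decrease per feature and normalize to 1."""
    raw = defaultdict(float)
    for s in splits:
        decrease = (s["n"] * s["impurity"]
                    - s["n_left"] * s["impurity_left"]
                    - s["n_right"] * s["impurity_right"]) / n_total
        raw[s["feature"]] += decrease
    total = sum(raw.values())
    return {f: v / total for f, v in raw.items()} if total > 0 else dict(raw)

# Hypothetical tree on 8 samples: the root splits on "age", its right child on "income".
splits = [
    {"feature": "age", "n": 8, "impurity": 0.46875,
     "n_left": 4, "impurity_left": 0.0, "n_right": 4, "impurity_right": 0.375},
    {"feature": "income", "n": 4, "impurity": 0.375,
     "n_left": 3, "impurity_left": 0.0, "n_right": 1, "impurity_right": 0.0},
]
print(impurity_importances(splits, n_total=8))  # about 0.6 for age, 0.4 for income
```

Features that never appear in a split get no entry here, which corresponds to an importance of zero.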
