Decision Tree MLM Flashcards

1
Q

Decision Trees

A

Decision Trees are a type of Supervised Machine Learning model in which the data is recursively split according to the value of a chosen feature.

2
Q
Introduction
A

Decision Trees are a type of predictive modeling approach used in statistics, data mining, and machine learning. They are simple to understand and interpret and are often used for classification and regression tasks.

3
Q
Tree Structure
A

A decision tree consists of nodes that form a rooted tree, meaning it is a directed tree with a node called the “root” that has no incoming edges. All other nodes have one (and only one) incoming edge. Nodes having outgoing edges are known as internal nodes. All other nodes are leaves.
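
The rooted-tree definition above can be sketched as a small data structure. This is only an illustration: the `TreeNode` class, the `count_nodes` helper, and the example labels are hypothetical names, not part of any library.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class TreeNode:
    """One node of a rooted tree (hypothetical illustration class)."""
    label: str
    # Outgoing edges; a node with no children is a leaf.
    children: List["TreeNode"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        return not self.children


def count_nodes(root: TreeNode) -> Tuple[int, int]:
    """Return (internal_nodes, leaves) for the tree rooted at `root`."""
    internal, leaves = 0, 0
    stack = [root]
    while stack:
        node = stack.pop()
        if node.is_leaf():
            leaves += 1
        else:
            internal += 1
            stack.extend(node.children)
    return internal, leaves


# The root has no incoming edge; every other node has exactly one parent.
root = TreeNode("outlook?", [
    TreeNode("humidity?", [TreeNode("no"), TreeNode("yes")]),
    TreeNode("yes"),
])
internal, leaves = count_nodes(root)
print(internal, leaves)
```

Here the root and the "humidity?" node are internal nodes because they have outgoing edges; the remaining three nodes are leaves.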

4
Q
Building a Decision Tree
A

Training places the most informative feature test at the root and keeps adding tests below it. The trained tree then predicts the target value as follows:
- Begin at the root, which asks the most important feature question.
- At every internal node, compare the sample's value for that node's attribute and go left or right.
- Continue until a leaf node is reached, which provides the prediction of the target value.
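
The root-to-leaf walk described above can be sketched as follows. This is a minimal illustration that assumes binary splits on numeric thresholds; the `Node` class, the `predict` function, and the "income"/"debt" features are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Node:
    """Binary decision-tree node (hypothetical illustration class)."""
    feature: Optional[str] = None      # feature tested at an internal node
    threshold: Optional[float] = None  # go left when value <= threshold
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    prediction: Optional[str] = None   # set only on leaf nodes


def predict(node: Node, sample: Dict[str, float]) -> str:
    """Walk from the given root to a leaf and return the leaf's prediction."""
    while node.prediction is None:
        if sample[node.feature] <= node.threshold:
            node = node.left
        else:
            node = node.right
    return node.prediction


# A hand-built tree: the root asks about income, one branch asks about debt.
tree = Node(
    "income", 50.0,
    left=Node(prediction="deny"),
    right=Node(
        "debt", 20.0,
        left=Node(prediction="approve"),
        right=Node(prediction="deny"),
    ),
)

print(predict(tree, {"income": 80.0, "debt": 10.0}))
```

The tree here is written by hand only to show the traversal; in practice a training algorithm chooses the features and thresholds.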

5
Q
Splitting Criteria
A

Decision trees use various metrics to decide which attribute to split the data on at each step. Common choices are Gini impurity and information gain for classification, and variance reduction for regression.
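
To make the three metrics concrete, here is a small sketch using their standard textbook definitions. The function names and the toy data are illustrative assumptions, not a library API.

```python
import math
from collections import Counter


def gini(labels):
    """Gini impurity: 1 - sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Shannon entropy of the class distribution, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(parent, left, right):
    """Parent entropy minus the size-weighted entropy of the two children."""
    n = len(parent)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted


def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def variance_reduction(parent, left, right):
    """Parent variance minus the size-weighted variance of the two children."""
    n = len(parent)
    weighted = (len(left) / n) * variance(left) + (len(right) / n) * variance(right)
    return variance(parent) - weighted


# Toy classification split that separates the two classes perfectly.
parent = ["yes", "yes", "no", "no"]
left, right = ["yes", "yes"], ["no", "no"]
print(gini(parent), information_gain(parent, left, right))

# Toy regression split that separates low targets from high targets.
print(variance_reduction([1.0, 1.0, 3.0, 3.0], [1.0, 1.0], [3.0, 3.0]))
```

A training algorithm would evaluate one of these scores for every candidate split and keep the split with the lowest impurity or the largest gain.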

6
Q
Pruning
A

Pruning is a technique in machine learning and search algorithms that reduces the size of decision trees by removing sections of the tree that contribute little power to classify instances. This reduces complexity and helps avoid overfitting.
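
One common way to decide which sections to remove is reduced-error pruning, sketched below. It is only one of several pruning strategies, and the card does not say which one is meant. The sketch assumes that every node stores the majority class of the training samples that reached it and that a held-out validation set is available; `Node`, `predict`, `errors`, `prune`, and `count_leaves` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Binary tree node; `prediction` holds the majority class at this node."""
    feature: Optional[str] = None
    threshold: Optional[float] = None
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    prediction: Optional[str] = None

    def is_leaf(self) -> bool:
        return self.left is None and self.right is None


def predict(node, sample):
    """Walk down to a leaf and return its prediction."""
    while not node.is_leaf():
        node = node.left if sample[node.feature] <= node.threshold else node.right
    return node.prediction


def errors(root, data):
    """Number of (sample, label) pairs the tree misclassifies."""
    return sum(predict(root, x) != y for x, y in data)


def prune(root, node, validation):
    """Bottom-up reduced-error pruning, modifying the tree in place."""
    if node.is_leaf():
        return
    prune(root, node.left, validation)
    prune(root, node.right, validation)
    before = errors(root, validation)
    saved = (node.left, node.right)
    node.left = node.right = None          # tentatively collapse into a leaf
    if errors(root, validation) > before:  # pruning hurt accuracy: undo it
        node.left, node.right = saved


def count_leaves(node):
    if node.is_leaf():
        return 1
    return count_leaves(node.left) + count_leaves(node.right)


# The deepest split (x <= 8) is assumed to have fit noise in the training data.
tree = Node(
    "x", 5.0,
    left=Node(prediction="a"),
    right=Node("x", 8.0, Node(prediction="b"), Node(prediction="a"), prediction="b"),
    prediction="a",
)
validation = [({"x": 2.0}, "a"), ({"x": 6.0}, "b"), ({"x": 7.0}, "b"), ({"x": 9.0}, "b")]

prune(tree, tree, validation)
print(count_leaves(tree))
```

In this example the inner split is collapsed because removing it does not increase validation errors, while the root split is kept because collapsing it would.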

7
Q
Strengths
A

Decision trees are simple to understand and visualize and can handle both numerical and categorical data. The cost of using the tree (i.e., predicting data) is typically logarithmic in the number of data points used to train it, provided the tree is reasonably balanced.

8
Q
Limitations
A

Decision trees can grow overly complex and fail to generalize well (overfitting). They can also be unstable, because small variations in the data might result in a completely different tree being generated. Finally, learning an optimal decision tree is known to be NP-complete, so practical algorithms rely on greedy heuristics that make a locally optimal choice at each node and cannot guarantee a globally optimal tree.

9
Q
Applications
A

Decision trees are used in a variety of fields, including medical diagnosis, credit scoring, and many areas of machine learning and artificial intelligence (AI).
