Decision Trees and Networks Flashcards

(21 cards)

1
Q

What is a decision tree?

A

It is a decision support tool that uses a tree-like model of decisions and their possible consequences.

2
Q

What is the goal of a decision tree?

A

It aims to predict the value of a target variable based on several input variables.

3
Q

What are the two types of decision tree?

A
  1. Classification Tree
  2. Regression Tree
4
Q

How is the impurity of leaves quantified?

A

Using the Gini Impurity

5
Q

How is the Gini Impurity for a leaf calculated?

A

The Gini Impurity of a node n is 1 minus the sum, over every class i, of the squared probability of class i in that node: G(n) = 1 - sum_i (p_i)^2.
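
As a minimal sketch of the formula on this card, the function below computes the impurity from the class labels that end up in one leaf. The name `gini_impurity` and the choice to treat an empty leaf as pure are assumptions made for illustration.

```python
from collections import Counter

def gini_impurity(labels):
    """Gini impurity of one leaf: 1 minus the sum of squared class probabilities."""
    total = len(labels)
    if total == 0:
        return 0.0  # assumption: an empty leaf is treated as pure
    counts = Counter(labels)
    return 1.0 - sum((n / total) ** 2 for n in counts.values())

print(gini_impurity(["yes", "yes", "no", "no"]))  # 0.5 (evenly mixed, two classes)
print(gini_impurity(["yes", "yes", "yes"]))       # 0.0 (pure leaf)
```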

6
Q

How is the total Gini Impurity calculated?

A

Total Gini Impurity is a weighted average of the Gini Impurity of all of the leaves, with each leaf weighted by the share of observations it contains.
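
The weighted average can be sketched as follows, assuming each leaf is described by a list of per-class counts and is weighted by its share of all observations. The helper names `leaf_gini` and `total_gini` are hypothetical.

```python
def leaf_gini(class_counts):
    """Gini impurity of one leaf given its per-class observation counts."""
    total = sum(class_counts)
    if total == 0:
        return 0.0  # assumption: an empty leaf contributes no impurity
    return 1.0 - sum((c / total) ** 2 for c in class_counts)

def total_gini(leaves):
    """Average of the leaf impurities, weighted by each leaf's share of observations."""
    n = sum(sum(leaf) for leaf in leaves)
    return sum(sum(leaf) / n * leaf_gini(leaf) for leaf in leaves)

# Two leaves with [yes, no] counts: one pure, one mixed.
print(total_gini([[4, 0], [1, 3]]))  # 0.5 * 0.0 + 0.5 * 0.375 = 0.1875
```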

7
Q

How is Gini Impurity calculated for numeric data?

A

Before calculating the Gini impurity, the observations need to be sorted by the numeric value and the average of each pair of adjacent observations taken. Each average is a candidate threshold, and the Gini impurity is then calculated for each one as if it were a categorical split.
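
A rough sketch of that procedure, assuming a single numeric column and a split of the form "value < threshold". The function names are hypothetical, and averaging adjacent distinct values (rather than every adjacent pair) is a small simplification that avoids duplicate thresholds.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a group of labels (0.0 for an empty group)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_numeric_threshold(values, labels):
    """Return (threshold, total Gini impurity) for the best split, or None."""
    pairs = sorted(zip(values, labels))          # sort observations by value
    distinct = sorted(set(values))
    # Candidate thresholds: the average of each pair of adjacent values.
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    best = None
    for t in candidates:
        left = [lab for v, lab in pairs if v < t]
        right = [lab for v, lab in pairs if v >= t]
        n = len(pairs)
        score = len(left) / n * gini(left) + len(right) / n * gini(right)
        if best is None or score < best[1]:
            best = (t, score)
    return best

print(best_numeric_threshold([7, 12, 18, 35], ["no", "no", "yes", "yes"]))  # (15.0, 0.0)
```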

8
Q

Which category is used as the root of the tree?

A

The category (feature) whose split gives the lowest total Gini impurity score.
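
As an illustration, the sketch below scores each candidate category by the weighted Gini impurity of the groups it produces and picks the lowest. The toy rows, the column names, and the helper names are all made up for the example.

```python
from collections import Counter, defaultdict

def gini(labels):
    """Gini impurity of a non-empty group of labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def split_impurity(rows, feature, target):
    """Total (weighted) Gini impurity after splitting the rows on one feature."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[feature]].append(row[target])
    n = len(rows)
    return sum(len(g) / n * gini(g) for g in groups.values())

def choose_root(rows, features, target):
    """Pick the feature whose split has the lowest total Gini impurity."""
    return min(features, key=lambda f: split_impurity(rows, f, target))

rows = [
    {"windy": "yes", "sunny": "yes", "play": "no"},
    {"windy": "yes", "sunny": "no",  "play": "no"},
    {"windy": "no",  "sunny": "yes", "play": "yes"},
    {"windy": "no",  "sunny": "no",  "play": "yes"},
]
print(choose_root(rows, ["windy", "sunny"], "play"))  # windy
```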

9
Q

What are the two types of pruning for decision trees?

A
  1. Pre-pruning
  2. Post-pruning
10
Q

What is pruning?

A

Pruning is a data compression technique that reduces the size of a tree by removing sections that are non-critical or redundant for classifying instances.

11
Q

Why should pruning be used?

A

Because it reduces the complexity of the final classifier, which improves predictive accuracy by reducing overfitting.

12
Q

What is the aim of pre-pruning?

A

To make sure the tree does not contain too many layers.

13
Q

How is pre-pruning implemented?

A

Pre-pruning specifies the minimum number of samples that must be present in the nodes. This means the pruning takes place as the tree is being created.
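
A toy sketch of pre-pruning on a single numeric feature. It assumes the sample limit is applied as "do not split a node holding fewer than `min_samples` observations" and adds a `max_depth` cap to limit the number of layers. Both parameter names and the dict-based tree layout are assumptions for illustration.

```python
from collections import Counter

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values()) if n else 0.0

def build(points, min_samples=2, max_depth=3, depth=0):
    """Grow a tree from (value, label) pairs, stopping early (pre-pruning)."""
    labels = [lab for _, lab in points]
    majority = Counter(labels).most_common(1)[0][0]
    # Pre-pruning checks happen while the tree is being created.
    if len(points) < min_samples or depth >= max_depth or gini(labels) == 0.0:
        return {"leaf": majority}
    values = sorted({v for v, _ in points})
    best = None
    for a, b in zip(values, values[1:]):
        t = (a + b) / 2
        left = [p for p in points if p[0] < t]
        right = [p for p in points if p[0] >= t]
        score = (len(left) * gini([l for _, l in left])
                 + len(right) * gini([l for _, l in right])) / len(points)
        if best is None or score < best[0]:
            best = (score, t, left, right)
    if best is None:  # all values identical, nothing to split on
        return {"leaf": majority}
    _, t, left, right = best
    return {"threshold": t,
            "left": build(left, min_samples, max_depth, depth + 1),
            "right": build(right, min_samples, max_depth, depth + 1)}

def tree_depth(tree):
    """Number of split layers below (and including) this node."""
    if "leaf" in tree:
        return 0
    return 1 + max(tree_depth(tree["left"]), tree_depth(tree["right"]))

data = [(1, "a"), (2, "b"), (3, "a"), (4, "b")]
print(tree_depth(build(data, min_samples=2, max_depth=5)))  # fully grown tree
print(tree_depth(build(data, min_samples=5)))               # 0: root is too small to split
```

Raising `min_samples` or lowering `max_depth` produces a shallower tree at the cost of a coarser fit.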

14
Q

How is post-pruning implemented?

A

Starting at the lowest branch, compare the error of the whole tree (e) with the error of the tree without that branch (e’). Remove the branch if e’ < e. Repeat for the next-lowest branch.
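
The comparison of e and e’ can be sketched on a small hand-built tree. This assumes errors are counted on a held-out validation set and that each internal node stores a `majority` label to use if it is collapsed into a leaf. Both are assumptions for illustration, as are the function names.

```python
def predict(tree, x):
    """Follow thresholds down to a leaf and return its label."""
    while "leaf" not in tree:
        tree = tree["left"] if x < tree["threshold"] else tree["right"]
    return tree["leaf"]

def error(tree, data):
    """Number of misclassified (value, label) pairs."""
    return sum(predict(tree, x) != y for x, y in data)

def prune(root, node, data):
    """Bottom-up: collapse a branch into a leaf if the whole-tree error drops."""
    if "leaf" in node:
        return
    prune(root, node["left"], data)
    prune(root, node["right"], data)
    e = error(root, data)                 # error of the whole tree
    saved = dict(node)
    node.clear()
    node["leaf"] = saved["majority"]      # tree minus this branch
    e_pruned = error(root, data)
    if not e_pruned < e:                  # keep the branch unless e' < e
        node.clear()
        node.update(saved)

tree = {"threshold": 5, "majority": "a",
        "left": {"leaf": "a"},
        "right": {"threshold": 8, "majority": "b",
                  "left": {"leaf": "b"}, "right": {"leaf": "a"}}}
validation = [(1, "a"), (6, "b"), (9, "b"), (10, "b")]

prune(tree, tree, validation)
print(tree["right"])            # {'leaf': 'b'}: the lower branch was removed
print(error(tree, validation))  # 0
```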

15
Q

What is centrality?

A

How important a node is in a network

16
Q

What is degree?

A

The number of edges a node has
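
For example, degree can be counted directly from an edge list. The sketch assumes an undirected graph without self-loops; the function name and the sample edges are illustrative.

```python
from collections import defaultdict

def degrees(edges):
    """Count how many edges touch each node in an undirected edge list."""
    count = defaultdict(int)
    for u, v in edges:
        count[u] += 1
        count[v] += 1
    return dict(count)

edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
print(degrees(edges))  # {'A': 2, 'B': 2, 'C': 3, 'D': 1}
```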

17
Q

What is Eigenvector Centrality?

A

A measure of a node’s importance relative to its neighbours, with the score of each vertex being proportional to the sum of the scores of its neighbours.
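
One common way to approximate these scores is power iteration, sketched below for an undirected, connected graph given as an adjacency dict. Adding each node's own score at every step is a convergence aid (it avoids oscillation on bipartite graphs) and does not change the ranking. The function name and the iteration count are assumptions.

```python
def eigenvector_centrality(adj, iterations=100):
    """Approximate eigenvector centrality by repeated neighbour-sum updates."""
    scores = {v: 1.0 for v in adj}
    for _ in range(iterations):
        # New score: own score plus the sum of the neighbours' scores.
        new = {v: scores[v] + sum(scores[u] for u in adj[v]) for v in adj}
        norm = max(new.values())
        scores = {v: s / norm for v, s in new.items()}  # rescale so the top score is 1
    return scores

# Triangle A-B-C with an extra node D attached to C.
adj = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B", "D"], "D": ["C"]}
scores = eigenvector_centrality(adj)
print(max(scores, key=scores.get))  # C
```

Here D scores lower than A and B even though all three touch C, because D's only neighbour is C while A and B also reinforce each other.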

18
Q

What is PageRank?

A

A centrality measure in which the importance a node derives from a neighbour is divided by that neighbour’s out-degree.
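
A minimal power-iteration sketch of this idea for a directed graph given as a dict of out-links. The damping factor of 0.85, the even redistribution from nodes with no out-links, and the requirement that every link target is also a key of the dict are assumptions that the card does not state.

```python
def pagerank(out_links, damping=0.85, iterations=50):
    """Iteratively share each node's rank equally among its out-links."""
    nodes = list(out_links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        new = {v: (1.0 - damping) / n for v in nodes}
        for u in nodes:
            targets = out_links[u]
            if targets:
                # Each neighbour receives rank divided by u's out-degree.
                share = damping * rank[u] / len(targets)
                for v in targets:
                    new[v] += share
            else:
                # Assumption: a node with no out-links spreads its rank evenly.
                for v in nodes:
                    new[v] += damping * rank[u] / n
        rank = new
    return rank

out_links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
rank = pagerank(out_links)
print(max(rank, key=rank.get))  # C
```

C ranks highest because it receives all of B's rank but only half of A's, while B receives just the other half of A's.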

19
Q

What is a node’s closeness?

A

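The mean distance from the node to the other vertices along geodesic (shortest) paths. Closeness centrality is usually taken as the inverse of this mean, so that closer nodes score higher.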
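
For an unweighted graph the shortest-path distances can be found with a breadth-first search. The sketch below assumes a connected, undirected graph with at least two nodes and reports both the mean distance and its inverse, which is the form closeness centrality usually takes. The function names are illustrative.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest-path (hop) distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def mean_distance(adj, node):
    """Mean shortest-path distance from node to all other reachable nodes."""
    dist = bfs_distances(adj, node)
    others = [d for v, d in dist.items() if v != node]
    return sum(others) / len(others)

def closeness(adj, node):
    """Inverse of the mean distance, so closer nodes score higher."""
    return 1.0 / mean_distance(adj, node)

# Path graph A - B - C.
adj = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
print(mean_distance(adj, "B"))  # 1.0
print(mean_distance(adj, "A"))  # 1.5
```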

20
Q

What is a node’s betweenness?

A

The number of geodesic (shortest) paths between other pairs of vertices that pass through the vertex.
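
A brute-force sketch of that count for a small undirected, unweighted graph. It counts every shortest path between other pairs of nodes that runs through the chosen node. A common variant divides each pair's contribution by the total number of shortest paths between that pair; that normalisation is not applied here. The function names are assumptions.

```python
from collections import deque
from itertools import combinations

def shortest_path_counts(adj, source):
    """BFS returning distances and the number of shortest paths from source."""
    dist = {source: 0}
    count = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                count[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                count[v] += count[u]  # every shortest path to u extends to v
    return dist, count

def betweenness(adj, node):
    """Count shortest paths between other node pairs that pass through node."""
    info = {v: shortest_path_counts(adj, v) for v in adj}
    total = 0
    for s, t in combinations([v for v in adj if v != node], 2):
        dist_s, count_s = info[s]
        dist_t, count_t = info[t]
        if t not in dist_s or node not in dist_s or node not in dist_t:
            continue  # unreachable pair or node
        # node lies on a shortest s-t path when the two legs add up exactly.
        if dist_s[node] + dist_t[node] == dist_s[t]:
            total += count_s[node] * count_t[node]
    return total

# Star graph: hub H connected to three leaves.
adj = {"H": ["X", "Y", "Z"], "X": ["H"], "Y": ["H"], "Z": ["H"]}
print(betweenness(adj, "H"))  # 3
print(betweenness(adj, "X"))  # 0
```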