Tree Construction Flashcards

1
Q

What is entropy and how is it measured?

A

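Entropy measures the uncertainty (impurity) of a class distribution. It is measured in bits: H = -sum_i p_i * log2(p_i), where p_i is the proportion of instances in class i.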
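A minimal sketch of the entropy calculation, assuming class labels are given as a plain Python list. The function name `entropy` is chosen here for illustration and is not from the cards.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits (log base 2)."""
    total = len(labels)
    counts = Counter(labels)
    # Sum -p * log2(p) over the classes that actually occur.
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())

print(entropy(["yes", "yes", "no", "no"]))    # 1.0 (evenly mixed, one full bit)
print(entropy(["yes", "yes", "yes", "yes"]))  # 0.0 (pure, no uncertainty)
```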

2
Q

What is information gain?

A

The reduction in entropy achieved by splitting on an attribute. It increases with the average 'purity' of the resulting subsets.

3
Q

How is information gain measured?

A

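Using entropy: the entropy of the data before the split minus the weighted average entropy of the subsets after the split.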
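A small sketch of this calculation, assuming each instance is a dict of attribute values with a parallel list of class labels. The names `entropy`, `information_gain`, and the toy `windy` attribute are illustrative, not from the cards.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    total = len(labels)
    return sum(-(c / total) * math.log2(c / total)
               for c in Counter(labels).values())

def information_gain(rows, labels, attribute):
    """Entropy before the split minus the weighted entropy after it."""
    subsets = defaultdict(list)
    for row, label in zip(rows, labels):
        subsets[row[attribute]].append(label)
    # Weight each subset's entropy by its share of the instances.
    after = sum(len(s) / len(labels) * entropy(s) for s in subsets.values())
    return entropy(labels) - after

rows = [{"windy": True}, {"windy": True}, {"windy": False}, {"windy": False}]
print(information_gain(rows, ["no", "no", "yes", "yes"], "windy"))  # 1.0
print(information_gain(rows, ["no", "yes", "no", "yes"], "windy"))  # 0.0
```

In the first call the split yields two pure subsets, so the full bit of entropy is gained; in the second the subsets are as mixed as the original data, so nothing is gained.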

4
Q

What are the steps to recursively construct a tree using the divide and conquer method?

A
  1. Choose an attribute for the root node
  2. Create a branch for every possible value of that attribute
  3. Split the instances into subsets, one per branch
  4. Repeat recursively for each branch until all instances in a subset have the same class value
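The steps above can be sketched as a short recursive function. This is an illustrative ID3-style sketch, not a definitive implementation: it assumes nominal attributes, instances stored as dicts, and information gain as the selection criterion. The names `split`, `gain`, and `build_tree` are hypothetical.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    total = len(labels)
    return sum(-(c / total) * math.log2(c / total)
               for c in Counter(labels).values())

def split(rows, labels, attr):
    """Group rows and labels by the value each row has for attr."""
    groups = defaultdict(lambda: ([], []))
    for row, label in zip(rows, labels):
        groups[row[attr]][0].append(row)
        groups[row[attr]][1].append(label)
    return groups

def gain(rows, labels, attr):
    parts = split(rows, labels, attr).values()
    after = sum(len(ls) / len(labels) * entropy(ls) for _, ls in parts)
    return entropy(labels) - after

def build_tree(rows, labels, attributes):
    # Stop when the node is pure or no attributes remain (majority-class leaf).
    if len(set(labels)) == 1 or not attributes:
        return Counter(labels).most_common(1)[0][0]
    best = max(attributes, key=lambda a: gain(rows, labels, a))  # step 1
    remaining = [a for a in attributes if a != best]
    # Steps 2-4: one branch per value, split the instances, recurse.
    return {best: {value: build_tree(sub_rows, sub_labels, remaining)
                   for value, (sub_rows, sub_labels)
                   in split(rows, labels, best).items()}}

rows = [{"outlook": "sunny", "windy": False}, {"outlook": "sunny", "windy": True},
        {"outlook": "rainy", "windy": False}, {"outlook": "rainy", "windy": True}]
labels = ["no", "no", "yes", "yes"]
print(build_tree(rows, labels, ["outlook", "windy"]))
# {'outlook': {'sunny': 'no', 'rainy': 'yes'}}
```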
5
Q

How do we choose the best attribute?

A

Choose the attribute with the highest information gain.

6
Q

Why can't all leaves always be 'pure'?

A

Because sometimes identical instances have different classes, so no further split can separate them.

7
Q

When does splitting stop?

A

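When the data can't be split any further: either all instances in a subset have the same class, or no remaining attribute can separate them.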

8
Q

What are highly branching attributes?

A

Attributes with a large number of values (for example, an ID code). The subsets they produce are more likely to be pure, so information gain is biased toward choosing them.
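A toy illustration of this bias, assuming a made-up dataset in which `id` is unique per instance and `windy` is unrelated to the class. The helper names and data are hypothetical.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    total = len(labels)
    return sum(-(c / total) * math.log2(c / total)
               for c in Counter(labels).values())

def information_gain(rows, labels, attribute):
    subsets = defaultdict(list)
    for row, label in zip(rows, labels):
        subsets[row[attribute]].append(label)
    after = sum(len(s) / len(labels) * entropy(s) for s in subsets.values())
    return entropy(labels) - after

# 'id' takes a different value for every instance, so each subset has one
# instance and is trivially pure, even though 'id' cannot generalize.
rows = [{"id": 1, "windy": True}, {"id": 2, "windy": True},
        {"id": 3, "windy": False}, {"id": 4, "windy": False}]
labels = ["yes", "no", "yes", "no"]

print(information_gain(rows, labels, "id"))     # 1.0
print(information_gain(rows, labels, "windy"))  # 0.0
```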

9
Q

What are ensembles of trees?

A

A collection of different trees that vote on the classification.

10
Q

What is the bagging method?

A

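Change the input data: train each tree on a different bootstrap sample of the training set (drawn at random with replacement), then let the trees vote.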
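A rough sketch of the two ingredients of bagging, assuming each trained model is simply a callable that maps an instance to a class label. `bootstrap_sample` and `bagged_predict` are illustrative names, and the three lambda models stand in for trees trained on different samples.

```python
import random
from collections import Counter

def bootstrap_sample(rows, labels, rng):
    """Draw len(rows) instances at random, with replacement."""
    indices = [rng.randrange(len(rows)) for _ in rows]
    return [rows[i] for i in indices], [labels[i] for i in indices]

def bagged_predict(models, row):
    """Majority vote over the predictions of all models."""
    votes = Counter(model(row) for model in models)
    return votes.most_common(1)[0][0]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
rows = [{"windy": True}, {"windy": False}, {"windy": True}]
labels = ["no", "yes", "no"]
sample_rows, sample_labels = bootstrap_sample(rows, labels, rng)

# Placeholder models; in practice each would be a tree built on its own sample.
models = [lambda r: "yes", lambda r: "yes", lambda r: "no"]
print(bagged_predict(models, {"windy": False}))  # yes
```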

11
Q

What is the randomization method?

A

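Semi-random split selection: randomize the learning algorithm rather than the data, for example by picking each split at random from among the best candidates.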
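One possible reading of "semi-random" is sketched below: pick at random among the `k` highest-scoring attributes instead of always taking the single best. The function name, the parameter `k`, and the scores are assumptions made for illustration; other randomization schemes exist.

```python
import random

def semi_random_choice(scores, k, rng):
    """Pick one attribute at random from the k with the highest score."""
    top_k = sorted(scores, key=scores.get, reverse=True)[:k]
    return rng.choice(top_k)

# Hypothetical information-gain scores for three attributes.
scores = {"outlook": 0.25, "humidity": 0.15, "windy": 0.05}
rng = random.Random(0)
choice = semi_random_choice(scores, k=2, rng=rng)
print(choice in {"outlook", "humidity"})  # True
```

With `k=1` this reduces to the usual deterministic choice of the best attribute.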

12
Q

What is the boosting method?

A

Build trees sequentially, with each new tree focusing on the instances that the previous trees misclassified.
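A hedged sketch of how "focus on mistakes" can work, using an AdaBoost.M1-style weight update as an assumed example (the cards do not name a specific boosting algorithm). Correctly classified instances are down-weighted so the next tree concentrates on the errors. The function name `reweight` is hypothetical.

```python
def reweight(weights, correct):
    """Return new instance weights after one boosting round.

    weights: current instance weights
    correct: parallel list of booleans, True if the latest tree
             classified that instance correctly
    """
    error = sum(w for w, ok in zip(weights, correct) if not ok) / sum(weights)
    if error == 0 or error >= 0.5:
        # Nothing to reweight; a full implementation would stop boosting here.
        return list(weights)
    factor = error / (1 - error)  # less than 1, shrinks correct instances
    updated = [w * factor if ok else w for w, ok in zip(weights, correct)]
    total = sum(updated)
    return [w / total for w in updated]  # renormalize to sum to 1

new_weights = reweight([0.25, 0.25, 0.25, 0.25], [True, True, True, False])
print(round(new_weights[3], 3))  # 0.5
```

After one round the single misclassified instance carries half of the total weight, so the next tree is pushed to get it right.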
