Flashcards in Decision Trees Deck (13):

1

## What role does Entropy play?

### Entropy controls how the DT splits the data. It is the measure of impurity in a set of examples; impurity describes how mixed (non-uniform) the classes in the set are.

2

## What is the formula for Entropy?

### Entropy = -Sum(i) { p(i) * log2(p(i)) }, where p(i) is the fraction of examples in class i and the sum runs over all classes. (Note the leading minus sign: since log2(p) is negative for p < 1, it makes entropy non-negative.)
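The formula above can be sketched in a few lines of Python (a minimal illustration, not from the deck itself):

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy = -sum_i p(i) * log2(p(i)), where p(i) is the
    fraction of examples belonging to class i."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

# A 50/50 split over two classes has the maximum entropy of 1 bit;
# a pure set (all one class) has entropy 0.
mixed = entropy(["a", "b", "a", "b"])
pure = entropy(["a", "a", "a"])
```

With two equally likely classes each term contributes 0.5 bits, so `mixed` is 1.0; for a single class, p = 1 and log2(1) = 0, so `pure` is 0.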

3

## What is the entropy when all examples are the same class?

### 0

4

## What is information gain?

### entropy(parent) - [weighted average of entropy(children)], where each child's entropy is weighted by its fraction of the parent's examples.
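The information-gain formula can be sketched directly from the definition (a minimal, self-contained illustration; the function names are mine, not the deck's):

```python
import math
from collections import Counter

def entropy(labels):
    # Entropy = -sum_i p(i) * log2(p(i))
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(parent, children):
    """Gain = entropy(parent) - weighted average of entropy(children),
    where each child is weighted by its share of the parent's examples."""
    n = len(parent)
    return entropy(parent) - sum(len(c) / n * entropy(c) for c in children)

# A perfect split of a 50/50 parent into two pure children
# gains a full bit; a split that changes nothing gains zero.
perfect = information_gain(["yes", "yes", "no", "no"],
                           [["yes", "yes"], ["no", "no"]])
useless = information_gain(["yes", "yes", "no", "no"],
                           [["yes", "no"], ["yes", "no"]])
```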

5

## How does the decision tree utilize information gain?

### It maximizes information gain to determine the splits.
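The greedy split choice can be sketched as follows (a simplified illustration for categorical features; `best_feature` is a hypothetical helper name, not a real library API):

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_feature(rows, labels):
    """Try splitting on each feature, compute the information gain of
    the resulting partition, and return the feature that maximizes it."""
    n = len(labels)
    parent = entropy(labels)
    best, best_gain = None, -1.0
    for f in range(len(rows[0])):
        # Partition the labels by the value this row takes for feature f.
        groups = {}
        for row, label in zip(rows, labels):
            groups.setdefault(row[f], []).append(label)
        gain = parent - sum(len(g) / n * entropy(g) for g in groups.values())
        if gain > best_gain:
            best, best_gain = f, gain
    return best, best_gain

# Feature 0 perfectly predicts the label, feature 1 is noise,
# so the tree splits on feature 0 first.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = ["a", "a", "b", "b"]
feature, gain = best_feature(rows, labels)
```

A real decision tree learner applies this choice recursively to each child node until the examples are pure or no gain remains.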

6

## Give intuitive explanation for how to remember bias

### You can train the model with all sorts of data, but it stays biased toward its original behavior and doesn't change.

7

## Give intuitive explanation for how to remember variance

### It cares so much about the data it is being trained on that it will change its behavior to match whatever data it sees.

8

## What are DT strengths and weaknesses?

###
Strengths: Easy to use, graphically interpretable (knowledge representation), can build bigger classifiers from them with ensemble methods

Weaknesses: Prone to overfitting, especially with lots of features.

9

## Give an example of remembering xor logic gate

### When someone asks, "Do you want to go to the movies or bowling?", they usually mean XOR: pick one or the other, but not both and not neither.

10

## Decision tree space - compare xor and or

###
XOR - the tree needs an exponential number of nodes as attributes are added.

OR - the tree grows linearly as attributes are added.

11

## What is Inductive Bias

### The inductive bias of a learning algorithm is the set of assumptions that the learner uses to predict outputs given inputs that it has not encountered. A classical example of an inductive bias is Occam's Razor, assuming that the simplest consistent hypothesis about the target function is actually the best.

12

## What is Preference Bias?

###
A preference bias is when a learning algorithm incompletely searches a complete hypothesis space: it chooses which parts of the hypothesis space to search. Decision trees are an example.

13