16+17 Flashcards

1
Q

What is another way of writing P(Toothache, Catch, Cavity) if we know that Toothache and Catch are conditionally independent given Cavity?

A

P(Toothache, Catch, Cavity) = P(Toothache | Catch, Cavity) P(Catch, Cavity) = P(Toothache | Catch, Cavity) P(Catch | Cavity) P(Cavity) = P(Toothache | Cavity) P(Catch | Cavity) P(Cavity).
The first two steps apply the product rule; the last step uses conditional independence to drop Catch from P(Toothache | Catch, Cavity).
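
A quick numeric check of the factorisation, as a sketch: the numbers below are invented for illustration, not given in the cards.

    # Hypothetical distributions for the dentist domain (illustrative numbers).
    p_cavity = {True: 0.2, False: 0.8}        # P(Cavity)
    p_toothache = {True: 0.6, False: 0.1}     # P(Toothache=true | Cavity)
    p_catch = {True: 0.9, False: 0.2}         # P(Catch=true | Cavity)

    def joint(toothache, catch, cavity):
        """P(Toothache, Catch, Cavity) via the factored form
        P(Toothache | Cavity) * P(Catch | Cavity) * P(Cavity)."""
        pt = p_toothache[cavity] if toothache else 1 - p_toothache[cavity]
        pc = p_catch[cavity] if catch else 1 - p_catch[cavity]
        return pt * pc * p_cavity[cavity]

    # The eight joint entries sum to 1, as any full joint distribution must.
    total = sum(joint(t, c, v) for t in (True, False)
                for c in (True, False) for v in (True, False))
    print(joint(True, True, True))   # 0.6 * 0.9 * 0.2 = 0.108
    print(total)                     # approximately 1.0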

2
Q

How is a Bayesian network constructed?

A

There are often situations where we know some of the dependencies between variables and the conditional independencies among others.
In a Bayesian network: each node represents one of the random variables in the domain. A directed arrow between two nodes means direct influence (from cause to effect). Sibling nodes are conditionally independent given their parent variable.
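
As a sketch, the structure alone can be written as a map from each node to its parents (the dentist variables again, arrows pointing cause to effect):

    # Structure of a small Bayesian network: node -> tuple of parent nodes.
    parents = {
        "Cavity": (),                # root node: no parents
        "Toothache": ("Cavity",),    # Cavity directly influences Toothache
        "Catch": ("Cavity",),        # siblings Toothache and Catch are
    }                                # conditionally independent given Cavity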

3
Q

What can Bayesian networks definitely not have?

A

Cycles. It must be a directed acyclic graph.
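
A structure like the parent map above can be checked for acyclicity with a depth-first search; this is a sketch, not tied to any particular library.

    def is_dag(parents):
        """Return True if the node -> parents map contains no cycles."""
        WHITE, GREY, BLACK = 0, 1, 2      # unvisited / on current path / finished
        colour = {node: WHITE for node in parents}

        def visit(node):
            if colour[node] == GREY:      # hit a node already on our path: cycle
                return False
            if colour[node] == BLACK:     # already fully explored
                return True
            colour[node] = GREY
            if not all(visit(p) for p in parents[node]):
                return False
            colour[node] = BLACK
            return True

        return all(visit(node) for node in parents)

    print(is_dag({"A": (), "B": ("A",), "C": ("A", "B")}))   # True
    print(is_dag({"A": ("B",), "B": ("A",)}))                # False: A -> B -> A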

4
Q

How are probability distributions added to Bayesian networks?

A

A node with no parents gets a prior probability distribution for the variable it represents. A node with parents gets a conditional probability distribution for its variable, with one distribution for each possible combination of values of the parent variables (a conditional probability table).
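
One plain way to store these, as a sketch with hypothetical numbers: a parentless node holds a single prior, a node with parents holds one number per combination of parent values.

    # P(Cavity): prior for a parentless node.
    prior_cavity = {True: 0.2, False: 0.8}

    # P(Toothache | Cavity): keys are tuples of parent values,
    # values are P(Toothache=true | that parent assignment).
    cpt_toothache = {
        (True,): 0.6,    # P(toothache | cavity)
        (False,): 0.1,   # P(toothache | no cavity)
    }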

5
Q

How compact is a fully Boolean Bayesian network?

A

A full joint probability distribution over n Boolean variables has 2^n entries. If each node in a Bayesian network has at most K parents, the network needs only O(n * 2^K) numbers. For fixed K this scales linearly with the number of nodes rather than exponentially.
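
To make the saving concrete (sizes chosen for illustration):

    n, k = 30, 5             # 30 Boolean variables, at most 5 parents per node
    print(2 ** n)            # full joint distribution: 1073741824 entries
    print(n * 2 ** k)        # Bayesian network: 960 numbers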

6
Q

How do we calculate probabilities from a Bayesian network?

A

Any joint probability is the product of the local distributions: P(x1, ..., xn) = P(x1 | parents(X1)) * ... * P(xn | parents(Xn)). Look up the appropriate entry in each prior and conditional probability table in the network and multiply them together.
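
A sketch of that computation, using the parent map and table style from the earlier cards (numbers still hypothetical):

    def joint_probability(assignment, parents, tables):
        """P(x1, ..., xn) = product over all nodes of P(xi | parents(Xi)).

        assignment: dict node -> True/False, covering every node.
        parents:    dict node -> tuple of parent nodes.
        tables:     dict node -> {tuple of parent values: P(node=true | values)};
                    a parentless node is keyed by the empty tuple ()."""
        prob = 1.0
        for node, pars in parents.items():
            p_true = tables[node][tuple(assignment[p] for p in pars)]
            prob *= p_true if assignment[node] else 1 - p_true
        return prob

    parents = {"Cavity": (), "Toothache": ("Cavity",), "Catch": ("Cavity",)}
    tables = {"Cavity": {(): 0.2},
              "Toothache": {(True,): 0.6, (False,): 0.1},
              "Catch": {(True,): 0.9, (False,): 0.2}}
    print(joint_probability({"Cavity": True, "Toothache": True, "Catch": True},
                            parents, tables))   # 0.2 * 0.6 * 0.9 = 0.108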

7
Q

How do we build a Bayesian network?

A

Choose an ordering of the variables, X1 to Xm. For i = 1 to m, add Xi to the network and add links back to those already-added nodes on which Xi directly depends. This implements the chain rule. We can then cancel links where variables are conditionally independent. It's best to start with causes and then add effects, to maximise the chance of a simple network.
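
The loop can be sketched as below; needs_link is a hypothetical dependence test (in practice, the modeller's domain knowledge), not a function from any library.

    def build_network(ordering, needs_link):
        """Add variables in the given order, linking each new variable back
        to the already-added variables it still directly depends on
        (chain rule, with conditionally independent links cancelled)."""
        parents = {}
        for i, xi in enumerate(ordering):
            parents[xi] = tuple(xj for xj in ordering[:i]         # nodes added so far
                                if needs_link(xi, xj, parents))   # hypothetical test
        return parents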

8
Q

What does a decision tree do?

A

A decision tree makes a sequence of partitions of the training data, one attribute at a time. It is basically a big nested conditional statement.
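
For instance, a tree for a made-up "wait for a table?" problem (attributes invented for illustration) is just nested conditionals:

    def will_wait(patrons, hungry, raining):
        """A small decision tree written out as the nested
        conditional statement it amounts to."""
        if patrons == "none":        # first partition: by Patrons
            return False
        if patrons == "some":
            return True
        if hungry:                   # Patrons == "full": partition by Hungry
            return not raining       # then by Raining
        return False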

9
Q

What can decision trees express?

A

Any function of the input attributes. If need be, we can build a tree with one path to a leaf node for each training example, though this will overfit the data; we want the simplest decision tree that is consistent with the data.

10
Q

How can we make a simple decision tree?

A

Calculate the expected entropy of the class distributions produced by splitting on each candidate attribute. Pick the attribute with the lowest expected entropy, i.e. the purest resulting partitions (equivalently, the highest information gain). Recurse on each partition until every example is correctly classified.
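
A sketch of the attribute choice, assuming examples are given as (attribute-dict, label) pairs; this representation is invented here, not prescribed by the cards.

    from collections import Counter
    from math import log2

    def entropy(labels):
        """Entropy in bits of a list of class labels."""
        total = len(labels)
        return -sum((c / total) * log2(c / total)
                    for c in Counter(labels).values())

    def expected_entropy(examples, attribute):
        """Weighted average entropy of the partitions induced by one attribute."""
        groups = {}
        for features, label in examples:
            groups.setdefault(features[attribute], []).append(label)
        n = len(examples)
        return sum(len(g) / n * entropy(g) for g in groups.values())

    def best_attribute(examples, attributes):
        """The attribute with the lowest expected entropy,
        i.e. the highest information gain."""
        return min(attributes, key=lambda a: expected_entropy(examples, a))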
