10 Probabilistic Reasoning Flashcards

1
Q

Status of probability sentences

A

Statement: “The patient has cavity with probability 0.8

In logic, a sentence is true or false, depending on the interpretation and the world. In probability theory, the probability assigned to a sentence depends on the evidence so far.
• Prior (unconditional) probability: Before any evidence
• Posterior (conditional) probability: After some evidence

Probability is more like entailment than truth!

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Probability, utility and decisions

A

The agent can use probability theory to reason about uncertainty. The agent can use utility theory for rational selection of actions based on preferences. Decision theory is a general theory for combining probability with rational decisions.

Decision theory = Probability theory + Utility theory

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Basic probability notation

A

A probability model is a set of propositions, expressed in terms of random variables with domains:
• Boolean - E.g. Cavity : < true, f alse >
• Discrete - E.g. Weather : < sunny, rainy, cloudy, snow >
• Continuous - E.g. Index : [0, 1]

An atomic event is an assignment of particular values to all variables of the domain
• E.g. Cavity = false ^ Toothache = true
• Mutually exclusive (only one event can be true at a time)
• Exhaustive (at least one must be true)

Prior (unconditional) probability of a proposition:
P(A) P(Cavity) = 0.1 …i.e. P(Cavity = true) = 0.1

Probability distribution of variable P(v)
P(Weather) = (0.7, 0.2, 0.08, 0.02)

Joint probability distribution
• Table of probabilities for all combinations: P(v1,v2)
• P (Weather, Cavity) is a 4 x 2 table of probabilities (must sum to 1)
• Full joint distribution: all domain variables included

Conditional (posterior) probability: P(A|B)

P(Cavity | Toothache) = 0.8

Product rule:
P(A ^ B) = P(A|B)P(B)
P(A ^ B) = P(B|A)P(A)
P(A | B) = P(A ^ B) / P(B)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

Problem with interference using full joint distribution

A

This approach does not scale up! Table size and calculation time is O(2n) for n Boolean variables.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Combining evidence

A

We can use Bayes’ rule to find probability of a state given several pieces of evidence

P(Cavity|Toothache ^ Catch) = P(Toothache ^ Catch|Cavity)P(Cavity)/P(Toothache ^ Catch)

For this to work we need conditional probability for all combinations of evidence variables. In general case there is an exponential number of conditional probabilities. For n Boolean evidence variables, we need 2^n conditional probabilities. This led AI researchers away from probability theory and towards more ad hoc systems.

Conditional independence
Using Bayes’ rule is simplified in situations of conditional independence between variables. Combining evidence does not need information for all combinations of evidence variables, only separate conditionals

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

Bayesian networks

A

Bayesian networks

Bayesian networks represent dependencies between variables and give concise specification of joint probability distribution. A Bayesian network is a graph where:

  • A set of random variables are the nodes of the graph
  • A set of directed links connects pairs of nodes
  • A link from X to Y means that X has direct influence on Y, or is the parent of Y
  • Each node has a conditional probability table (CPT) that quantifies effects parent nodes have on the node
  • The graph has no directed cycles (it is a directed, acyclic graph, or DAG)

It can be shown that a Bayesian network is a representation of the joint probability distribution.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

Incremental construction of Bayesian network

A

General procedure
• Choose set of relevant domain variables Xi
• Choose an ordering of the variables, preferably using causal domain knowledge (“root causes first”, etc.)
• While there are variables left
- Pick the next variable Xi and add node for i
- Set Parents(Xi) to minimal set of nodes already in net such that Xi depends directly only on these nodes
- Define conditional probability table for Xi

• Guarantees that

  • Network is acyclic
  • Axioms of probability are satisfied
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

Bayesian inference

A

Probabilistic inference procedure computes posterior probability distribution for a set of query variables, given exact values for some evidence variables
P (Query|Evidence)

An agent gets values for evidence variables from its percepts and queries for other variables in order to decide on action, using two functions. Exact and approximate (Monte Carlo) methods have been developed for Bayesian inference.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

Other uses of Bayesian networks

A

Make decisions based on probabilities in the network and agent utilities. Decide which additional evidence variables should be observed in order to gain useful information. Perform sensitivity analysis to understand which aspects of model have greatest impact on probabilities of query variables. Explain results of probabilistic inference to users.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Other approaches to uncertainty

A
  • Default reasoning
  • Rule-based methods
  • Dempster-Shafer theory
  • Fuzzy sets and fuzzy logic
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Default reasoning

A

Human judgmental reasoning is “qualitative” and does not rely on numerical probabilities. Default reasoning:

• Assume default state until new evidence presents itself
• If new evidence, conclusion may be retracted
This kind of reasoning is called nonmonotonic, because set of beliefs may both grow and shrink. Problems:
• Unclear semantic status of default rules
• What if two matching default rules have contradictory conclusions • Managing retraction of beliefs (truth maintenance systems)
• How to use default rules for making decisions

No default reasoning system has solved all issues, and most systems are formally undecidable, and very slow in practice

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

Rule-based methods

A

Logical and rule-based reasoning systems have useful properties that probabilistic systems lack.
• Locality: Can use A => B independent of other rules
• Detachment: Can use belief independent of its justification
• Truth-functionality: Truth of complex sentence follows from truth of components

Attempts have been made to modify rule systems by attaching degrees of belief to propositions/rules. Best known is the certainty factor model (Mycin, ca. 1980). The problem is that above properties are not appropriate for uncertain reasoning, and rule-based approaches to uncertainty have fallen out of use.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

Dempster-Shafer theory

A

Dempster-Shafer theory deals with the distinction between uncertainty and ignorance. Instead of computing probability P(X), it computes probability Bel(X) that evidence supports proposition. Example:

For unknown possibly non-fair coin, Bel(Heads) = Bel(¬Heads) = 0

If 90% certain that coin is fair (P(Heads) = 0.5),

Bel(Heads) = 0.5x0.9 = 0.45, Bel(¬Heads) = 0.45
One interpretation of Dempster-Shafer is that it calculates a probability interval.

• Heads interval for unknown possibly non-fair coin: [0, 1]
• Interval for 90% certain that coin is fair: [0.45, 0.55]
Width of the interval helps decide when more evidence is required

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

Fuzzy sets and fuzzy logic

A

Fuzzy set theory is a means of specifying how well an object satisfies a vague description. E.g. Is “Nate is tall” true if Nate is 5’ 10”?

In fuzzy set theory, TallPerson is a fuzzy predicate and the truth value of TallPerson(X) is in [0,1]. The fuzzy predicate defines a fuzzy set that does not have sharp boundaries, i.e. not uncertainty about the world, but uncertainty about the meaning of “tall”, and many claim that fuzzy set theory is not for uncertain reasoning at all.

Fuzzy logic can be used to determine truth value of complex sentence from truth of its components. However, fuzzy logic is inconsistent with first-order logic. Despite this, fuzzy logic has been commercially successful, e.g. in automatic transmissions, trains, and video cameras.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly