Chapter 6: Second Winter, Reasoning Under Uncertainty Flashcards

1
Q

Which disappointing developments of the 1980s led to the second AI winter?

A
  1. Simpler, cheaper, and more general-purpose computers and workstations were introduced, displacing specialized Lisp machines.
  2. The promises of Japan’s Fifth Generation Computer Systems project were not fulfilled after 10 years, so funding was not extended.
  3. The competing Western projects ran into similar problems.
2
Q

Which five limitations of expert systems paved the way for their replacement by general workstations?

A
  1. Classical formal logic is not suited to all problems
  2. Performance issues arise on large knowledge bases
  3. Translating an expert’s knowledge into a computer is very hard
  4. Many domain experts don’t actually reason using if-then rules
  5. Some academic milestones were not commercially viable (like Dendral).

Keywords: formal logic, performance, translation, if-then rules, commercial viability

3
Q

What characteristics defined the second AI winter?

A
  • Business interest had collapsed
  • Governments were reluctant to provide new funding
  • Researchers were embarrassed or had lost interest
4
Q

Explain the term probability and provide two different interpretations of it.

A

Probabilities quantify uncertainty about the occurrence of some event of interest. One interpretation is Bayesian probability, where probability is defined as a subjective degree of belief. Another is frequentist probability, where probability is expressed as the relative frequency of observed events.
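
A minimal simulation sketch of the frequentist view (an assumed illustration, not course code): for a fair six-sided die, the relative frequency of the event “roll a six” approaches its probability 1/6 as the number of trials grows.

```python
import random

random.seed(0)  # reproducible illustration
for n in (100, 10_000, 1_000_000):
    hits = sum(random.randint(1, 6) == 6 for _ in range(n))
    print(f"n={n:>9}: relative frequency of a six = {hits / n:.4f}")
```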

5
Q

Explain the terms sample space and events and give an example of both.

A

A sample space is a non-empty set Ω, which represents the atomic outcomes that can happen.

An event is a subset of the sample space (A ⊆ Ω) that describes “something that can happen”. Singleton events ({ω} with ω ∈ Ω) are also called atomic events.

A six-sided die has the sample space Ω = {1, 2, 3, 4, 5, 6}, and an event could be A = (# eyes is odd) = {1, 3, 5} ⊆ Ω.
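
A tiny sketch of the die example in Python sets (assumed for illustration):

```python
Omega = {1, 2, 3, 4, 5, 6}            # sample space of a six-sided die
A = {w for w in Omega if w % 2 == 1}  # event "# eyes is odd"
print(A, A <= Omega)                  # {1, 3, 5} True: an event is a subset of Omega
```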

6
Q

Explain Sigma Algebras.

A

In simple words, a sigma algebra Σ is a collection of subsets of the sample space, namely the sets to which probabilities can be assigned. A sigma algebra must contain the empty set and the entire sample space, and it must be closed under complementation and countable unions.

“Closed” means that if elements of the sigma algebra are combined by a given operation, the resulting set must also be an element of the sigma algebra. In particular, any countable union of sets in Σ lies in Σ again.

A sigma algebra is closed under complementation if for any event A in the sigma algebra, its complement (the set of all outcomes in the sample space that are not in A) is also in the sigma algebra: (Ω \ A) ∈ Σ.
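
A minimal sketch checking these axioms on a small candidate collection (assumed for illustration; on a finite Ω, closure under pairwise unions stands in for countable unions):

```python
Omega = frozenset({1, 2, 3, 4, 5, 6})
Sigma = {frozenset(), frozenset({1, 3, 5}), frozenset({2, 4, 6}), Omega}

def is_sigma_algebra(Sigma, Omega):
    """Check: contains the empty set and Omega, closed under complement and union."""
    if frozenset() not in Sigma or Omega not in Sigma:
        return False
    closed_complement = all(Omega - A in Sigma for A in Sigma)
    closed_union = all(A | B in Sigma for A in Sigma for B in Sigma)
    return closed_complement and closed_union

print(is_sigma_algebra(Sigma, Omega))  # True: a coarse sigma algebra on Omega
```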

7
Q

Give an example of how sigma algebras can be used.

A

Consider a six-sided die, so that Ω = {1, 2, 3, 4, 5, 6}.
We can express the events
- (#eyes is odd) = {1, 3, 5} and
- (#eyes ≥ 2) = {2, 3, 4, 5, 6}.

Then (#eyes is odd ∧ #eyes ≥ 2) = (#eyes is odd) ∩ (#eyes ≥ 2) = {3, 5}.

8
Q

What is a measurable space, a probability measure and a probability space?

A
  • The tuple (Ω, Σ) is called a measurable space
  • A probability measure P assigns a probability P(A) to every event A ∈ Σ, with P(Ω) = 1 and countable additivity over disjoint events
  • The triple (Ω, Σ, P) is called a probability space.
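
A minimal sketch of a finite probability space, assuming a fair six-sided die; with Ω finite we can take Σ to be the power set, so P is determined by the atom probabilities:

```python
from fractions import Fraction

Omega = {1, 2, 3, 4, 5, 6}
atom = {w: Fraction(1, 6) for w in Omega}  # P({w}) for each atomic event

def P(event):
    """Probability measure: sum of atom probabilities over A ⊆ Ω."""
    assert event <= Omega
    return sum(atom[w] for w in event)

print(P({1, 3, 5}), P(Omega), P(set()))  # 1/2 1 0
```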
9
Q

Explain what exactly a random variable is in relation to an (abstract) probability space and a measurable space.

A

Let (Ω, Σ, P) be a probability space
- Ω is the sample space, which is the set of all possible outcomes of the experiment.

and let (𝒳, F) be a measurable space.
- 𝒳 is the set of all possible values that can be measured or observed.

A random variable (RV) is a map X : Ω → 𝒳, which assigns a value in the measurable space 𝒳 to each outcome in the sample space Ω. It is a way to quantify the outcome of a random experiment. We can assume that any RV X discussed is measurable, meaning that the preimage X^-1(B) of every event B ∈ F lies in Σ.
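
A minimal sketch, assuming two fair coin flips as the experiment: the RV X maps each outcome in Ω to a value in 𝒳 = {0, 1, 2}, the number of heads.

```python
Omega = {('H', 'H'), ('H', 'T'), ('T', 'H'), ('T', 'T')}

def X(omega):
    """Random variable: number of heads in the outcome omega ∈ Ω."""
    return sum(flip == 'H' for flip in omega)

for omega in sorted(Omega):
    print(omega, '->', X(omega))  # e.g. ('H', 'T') -> 1
```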

10
Q

Explain what a pushforward probability measure is.

A

When a random variable X is defined on a probability space (Ω, Σ, P), it induces a new probability measure P_X on the measurable space (𝒳, F), called the pushforward measure of P with respect to X. It is defined by P_X(B) = P(X^-1(B)) for every B ∈ F.

The pushforward measure lets us calculate the probability of events in the measurable space induced by the random variable X: each event B ∈ F is mapped back to its preimage X^-1(B) in the sample space, and the original probability measure P is evaluated on that set.

The triple (𝒳, F, P_X) is itself a probability space, which can be used to analyse the probability of events in (𝒳, F) induced by the random variable X.
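
A short worked example (assuming two fair coin flips and X = the number of heads): P_X({1}) = P(X^-1({1})) = P({(H,T), (T,H)}) = 2/4 = 1/2.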

11
Q

What is a Probability Mass Function in relation to the pushforward probability measures?

A

A probability mass function (PMF) is a function that assigns a probability to each value that a discrete random variable can take. It is the special case of the pushforward measure used for discrete random variables.

Given a probability space (Ω, Σ, P) and a discrete random variable X : Ω → 𝒳, the PMF is the function p(x) = P(X^-1({x})), where x is a value that X can take and X^-1({x}) = {ω ∈ Ω : X(ω) = x} is the set of outcomes on which X takes the value x. The PMF assigns a probability to each value x in the range of X, and these probabilities sum to 1 over all possible values of x.
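
A minimal sketch of a PMF via preimages (the two-coin-flip experiment with uniform atom weights 1/4 is assumed for illustration):

```python
from fractions import Fraction

Omega = {('H', 'H'), ('H', 'T'), ('T', 'H'), ('T', 'T')}
X = lambda omega: sum(flip == 'H' for flip in omega)  # number of heads

def pmf(x):
    preimage = {omega for omega in Omega if X(omega) == x}  # X^-1({x})
    return Fraction(len(preimage), len(Omega))              # uniform P

print([pmf(x) for x in (0, 1, 2)])  # [1/4, 1/2, 1/4]; the values sum to 1
```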

12
Q

What happens to a PMF when we talk about multiple variables at once?

A

When we talk about multiple variables at once, the PMF is called a joint probability mass function (JPMF). It assigns a probability to each combination of values that the random variables can jointly take.
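
A minimal sketch of a joint PMF as a dictionary over value pairs (the two binary variables and all numbers are assumed for illustration); marginalising sums out the other variable:

```python
joint = {(0, 0): 0.3, (0, 1): 0.2,   # p(X1 = x1, X2 = x2)
         (1, 0): 0.1, (1, 1): 0.4}
assert abs(sum(joint.values()) - 1.0) < 1e-9  # a JPMF sums to 1

# marginal PMF of X1: sum out X2
p_x1 = {a: sum(p for (x1, _), p in joint.items() if x1 == a) for a in (0, 1)}
print(p_x1)  # {0: 0.5, 1: 0.5}
```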

13
Q

Explain the PMF for conditional probabilities. Also provide the chain rule and apply it to a joint PMF of four events.

A

Define the conditional PMF p(X1 | X2) = p(X1, X2) / p(X2), provided p(X2) > 0.

Mathematically, if A and B are two events, the chain rule of probability is expressed as:
P(A, B) = P(A) * P(B | A)

For four events this would look like:
P(A1 , A2 , A3 , A4) = P(A1) * P(A2 | A1) * P(A3 | A1 , A2) * P(A4 | A1 , A2 , A3)

Note that the chain rule holds in general, whenever the conditioning events have positive probability; it does not require independence. If the events are independent, each conditional reduces to a marginal and the product simplifies to P(A1) * P(A2) * P(A3) * P(A4).
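
A minimal sketch verifying the two-variable chain rule cell by cell on a small joint PMF (numbers assumed for illustration; the variables here are not independent, and the rule still holds):

```python
joint = {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.4}

def p_a(a):                       # marginal P(A = a)
    return sum(p for (x, _), p in joint.items() if x == a)

def p_b_given_a(b, a):            # conditional P(B = b | A = a)
    return joint[(a, b)] / p_a(a)

for (a, b), p in joint.items():   # P(A, B) = P(A) * P(B | A) in every cell
    assert abs(p - p_a(a) * p_b_given_a(b, a)) < 1e-9
print("chain rule verified")
```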

14
Q

Define probabilistic independence.

A

Variables X1, X2 are probabilistically independent iff their joint PMF factorises:
p(X1, X2) = p(X1)p(X2).

Hence if p(X2) > 0 we then have:
p(X1 | X2)
= p(X1, X2) / p(X2)
= p(X1)p(X2) / p(X2)
= p(X1)
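
A minimal sketch that tests the factorisation condition directly (both example joints are assumed for illustration):

```python
def is_independent(joint, tol=1e-9):
    """True iff p(x1, x2) = p(x1) * p(x2) for every pair of values."""
    xs1 = {x1 for x1, _ in joint}
    xs2 = {x2 for _, x2 in joint}
    p1 = {a: sum(joint[(a, b)] for b in xs2) for a in xs1}
    p2 = {b: sum(joint[(a, b)] for a in xs1) for b in xs2}
    return all(abs(joint[(a, b)] - p1[a] * p2[b]) <= tol
               for a in xs1 for b in xs2)

print(is_independent({(0, 0): .25, (0, 1): .25, (1, 0): .25, (1, 1): .25}))  # True
print(is_independent({(0, 0): .40, (0, 1): .10, (1, 0): .10, (1, 1): .40}))  # False
```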

15
Q

Define conditional probabilistic independence.

A

Conditional independence between variables X1, X2 given X3 iff
p(X1, X2 | X3) = p(X1 | X3)p(X2 | X3)

We then write X1 ⊥⊥ X2 | X3
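
A minimal sketch checking X1 ⊥⊥ X2 | X3 on a joint built so that X1 and X2 interact only through X3 (all numbers assumed for illustration):

```python
from itertools import product

p3 = {0: 0.5, 1: 0.5}                                # p(x3)
p1_g3 = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}   # p(x1 | x3), outer key x3
p2_g3 = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.4, 1: 0.6}}   # p(x2 | x3)

joint = {(x1, x2, x3): p3[x3] * p1_g3[x3][x1] * p2_g3[x3][x2]
         for x1, x2, x3 in product((0, 1), repeat=3)}

def cond_independent(joint, tol=1e-9):
    """True iff p(x1, x2 | x3) = p(x1 | x3) * p(x2 | x3) for all values."""
    for x3 in (0, 1):
        pz = sum(p for (_, _, c), p in joint.items() if c == x3)  # p(x3)
        for x1, x2 in product((0, 1), repeat=2):
            p12 = joint[(x1, x2, x3)] / pz
            p1 = sum(joint[(x1, b, x3)] for b in (0, 1)) / pz
            p2 = sum(joint[(a, x2, x3)] for a in (0, 1)) / pz
            if abs(p12 - p1 * p2) > tol:
                return False
    return True

print(cond_independent(joint))  # True by construction
```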

16
Q

Explain what Bayesian networks (BN) are and why they are useful.

A

BNs encode conditional independencies between variables.
The core of the representation is a directed acyclic graph (DAG) G = (V, E).

Vertices (V) are given by V = {X1, . . . , Xn}.
That is, the vertices of the graph correspond to the random variables in the model.

Edges (E) indicate, roughly, “influence” of parents on children.

The joint PMF then factorises along the graph, p(X1, . . . , Xn) = ∏ p(Xi | pa(Xi)) with the product over i, where pa(Xi) denotes the parents of Xi in G (see the next card). This is what makes BNs useful: a large joint distribution can be specified by small local conditional PMFs.

(As it is hard to draw in this app, please practice drawing BNs from conditional probability tables.)

17
Q

What are conditional probability tables and how do they relate to Bayesian networks?

A

The collection of parents of Xi in G is denoted pa(Xi).

A family of conditional probability tables (CPTs) for a given DAG G = (V, E) is a map P(· | ·) that gives, for each Xi ∈ V, a conditional PMF P(Xi | pa(Xi)).

The CPTs and the DAG can usually be specified with expert knowledge.
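
A minimal sketch of a BN as a DAG plus CPTs, using the assumed illustrative structure Rain → Sprinkler and (Rain, Sprinkler) → WetGrass; all numbers are made up. The joint follows the factorisation p(R, S, W) = p(R) · p(S | R) · p(W | R, S).

```python
from itertools import product

p_rain = {True: 0.2, False: 0.8}                         # p(R), no parents
p_sprinkler = {True: {True: 0.01, False: 0.99},          # p(S | R), outer key R
               False: {True: 0.40, False: 0.60}}
p_wet_true = {(True, True): 0.99, (True, False): 0.80,   # p(W = True | R, S)
              (False, True): 0.90, (False, False): 0.00}

def joint(r, s, w):
    """Joint probability via the BN factorisation over parents."""
    pw = p_wet_true[(r, s)] if w else 1 - p_wet_true[(r, s)]
    return p_rain[r] * p_sprinkler[r][s] * pw

total = sum(joint(*v) for v in product((True, False), repeat=3))
print(total)  # ≈ 1.0: the factorised joint is a valid PMF
```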