2) Entropy and Fisher Information Flashcards

1
Q

What is the surprise / Shannon information of an event

A

−log(p)
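
As a quick worked example (base-2 logarithm assumed), an event with probability p = 1/8 has surprise

-\log_2(1/8) = 3 \text{ bits}

while a certain event (p = 1) has surprise -\log(1) = 0.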

2
Q

What is the log-odds ratio

A
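
The answer is blank here; in this context the term most likely refers to the log-odds of an event with probability p (an assumption, not the card's own wording):

\log \frac{p}{1 - p}

If instead two probabilities p_1 and p_2 are being compared, the log-odds ratio is \log \frac{p_1 / (1 - p_1)}{p_2 / (1 - p_2)}.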
3
Q

What is Entropy

A

A quantitative measure of how spread out a distribution's probability mass is. Low entropy means the probability mass is concentrated.

4
Q

What is the Shannon Entropy of the Distribution P

A
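
The answer is blank here; the standard definition for a discrete distribution P with pmf p(x) (notation assumed) is

H(P) = -\sum_{x} p(x) \log p(x) = E_{P}[-\log p(X)]

i.e. the expected surprise of an outcome drawn from P.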
5
Q

What are the Bounds for Shannon Entropy

A
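
The answer is missing; for a discrete distribution over K possible outcomes the standard bounds are

0 \le H(P) \le \log K

The lower bound is attained by a point mass (all probability on one outcome), the upper bound by the discrete uniform distribution.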
6
Q

What is the difference between Shannon Entropy and Differential Entropy

A
  • Shannon entropy is only defined for discrete random variables
  • Differential Entropy results from applying the definition of Shannon entropy to a continuous random variable
  • Differential entropy is not bounded below by zero and can be negative (see the example below)
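
As a small illustration (the uniform density here is chosen for the example, not taken from the card): for X \sim \mathrm{Uniform}(0, 1/2) the density is f(x) = 2 on that interval, so

h(X) = -\int_{0}^{1/2} 2 \log 2 \, dx = -\log 2 < 0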
7
Q

What is Differential Entropy

A
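
The answer is missing; the standard definition for a continuous random variable with density f (notation assumed) is

h(F) = -\int f(x) \log f(x) \, dx = E_{F}[-\log f(X)]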
8
Q

What does the size of the entropy imply

A

Large entropy implies that the distribution is spread out, whereas small entropy means the distribution is concentrated

9
Q

What are maximum entropy distributions and what do they signify about a random variable

A

Maximum entropy distributions are considered minimally informative: the distribution is highly spread out and therefore says as little as possible about the random variable's outcomes.
Key examples include:
* The discrete uniform distribution, which has maximum entropy among all discrete distributions on a fixed finite set of outcomes.
* The normal distribution, the maximum entropy distribution for continuous variables on (−∞, ∞) with specified mean and variance (its entropy is given below).
* The exponential distribution, the maximum entropy distribution for continuous variables on [0, ∞) with a given mean.
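
For reference, a standard result not stated on the card: the differential entropy of a normal distribution with variance \sigma^2 is (in nats)

h = \frac{1}{2} \log(2 \pi e \sigma^2)

so no density on (−∞, ∞) with that variance can have higher entropy.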

10
Q

What is Cross-Entropy

A
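
The answer is blank here; the usual definition, with F the reference distribution and G the approximating distribution with pmf/density g (notation assumed), is

H(F, G) = E_{F}[-\log g(X)]

i.e. the expected surprise under G of outcomes actually drawn from F.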
11
Q

What is the Cross-Entropy of Discrete and Continuous Variables

A
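
The answer is missing; writing f and g for the pmfs/densities of F and G (notation assumed), the two cases are

H(F, G) = -\sum_{x} f(x) \log g(x)   (discrete)

H(F, G) = -\int f(x) \log g(x) \, dx   (continuous)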
12
Q

What is the key characteristic of cross-entropy between two distributions F and G, and what does it simplify to when F = G

A
  • Cross-entropy is not symmetric with respect to F and G, because the expectation is taken with reference to F.
  • If both distributions are identical, cross-entropy reduces to Shannon entropy (discrete case) or differential entropy (continuous case): H(F, F) = H(F).
13
Q

What is Gibbs' Inequality

A
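
The answer is blank here; the standard statement (notation assumed) is

H(F, G) \ge H(F)

with equality if and only if F = G, i.e. cross-entropy is never smaller than the entropy of the reference distribution.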
14
Q

What is KL Divergence / Relative Entropy

A
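
The answer is missing; the standard definition (notation assumed) is

D_{KL}(F, G) = E_{F}\left[ \log \frac{f(X)}{g(X)} \right] = H(F, G) - H(F)

i.e. the cross-entropy of F and G minus the entropy of F; in the discrete case this equals \sum_{x} f(x) \log \frac{f(x)}{g(x)}.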
15
Q

What does KL Divergence measure

A
  • D_KL(F, G) measures the amount of information lost if G is used to approximate F.
  • If F and G are identical (and no information is lost), then D_KL(F, G) = 0.
16
Q

What are the properties of KL Divergence

A
  • Asymmetry: D_KL(F, G) ≠ D_KL(G, F), i.e. the KL divergence is not symmetric; F and G cannot be interchanged.
  • Zero Equivalence: D_KL(F, G) = 0 if and only if F = G
  • Non-negativity: D_KL(F, G) ≥ 0
  • Coordinate Invariance: D_KL(F, G) remains invariant under coordinate transformations
17
Q

How can the KL divergence be locally approximated using Expected Fisher Information

A
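
The answer is blank here; the standard second-order Taylor approximation around \theta_0 (notation assumed, with I(\theta_0) the expected Fisher information) is

D_{KL}(F_{\theta_0}, F_{\theta_0 + \varepsilon}) \approx \frac{1}{2} \, \varepsilon^{T} I(\theta_0) \, \varepsilon

The first-order term vanishes because the KL divergence is minimised at \varepsilon = 0.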
18
Q

Define Expected Fisher Information

A
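
The answer is missing; the standard scalar-parameter definition (notation assumed) is

I(\theta) = E\left[ \left( \frac{\partial}{\partial \theta} \log f(X; \theta) \right)^{2} \right] = -E\left[ \frac{\partial^{2}}{\partial \theta^{2}} \log f(X; \theta) \right]

where the expectation is taken over X \sim f(x; \theta) and the second equality holds under the usual regularity conditions.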
19
Q

What is the general form of the Expected Fisher Information matrix for models with multiple parameters, including its gradient

A
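
The answer is blank here; for a parameter vector \theta, the standard form in terms of the gradient of the log-likelihood (notation assumed) is

I(\theta) = E\left[ \nabla_{\theta} \log f(X; \theta) \, \nabla_{\theta} \log f(X; \theta)^{T} \right] = -E\left[ \nabla_{\theta}^{2} \log f(X; \theta) \right]

i.e. the expected outer product of the score vector, which equals minus the expected Hessian under the usual regularity conditions.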