2) Entropy and Fisher Information Flashcards

1
Q

What is the surprise / Shannon information of an event

A

−log(p)
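
As a quick worked example (base-2 logarithm assumed), an event with probability p = 1/8 has surprise

-\log_2(1/8) = 3 \text{ bits}

while a certain event (p = 1) has surprise -\log(1) = 0.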

2
Q

What is the log-odds ratio

A
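
The answer is blank here; in this context the term most likely refers to the log-odds of an event with probability p (an assumption, not the card's own wording):

\log \frac{p}{1 - p}

If instead two probabilities p_1 and p_2 are being compared, the log-odds ratio is \log \frac{p_1 / (1 - p_1)}{p_2 / (1 - p_2)}.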
3
Q

What is Entropy

A

A quantitative measure of how spread out a distribution's probability mass is. Low entropy means the probability mass is concentrated.

4
Q

What is the Shannon Entropy of the Distribution P

A
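
The answer is blank here; the standard definition for a discrete distribution P with pmf p(x) (notation assumed) is

H(P) = -\sum_{x} p(x) \log p(x) = E_{P}[-\log p(X)]

i.e. the expected surprise of an outcome drawn from P.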
5
Q

What are the Bounds for Shannon Entropy

A
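
The answer is missing; for a discrete distribution over K possible outcomes the standard bounds are

0 \le H(P) \le \log K

The lower bound is attained by a point mass (all probability on one outcome), the upper bound by the discrete uniform distribution.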
6
Q

What is the difference between Shannon Entropy and Differential Entropy

A
  • Shannon entropy is only defined for discrete random variables
  • Differential Entropy results from applying the definition of Shannon entropy to a continuous random variable
  • Differential entropy is not bounded below by zero and can be negative (see the example below)
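
As a small illustration (the uniform density here is chosen for the example, not taken from the card): for X \sim \mathrm{Uniform}(0, 1/2) the density is f(x) = 2 on that interval, so

h(X) = -\int_{0}^{1/2} 2 \log 2 \, dx = -\log 2 < 0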
7
Q

What is Differential Entropy

A
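
The answer is missing; the standard definition for a continuous random variable with density f (notation assumed) is

h(F) = -\int f(x) \log f(x) \, dx = E_{F}[-\log f(X)]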
8
Q

What does the size of the entropy imply

A

Large entropy implies that the distribution is spread out, whereas small entropy means the distribution is concentrated

9
Q

What are maximum entropy distributions and what do they signify about a random variable

A

Maximum entropy distributions are considered minimally informative: the distribution is highly spread out and therefore says as little as possible about the random variable's outcomes.
Key examples include:
* The discrete uniform distribution, which has maximum entropy among all discrete distributions on a fixed finite set of outcomes.
* The normal distribution, the maximum entropy distribution for continuous variables on (−∞, ∞) with specified mean and variance (its entropy is given below).
* The exponential distribution, the maximum entropy distribution for continuous variables on [0, ∞) with a given mean.
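
For reference, a standard result not stated on the card: the differential entropy of a normal distribution with variance \sigma^2 is (in nats)

h = \frac{1}{2} \log(2 \pi e \sigma^2)

so no density on (−∞, ∞) with that variance can have higher entropy.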

10
Q

What is Cross-Entropy

A
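
The answer is blank here; the usual definition, with F the reference distribution and G the approximating distribution with pmf/density g (notation assumed), is

H(F, G) = E_{F}[-\log g(X)]

i.e. the expected surprise under G of outcomes actually drawn from F.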
11
Q

What is the Cross-Entropy of Discrete and Continuous Variables

A
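
The answer is missing; writing f and g for the pmfs/densities of F and G (notation assumed), the two cases are

H(F, G) = -\sum_{x} f(x) \log g(x)   (discrete)

H(F, G) = -\int f(x) \log g(x) \, dx   (continuous)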
12
Q

What is the key characteristic of cross-entropy between two distributions F and G, and what does it simplify to when F = G

A
  • Cross-entropy is not symmetric with respect to F and G, because the expectation is taken with reference to F.
  • If both distributions are identical, cross-entropy reduces to Shannon entropy (discrete case) or differential entropy (continuous case): H(F, F) = H(F).
13
Q

What is Gibbs' Inequality

A
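
The answer is blank here; the standard statement (notation assumed) is

H(F, G) \ge H(F)

with equality if and only if F = G, i.e. cross-entropy is never smaller than the entropy of the reference distribution.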
14
Q

What is KL Divergence / Relative Entropy

A
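
The answer is missing; the standard definition (notation assumed) is

D_{KL}(F, G) = E_{F}\left[ \log \frac{f(X)}{g(X)} \right] = H(F, G) - H(F)

i.e. the cross-entropy of F and G minus the entropy of F; in the discrete case this equals \sum_{x} f(x) \log \frac{f(x)}{g(x)}.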
15
Q

What does KL Divergence measure

A
  • D_KL(F, G) measures the amount of information lost if G is used to approximate F.
  • If F and G are identical (and no information is lost), then D_KL(F, G) = 0.
16
Q

What are the properties of KL Divergence

A
  • Asymmetry: D_KL(F, G) ≠ D_KL(G, F), i.e. the KL divergence is not symmetric; F and G cannot be interchanged.
  • Zero Equivalence: D_KL(F, G) = 0 if and only if F = G
  • Non-negativity: D_KL(F, G) ≥ 0
  • Coordinate Invariance: D_KL(F, G) remains invariant under coordinate transformations
17
Q

How can the KL divergence be locally approximated using Expected Fisher Information

A
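
The answer is blank here; the standard second-order Taylor approximation around \theta_0 (notation assumed, with I(\theta_0) the expected Fisher information) is

D_{KL}(F_{\theta_0}, F_{\theta_0 + \varepsilon}) \approx \frac{1}{2} \, \varepsilon^{T} I(\theta_0) \, \varepsilon

The first-order term vanishes because the KL divergence is minimised at \varepsilon = 0.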
18
Q

Define Expected Fisher Information

A
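
The answer is missing; the standard scalar-parameter definition (notation assumed) is

I(\theta) = E\left[ \left( \frac{\partial}{\partial \theta} \log f(X; \theta) \right)^{2} \right] = -E\left[ \frac{\partial^{2}}{\partial \theta^{2}} \log f(X; \theta) \right]

where the expectation is taken over X \sim f(x; \theta) and the second equality holds under the usual regularity conditions.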
19
Q

What is the general form of the Expected Fisher Information matrix for models with multiple parameters, including its gradient

A
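
The answer is blank here; for a parameter vector \theta, the standard form in terms of the gradient of the log-likelihood (notation assumed) is

I(\theta) = E\left[ \nabla_{\theta} \log f(X; \theta) \, \nabla_{\theta} \log f(X; \theta)^{T} \right] = -E\left[ \nabla_{\theta}^{2} \log f(X; \theta) \right]

i.e. the expected outer product of the score vector, which equals minus the expected Hessian under the usual regularity conditions.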