Week 4 Flashcards
(53 cards)
“What is the general purpose of Monte Carlo methods discussed in the context of a target distribution π?”
To evaluate expectations of the form Eπ(f) = ∫ f(x)π(x)dx, especially when analytical evaluation is not possible.
“According to the Strong Law of Large Numbers (SLLN), how can Eπ(f) be approximated using i.i.d. samples X1, …, Xn from π?”
Eπ(f) ≈ (1/n) * Σ[i=1 to n] f(Xi) for large n.
“In the toy example, what is the target distribution π(x) and the function f(x)?”
π(x) is the standard normal density (1/√(2π)) * e^(-x²/2), and f(x) = x.
“What is the exact value of Eπ(x) when π is the standard normal distribution?”
Eπ(x) = 0.
“What R function is mentioned for sampling from a Normal distribution?”
rnorm(n, mean, sd)
“What R function is mentioned for computing the sample mean?”
mean(x)
“What does the graph on slide 5 illustrate?”
It shows the convergence of the running sample mean xbar(s) = (1/s) * Σ[i=1 to s] xi towards the true mean (0 in this example) as s increases, illustrating the SLLN.
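A minimal R sketch of this toy example (the sample size and seed are my own choices, not from the slides):
set.seed(1)                              # for reproducibility (assumed)
n <- 10000
x <- rnorm(n, mean = 0, sd = 1)          # i.i.d. draws from the standard normal target
running_mean <- cumsum(x) / seq_len(n)   # xbar(s) for s = 1, ..., n
plot(running_mean, type = "l", xlab = "s", ylab = "running mean")
abline(h = 0, lty = 2)                   # true value E_pi(x) = 0
mean(x)                                  # Monte Carlo estimate of E_pi(x)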
“What are the two main problems mentioned that can prevent direct Monte Carlo simulation?”
1. It is not possible to sample directly from the target distribution π. 2. π is only known up to a normalizing constant.
“What is a potential solution mentioned for the problem of not being able to sample directly from π?”
Importance Sampling.
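A minimal importance-sampling sketch in R. As an assumption for illustration, the target π is again the standard normal and the proposal q is a t-distribution with 3 degrees of freedom; the estimator uses E_π(f) = E_q[f(X) π(X)/q(X)] with X ~ q:
n <- 10000
x <- rt(n, df = 3)               # draws from the proposal q (assumed t with 3 df)
w <- dnorm(x) / dt(x, df = 3)    # importance weights pi(x)/q(x)
mean(w * x)                      # estimate of E_pi(X)
mean(w * x^2)                    # estimate of E_pi(X^2)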
“What is a potential solution mentioned for both problems (cannot sample directly, unknown normalizing constant)?”
Markov Chain Monte Carlo (MCMC) Methods.
“In a typical Bayesian model setup, how are the data Y1, …, Yn related to the parameter θ?”
Conditionally independent and identically distributed given θ, following a distribution fθ: Y1, …, Yn | θ ~ iid fθ.
“How is the parameter θ itself modelled in a Bayesian setup?”
It is treated as a random variable with a prior distribution π: θ ~ π.
“According to Bayes’ Theorem (proportional form), how is the posterior distribution π(θ|y) related to the likelihood and prior?”
Posterior Distribution ∝ Likelihood × Prior Distribution, specifically π(θ|y) ∝ [Π(i=1 to n) fθ(yi)] * π(θ).
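To illustrate the proportional form, a sketch of an unnormalised log-posterior in R for an assumed Normal(θ, 1) likelihood with a Normal(0, 10²) prior (the model choices are mine, not from the slides):
log_post_unnorm <- function(theta, y) {
  loglik   <- sum(dnorm(y, mean = theta, sd = 1, log = TRUE))  # log likelihood: sum_i log f_theta(y_i)
  logprior <- dnorm(theta, mean = 0, sd = 10, log = TRUE)      # log prior: log pi(theta)
  loglik + logprior                                            # log(likelihood x prior); normalising constant omitted
}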
“What is the core idea behind MCMC algorithms for approximating a target distribution π?”
To construct a Markov Chain whose stationary distribution is the target distribution π.
“How is the expectation Eπ(f) approximated using samples X1, …, Xn from a Markov Chain with stationary distribution π?”
Using the ergodic theorem (SLLN for Markov Chains): Eπ(f) ≈ (1/n) * Σ[i=1 to n] f(Xi) for large n (after burn-in).
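A sketch of the corresponding estimator in R, where theta_chain (the MCMC output), f, and the burn-in length of 1000 are hypothetical placeholders:
burn_in <- 1000                      # burn-in length is a user choice (assumed)
kept <- theta_chain[-(1:burn_in)]    # discard the burn-in portion of the chain
mean(f(kept))                        # ergodic average approximating E_pi(f)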
“How can the posterior expectation E[θ|y] be expressed as an integral?”
E[θ|y] = ∫ θ * π(θ|y) dθ.
“How can the posterior cumulative distribution function P(θ < a | y) be expressed as an integral?”
P(θ < a | y) = ∫ I(θ < a) * π(θ|y) dθ, where I is the indicator function.
“What is the definition of the indicator function I(A)?”
I(A) = 1 if A is true, and I(A) = 0 if A is not true.
“How can the posterior expectation E[θ|y] be approximated using N samples θ(1), …, θ(N) from the posterior distribution π(θ|y)?”
E[θ|y] ≈ (1/N) * Σ[i=1 to N] θ(i).
“How can the posterior probability P(θ < a | y) be approximated using N samples θ(1), …, θ(N) from the posterior distribution π(θ|y)?”
P(θ < a | y) ≈ (1/N) * Σ[i=1 to N] I(θ(i) < a).
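In R, with theta_draws a hypothetical vector holding the N posterior samples, these two approximations are:
mean(theta_draws)        # approximates E[theta | y]
a <- 1.5                 # example threshold (assumed)
mean(theta_draws < a)    # approximates P(theta < a | y) via the average of the indicator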
“What are the two specific MCMC algorithms mentioned?”
The Gibbs sampler and the Metropolis-Hastings sampler.
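A minimal random-walk Metropolis sketch in R (a standard symmetric-proposal special case of Metropolis-Hastings; the unnormalised standard normal target, starting value, and proposal scale are assumptions for illustration):
log_target <- function(x) -x^2 / 2                # log target, known only up to a constant
n_iter <- 10000
chain <- numeric(n_iter)
chain[1] <- 0                                     # starting value (assumed)
for (t in 2:n_iter) {
  prop <- rnorm(1, mean = chain[t - 1], sd = 1)   # symmetric random-walk proposal
  log_alpha <- log_target(prop) - log_target(chain[t - 1])
  if (log(runif(1)) < log_alpha) {
    chain[t] <- prop                              # accept the proposal
  } else {
    chain[t] <- chain[t - 1]                      # reject: keep the current state
  }
}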
“What is the goal of the Gibbs Sampler in the context of a random vector (X1, …, Xd) with density π?”
To generate samples from the joint density π by iteratively sampling from the full conditional distributions.
“What is the output of the Gibbs sampler algorithm?”
A d-dimensional Markov chain {(X1(n), …, Xd(n)), n = 0, 1, …} whose distribution converges to π.
“How is the j-th full conditional density πj(y | xi, i ≠ j) defined?”
πj(y | xi, i ≠ j) = π(x1, …, xj-1, y, xj+1, …, xd) / π(x-j), where π(x-j) is the marginal density of the remaining components (x1, …, xj-1, xj+1, …, xd), obtained by integrating out the j-th component.
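A Gibbs sampler sketch in R for a bivariate standard normal with correlation rho (a standard textbook example, assumed here rather than taken from the slides), where each full conditional is a univariate normal, Xj | X-j = x ~ N(rho * x, 1 - rho^2):
rho <- 0.8                                   # correlation (assumed)
n_iter <- 5000
x <- matrix(0, nrow = n_iter, ncol = 2)      # chain of (X1, X2); start at (0, 0)
for (t in 2:n_iter) {
  # sample X1 from its full conditional given the current X2
  x[t, 1] <- rnorm(1, mean = rho * x[t - 1, 2], sd = sqrt(1 - rho^2))
  # sample X2 from its full conditional given the new X1
  x[t, 2] <- rnorm(1, mean = rho * x[t, 1], sd = sqrt(1 - rho^2))
}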