EM Flashcards

(27 cards)

1
Q

What problem does Expectation Maximization (EM) solve?

A

It optimizes likelihood in models with hidden or latent variables.

2
Q

What is the basic idea behind EM?

A

Alternating between estimating hidden variables and optimizing model parameters.

3
Q

What are the two main steps in EM?

A

E-step (Expectation) and M-step (Maximization).

4
Q

What happens in the E-step of EM?

A

Compute the posterior distribution of the latent variables given current parameters.

5
Q

What happens in the M-step of EM?

A

Update parameters to maximize the expected complete-data log-likelihood.

6
Q

Why can’t we directly maximize the likelihood in latent variable models?

A

Because the log of a sum over hidden variables doesn’t simplify easily.

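A concrete instance (added here for illustration, not part of the original deck): for a GMM the log-likelihood has exactly this log-of-a-sum shape,

log p(X | θ) = Σₙ log Σc πc · N(xₙ | μc, Σc)

The sum over components sits inside the log, so the expression does not split into per-component terms with closed-form maximizers.
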
7
Q

What is the name of the function that EM maximizes as a lower bound?

A

The Evidence Lower Bound (ELBO) or free energy.

8
Q

Why does EM increase the log-likelihood at every iteration?

A

Because the E-step makes the lower bound tight at the current parameters and the M-step then increases that bound, so the log-likelihood cannot decrease.

9
Q

When does the ELBO become tight (equal to the log-likelihood)?

A

When the approximate posterior matches the true posterior.

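A worked decomposition (added for illustration) that ties cards 7-9 together: for any distribution q over the latent variables z,

log p(x | θ) = ELBO(q, θ) + KL(q(z) ‖ p(z | x, θ))

Because KL ≥ 0, the ELBO is a lower bound, and it equals the log-likelihood exactly when q is the true posterior, which is what the E-step sets q to.
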
10
Q

What distribution is commonly used in the E-step for GMMs?

A

The posterior responsibilities: p(c | x, θ).

11
Q

What does qₙc represent in GMM EM?

A

The probability that data point xₙ belongs to cluster c.

12
Q

What is the formula for qₙc in the E-step of GMMs?

A

qₙc = πc · N(xₙ | μc, Σc) / Σk πk · N(xₙ | μk, Σk)

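A minimal NumPy sketch of this E-step (illustrative only; the names X, pi, mu, and cov and their shapes are assumptions, not part of the deck):

```python
import numpy as np
from scipy.stats import multivariate_normal

def e_step(X, pi, mu, cov):
    """Responsibilities q[n, c] = p(c | x_n, theta) for a GMM.

    X: (N, D) data, pi: (K,) mixing weights, mu: (K, D) means, cov: (K, D, D) covariances.
    """
    N, K = X.shape[0], pi.shape[0]
    q = np.zeros((N, K))
    for c in range(K):
        # numerator: pi_c * N(x_n | mu_c, Sigma_c)
        q[:, c] = pi[c] * multivariate_normal.pdf(X, mean=mu[c], cov=cov[c])
    # normalize over components so each row sums to 1
    q /= q.sum(axis=1, keepdims=True)
    return q
```
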
13
Q

What is updated during the M-step of GMM EM?

A

The means, covariances, and mixing coefficients of the Gaussians.

14
Q

What is the formula for updating μc in GMM EM?

A

μc = Σₙ qₙc xₙ / Σₙ qₙc

15
Q

What is the formula for updating Σc in GMM EM?

A

Σc = Σₙ qₙc (xₙ - μc)(xₙ - μc)ᵀ / Σₙ qₙc

16
Q

What is the formula for updating πc in GMM EM?

A

πc = (1/N) Σₙ qₙc

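Putting cards 13-16 together, a hedged NumPy sketch of one M-step, given the responsibilities q produced by the E-step sketch above (variable names are illustrative):

```python
import numpy as np

def m_step(X, q):
    """Update GMM parameters from data X (N, D) and responsibilities q (N, K)."""
    N, D = X.shape
    K = q.shape[1]
    Nk = q.sum(axis=0)                      # effective number of points per component
    pi = Nk / N                             # pi_c  = (1/N) * sum_n q_nc
    mu = (q.T @ X) / Nk[:, None]            # mu_c  = sum_n q_nc x_n / sum_n q_nc
    cov = np.zeros((K, D, D))
    for c in range(K):
        diff = X - mu[c]
        # Sigma_c = sum_n q_nc (x_n - mu_c)(x_n - mu_c)^T / sum_n q_nc
        cov[c] = (q[:, c, None] * diff).T @ diff / Nk[c]
    return pi, mu, cov
```

Iterating e_step and m_step until the log-likelihood stops improving is the full GMM EM loop described in cards 2-5.
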
17
Q

What type of model is a Gaussian Mixture Model (GMM)?

A

A probabilistic generative model with latent variables.

18
Q

How is EM related to K-means?

A

K-means is a limiting case of GMM with hard assignments and small variance.

19
Q

When does GMM reduce to K-means?

A

When the component covariances are isotropic (σ²I) and σ² approaches zero, so the soft responsibilities collapse to hard assignments.

20
Q

What kind of assignment does K-means make?

A

Hard assignments (each point to one cluster).

21
Q

What kind of assignment does EM in GMM make?

A

Soft assignments using probabilities.

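A tiny self-contained snippet (illustrative, not from the deck) contrasting the two assignment styles in cards 20 and 21:

```python
import numpy as np

# Example responsibilities for 3 points and 2 clusters (rows sum to 1).
q = np.array([[0.9, 0.1],
              [0.4, 0.6],
              [0.5, 0.5]])

soft = q                      # GMM EM keeps the full probability row per point
hard = q.argmax(axis=1)       # K-means collapses each row to one cluster index
print(hard)                   # -> [0 1 0]
```
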
22
Q

What type of optimization method is EM?

A

An iterative coordinate ascent on a lower bound of the log-likelihood.

23
Q

Why is the log-likelihood hard to compute directly in GMMs?

A

Because it involves the log of a sum over components.

24
Q

What is a key requirement for EM to work?

A

That we can compute the posterior and maximize the expected log-likelihood.

25

Q

Can EM get stuck in local optima?

A

Yes, EM converges to a local maximum and is sensitive to initialization.

26

Q

What is the benefit of using EM in probabilistic PCA?

A

It allows inference with missing data and fits the model probabilistically.

27

Q

What is the generative model in probabilistic PCA?

A

x = Wz + μ + ε, where z ~ N(0, I) is a latent variable and ε is isotropic Gaussian noise.
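
A minimal sketch of sampling from this generative model (the dimensions, variable names, and noise variance sigma2 are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
D, d, N = 5, 2, 100                        # data dim, latent dim, sample count
W = rng.normal(size=(D, d))                # factor loading matrix
mu = rng.normal(size=D)                    # data mean
sigma2 = 0.1                               # isotropic noise variance

z = rng.normal(size=(N, d))                             # z ~ N(0, I)
noise = rng.normal(scale=np.sqrt(sigma2), size=(N, D))  # noise ~ N(0, sigma2 * I)
x = z @ W.T + mu + noise                                # x = W z + mu + noise
```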