GMM Flashcards

(28 cards)

1
Q

What is the key limitation of K-means that GMMs address?

A

K-means makes hard assignments and can’t handle overlapping clusters.

2
Q

What type of clustering does a GMM perform?

A

Soft clustering using probabilistic assignments.

3
Q

What is the basic assumption of a Gaussian Mixture Model?

A

Data is generated from a mixture of several Gaussian distributions.

4
Q

What does the mixture weight πₖ represent in GMMs?

A

The prior probability of cluster k.

5
Q

What does the term p(x | c, θ) represent in a GMM?

A

The likelihood of x given that it came from cluster c.

6
Q

What is the full expression for the probability of x in a GMM?

A

p(x) = Σₖ πₖ · N(x | μₖ, Σₖ)

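A minimal Python sketch of this formula for a toy one-dimensional, two-component mixture; the weights, means, and standard deviations below are made-up illustrative values:

```python
import numpy as np
from scipy.stats import norm

# Toy 1-D mixture with two components (illustrative, made-up parameters).
pi = np.array([0.3, 0.7])      # mixing weights pi_k (sum to 1)
mu = np.array([-2.0, 1.5])     # component means mu_k
sigma = np.array([0.5, 1.0])   # component standard deviations

def mixture_density(x):
    # p(x) = sum_k pi_k * N(x | mu_k, sigma_k^2)
    return np.sum(pi * norm.pdf(x, loc=mu, scale=sigma))

print(mixture_density(0.0))    # mixture density evaluated at x = 0
```
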
7
Q

What kind of distribution does each component in a GMM represent?

A

A multivariate Gaussian distribution.

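For reference, each component density is the standard d-dimensional Gaussian:

```latex
\[
\mathcal{N}(x \mid \mu_k, \Sigma_k)
  = \frac{1}{(2\pi)^{d/2}\,\lvert\Sigma_k\rvert^{1/2}}
    \exp\!\Big( -\tfrac{1}{2}\,(x-\mu_k)^{\top}\Sigma_k^{-1}(x-\mu_k) \Big)
\]
```
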
8
Q

What is the role of the covariance matrix in a GMM?

A

It controls the shape and orientation of each Gaussian component.

9
Q

What is the posterior probability p(c | x, θ) used for in GMMs?

A

It represents the responsibility or soft assignment of x to cluster c.

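Concretely, the responsibility follows from Bayes' rule applied to the mixture; writing qₙₖ for the responsibility of component k for point xₙ (the same quantity used in the M-step formulas below):

```latex
\[
q_{nk} \;=\; p(c = k \mid x_n, \theta)
       \;=\; \frac{\pi_k\,\mathcal{N}(x_n \mid \mu_k, \Sigma_k)}
                  {\sum_{j=1}^{K} \pi_j\,\mathcal{N}(x_n \mid \mu_j, \Sigma_j)}
\]
```
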
10
Q

What does the EM algorithm optimize in GMMs?

A

The log-likelihood of the observed data under the model.

11
Q

What is computed in the E-step of EM for GMMs?

A

Responsibilities: the posterior probabilities of each cluster for each point.

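A minimal NumPy/SciPy sketch of the E-step, assuming data X of shape (N, d) and current parameters pi (K,), mu (K, d), Sigma (K, d, d); the function and variable names are illustrative, not from the cards:

```python
import numpy as np
from scipy.stats import multivariate_normal

def e_step(X, pi, mu, Sigma):
    """Compute responsibilities q[n, k] = p(c = k | x_n, theta) for every point."""
    N, K = X.shape[0], len(pi)
    q = np.zeros((N, K))
    for k in range(K):
        # unnormalised posterior: pi_k * N(x_n | mu_k, Sigma_k)
        q[:, k] = pi[k] * multivariate_normal.pdf(X, mean=mu[k], cov=Sigma[k])
    q /= q.sum(axis=1, keepdims=True)   # normalise over components (Bayes' rule)
    return q
```
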
12
Q

What is updated in the M-step of EM?

A

The means, covariances, and mixing proportions of each Gaussian component.

13
Q

What is the formula for updating μₖ in the M-step?

A

μₖ = (Σₙ qₙₖ xₙ) / (Σₙ qₙₖ)

14
Q

What is the formula for updating Σₖ in the M-step?

A

Σₖ = (Σₙ qₙₖ (xₙ - μₖ)(xₙ - μₖ)ᵀ) / (Σₙ qₙₖ)

15
Q

What is the formula for updating πₖ in the M-step?

A

πₖ = (1/N) Σₙ qₙₖ

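Putting the update formulas from cards 13–15 together, a minimal NumPy sketch of the M-step, assuming responsibilities q of shape (N, K) produced by an E-step like the sketch above (names and shapes are illustrative):

```python
import numpy as np

def m_step(X, q):
    """Update (pi, mu, Sigma) from data X (N, d) and responsibilities q (N, K)."""
    N, d = X.shape
    Nk = q.sum(axis=0)                    # effective number of points per component
    pi = Nk / N                           # pi_k = (1/N) * sum_n q_nk
    mu = (q.T @ X) / Nk[:, None]          # mu_k = sum_n q_nk x_n / sum_n q_nk
    Sigma = np.zeros((len(Nk), d, d))
    for k in range(len(Nk)):
        diff = X - mu[k]                  # rows are x_n - mu_k
        # Sigma_k = sum_n q_nk (x_n - mu_k)(x_n - mu_k)^T / sum_n q_nk
        Sigma[k] = (q[:, k, None] * diff).T @ diff / Nk[k]
    return pi, mu, Sigma
```
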
16
Q

What does the EM algorithm guarantee?

A

That the data log-likelihood never decreases from one iteration to the next (it is monotonically non-decreasing).

17
Q

Does EM always find the global maximum of the log-likelihood?

A

No, EM finds a local maximum and is sensitive to initialization.
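A sketch of the outer EM loop built on the e_step and m_step sketches above; the tolerance, iteration cap, and convergence check are illustrative choices, and in practice several random initialisations are run because of this sensitivity:

```python
import numpy as np
from scipy.stats import multivariate_normal

def fit_gmm(X, pi, mu, Sigma, max_iter=100, tol=1e-6):
    """Alternate E- and M-steps until the log-likelihood stops improving."""
    prev_ll = -np.inf
    for _ in range(max_iter):
        q = e_step(X, pi, mu, Sigma)          # soft assignments (E-step)
        pi, mu, Sigma = m_step(X, q)          # parameter updates (M-step)
        # log-likelihood: sum_n log sum_k pi_k * N(x_n | mu_k, Sigma_k)
        dens = np.stack([pi[k] * multivariate_normal.pdf(X, mean=mu[k], cov=Sigma[k])
                         for k in range(len(pi))], axis=1)
        ll = np.log(dens.sum(axis=1)).sum()
        if ll - prev_ll < tol:                # non-decreasing, so this detects convergence
            break
        prev_ll = ll
    return pi, mu, Sigma
```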

18
Q

What does a GMM reduce to when covariances become isotropic and identical?

A

K-means clustering: when all components share the same isotropic covariance and the variance shrinks toward zero, the soft assignments become hard and EM reduces to the K-means updates.

19
Q

How does K-means relate to GMMs conceptually?

A

K-means is a special case of GMM with hard assignments and fixed variances.

20
Q

What does GMM use instead of distance for assignment?

A

Probability density functions based on Gaussian distributions.

21
Q

What type of clustering method is a GMM?

A

A generative, probabilistic clustering model.

22
Q

What kind of outputs does GMM provide for each data point?

A

Probabilities of membership in each cluster.

23
Q

What is the main advantage of GMM over K-means?

A

It can model elliptical clusters and overlapping data regions.

24
Q

Why is GMM considered more flexible than K-means?

A

Because it learns full covariance matrices and uses soft assignments.

25
Q

In which step are cluster labels assigned in GMM?

A

After computing responsibilities in the E-step.

26
Q

What does the log-likelihood function in GMM involve?

A

A log of a sum over weighted Gaussians for each data point.

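Written out, this is the objective that EM maximises:

```latex
\[
\log p(X \mid \theta) \;=\; \sum_{n=1}^{N} \log \sum_{k=1}^{K} \pi_k\,\mathcal{N}(x_n \mid \mu_k, \Sigma_k)
\]
```
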
27
Q

What does the E-step of EM depend on?

A

The current parameter estimates of the Gaussians and mixing proportions.

28
Q

What makes GMMs better suited for overlapping clusters?

A

They assign probabilities to multiple clusters instead of picking just one.