Image Generation Flashcards

(20 cards)

1
Q

What is the primary goal of a generator in a GAN?

A

To generate images that can deceive the discriminator

2
Q

Which architecture is commonly used as the generator in Pix2Pix GAN?

A

U-Net

3
Q

In Conditional GAN (CGAN), how is the label information used?

A

It is supplied as an additional input to both the generator and the discriminator
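
One simple way to feed a label to both networks is to append a one-hot encoding to each network's input. The sketch below illustrates that idea only; the function names, vector sizes, and the choice of plain concatenation are assumptions made for illustration, and real CGANs often use learned label embeddings instead.

```python
import random

def one_hot(label, num_classes):
    """Return a one-hot list encoding of an integer class label."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

def generator_input(noise, label, num_classes):
    # Hypothetical helper: the generator sees the noise vector with the label appended.
    return noise + one_hot(label, num_classes)

def discriminator_input(image_pixels, label, num_classes):
    # Hypothetical helper: the discriminator sees the flattened image with the same label appended.
    return image_pixels + one_hot(label, num_classes)

rng = random.Random(0)
z = [rng.gauss(0.0, 1.0) for _ in range(8)]          # 8-dim noise vector (assumed size)
g_in = generator_input(z, label=3, num_classes=10)
d_in = discriminator_input([0.0] * 16, label=3, num_classes=10)  # 16-pixel dummy image
print(len(g_in), len(d_in))  # 18 26
```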

4
Q

Which problem in GAN training is characterized by the generator producing a limited variety of outputs?

A

Mode collapse

5
Q

What is the purpose of progressive growing of GANs?

A

To start with low-resolution images and gradually grow to high resolution
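
The growth schedule can be pictured as a list of resolutions that doubles at each stage. The sketch below assumes a 4x4 starting resolution and a 1024x1024 target; the card itself states neither value, and the function name is hypothetical.

```python
def resolution_schedule(start=4, final=1024):
    """List the output resolutions visited when doubling from start up to final."""
    res = start
    schedule = [res]
    while res < final:
        res *= 2  # each growth stage doubles height and width
        schedule.append(res)
    return schedule

print(resolution_schedule())  # [4, 8, 16, 32, 64, 128, 256, 512, 1024]
```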

6
Q

In a diffusion model, what happens during the forward process?

A

Noise is added gradually to the image
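
A minimal sketch of the forward process, assuming a DDPM-style linear noise schedule (1000 steps, variances from 1e-4 to 0.02) and treating the image as a short list of pixel values. These settings and all function names are illustrative assumptions, not something the card specifies. The noisy sample at step t is drawn in closed form as sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps.

```python
import math
import random

def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances, one per step (assumed schedule)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def cumulative_alphas(betas):
    """alpha_bar_t: running product of (1 - beta_s) for all s up to t."""
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out

def noisy_sample(x0, t, alpha_bars, rng):
    """Draw x_t from x_0 directly: scaled signal plus scaled Gaussian noise."""
    ab = alpha_bars[t]
    return [math.sqrt(ab) * v + math.sqrt(1.0 - ab) * rng.gauss(0.0, 1.0) for v in x0]

alpha_bars = cumulative_alphas(linear_betas(1000))
rng = random.Random(0)
x0 = [1.0, -1.0, 0.5, 0.0]                          # toy "image"
x_early = noisy_sample(x0, 10, alpha_bars, rng)    # still close to x0
x_late = noisy_sample(x0, 999, alpha_bars, rng)    # almost pure noise
```

Because alpha_bar_t shrinks toward zero as t grows, the signal term fades and the sample approaches pure Gaussian noise by the final step.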

7
Q

What is the role of CLIP in DALL-E 2’s architecture?

A

To encode images and texts into a shared embedding space

8
Q

In Stable Diffusion, which model is responsible for turning compressed latent codes back into images?

A

The decoder of the variational autoencoder (VAE)

9
Q

In the training of the prior in DALL-E 2, the model learns to map:

A

Text embeddings to image embeddings

10
Q

What does cross-attention enable in diffusion models like Stable Diffusion?

A

Controlling generation based on text prompts
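
The mechanism can be sketched as scaled dot-product attention in which the queries come from image positions and the keys and values come from text tokens. The toy vectors and function names below are assumptions for illustration; real models also apply learned projection matrices to produce queries, keys, and values, which this sketch omits.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross_attention(queries, keys, values):
    """Each image query attends over all text tokens and mixes their values."""
    scale = 1.0 / math.sqrt(len(keys[0]))
    outputs, all_weights = [], []
    for q in queries:
        weights = softmax([dot(q, k) * scale for k in keys])
        mixed = [sum(w * v[j] for w, v in zip(weights, values))
                 for j in range(len(values[0]))]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights

image_queries = [[1.0, 0.0], [0.0, 1.0]]                          # 2 image positions
text_keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]                  # 3 text tokens
text_values = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
outputs, weights = cross_attention(image_queries, text_keys, text_values)
```

Each image position ends up with a weighted mix of text-token values, which is how the prompt can steer what is generated at that position.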

11
Q

Which problems are commonly encountered when training GANs?
A) Mode collapse
B) Overfitting discriminator
C) Perfect convergence
D) Non-convergence

A

Mode collapse; Overfitting discriminator; Non-convergence

12
Q

In a diffusion model, the reverse process involves:
A) Removing noise gradually
B) Using a U-Net
C) Adding more noise each step
D) Predicting either clean images or noise

A

Removing noise gradually; Using a U-Net; Predicting either clean images or noise

13
Q

Which of the following techniques use cross-attention?
A) CLIP text-image matching
B) Stable Diffusion conditioning
C) GAN training without labels
D) DALL-E 2 conditioning

A

Stable Diffusion conditioning; DALL-E 2 conditioning

14
Q

Stable Diffusion components include:
A) Autoencoder (VAE)
B) U-Net
C) Transformer Decoder
D) Text Encoder

A

Autoencoder (VAE); U-Net; Text Encoder

15
Q

In GANs, the discriminator network is trained to:
A) Generate realistic images
B) Distinguish real from fake images
C) Provide gradients to the generator
D) Upsample noise

A

Distinguish real from fake images; Provide gradients to the generator
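
Both roles show up in the loss functions. The sketch below assumes binary cross-entropy for the discriminator and the common non-saturating loss for the generator; this is one widely used formulation, not necessarily the one the deck has in mind, and the inputs are plain probabilities rather than network outputs.

```python
import math

def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy: real images should score near 1, generated ones near 0."""
    return -(math.log(d_real) + math.log(1.0 - d_fake))

def generator_loss(d_fake):
    """Non-saturating generator loss: small when the discriminator is fooled."""
    return -math.log(d_fake)

# d_real / d_fake are the discriminator's probabilities that an input is real.
confident_d = discriminator_loss(d_real=0.9, d_fake=0.1)
confused_d = discriminator_loss(d_real=0.5, d_fake=0.5)
fooled = generator_loss(d_fake=0.9)
caught = generator_loss(d_fake=0.1)
```

The generator's loss depends only on the discriminator's score for generated images, so the discriminator's judgment is the generator's entire training signal.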

16
Q

Progressive growing of GANs helps by:
A) Speeding up training
B) Stabilizing the generator early
C) Reducing resolution at the end
D) Gradually increasing output resolution

A

Speeding up training; Stabilizing the generator early; Gradually increasing output resolution

17
Q

Which components are part of DALL-E 2’s generation pipeline?
A) Prior model
B) Diffusion model (Decoder)
C) LSTM text encoder
D) CLIP encoder

A

Prior model; Diffusion model (Decoder); CLIP encoder

18
Q

Benefits of latent space diffusion (as in Stable Diffusion) include:
A) Higher memory usage
B) Faster generation
C) Lower computation cost
D) High-resolution outputs

A

Faster generation; Lower computation cost; High-resolution outputs
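
A rough sense of the savings comes from counting how many values the denoiser processes per step. The figures below assume a 512x512 RGB image and a latent with 8x spatial downsampling and 4 channels, as in the original Stable Diffusion configuration; other models may use different settings.

```python
def num_elements(height, width, channels):
    """Count the values in a height x width x channels tensor."""
    return height * width * channels

pixel = num_elements(512, 512, 3)             # image in pixel space
latent = num_elements(512 // 8, 512 // 8, 4)  # assumed 64x64x4 latent
ratio = pixel / latent
print(pixel, latent, ratio)  # 786432 16384 48.0
```

Under these assumptions the denoising network works on 48 times fewer values per step than it would in pixel space.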

19
Q

In Super-Resolution GAN (SRGAN), during training:
A) Low-resolution images are upsampled
B) GAN loss is used
C) Noise is directly added to high-res images
D) Discriminator distinguishes between HR and SR images

A

Low-resolution images are upsampled; GAN loss is used; Discriminator distinguishes between HR and SR images

20
Q

In a diffusion model's reverse process, each denoising step depends on:
A) Previous denoising result
B) Original input image
C) Text embeddings (if conditioned)
D) Random noise injection

A

Previous denoising result; Text embeddings (if conditioned)