Deepfake Flashcards

(26 cards)

1
Q

How do you use training to remove noise in an image?

A
  1. Use original, high resolution image as ground truth
  2. Corrupt the original and use as input
  3. Run neural network to generate a prediction
  4. Compare prediction to ground truth
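
The four steps above can be sketched in plain Python. This is a minimal illustration, not the method of any particular model: the function names `add_noise` and `squared_error` are made up for this example, Gaussian noise is just one assumed way to corrupt the image, and the "prediction" is a placeholder because no trained network is included.

```python
import random

def add_noise(image, sigma=25.0, seed=0):
    """Corrupt a grayscale image (rows of 0-255 values) with Gaussian noise."""
    rng = random.Random(seed)
    return [
        [min(255.0, max(0.0, pixel + rng.gauss(0.0, sigma))) for pixel in row]
        for row in image
    ]

def squared_error(prediction, ground_truth):
    """Sum of squared pixel-wise differences between two same-sized images."""
    return sum(
        (p - g) ** 2
        for pred_row, truth_row in zip(prediction, ground_truth)
        for p, g in zip(pred_row, truth_row)
    )

ground_truth = [[10, 200, 30], [40, 50, 60]]    # step 1: clean image
noisy_input = add_noise(ground_truth)            # step 2: corrupted input
prediction = noisy_input                         # step 3: placeholder for model(noisy_input)
loss = squared_error(prediction, ground_truth)   # step 4: compare to ground truth
```

A real network would replace the placeholder in step 3, and `loss` would be used to adjust its weights.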
2
Q

Origin of generative AI

A

trying to de-noise images

3
Q

What type of learning is generative AI?

A

Unsupervised learning

4
Q

What is downsampling and how is it used?

A

Reducing the number of pixels (used for corrupting images) – You get the mean of all pixels within a block and make it the new color in one pixel
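
The block-averaging idea can be shown with a short sketch. It assumes a grayscale image stored as a list of rows and a size that divides evenly into blocks; the function name `downsample` is chosen for this example only.

```python
def downsample(image, block=2):
    """Shrink a grayscale image by averaging each block x block tile into one pixel.

    Assumes the image height and width are multiples of `block`.
    """
    height, width = len(image), len(image[0])
    result = []
    for top in range(0, height, block):
        row = []
        for left in range(0, width, block):
            tile = [
                image[y][x]
                for y in range(top, top + block)
                for x in range(left, left + block)
            ]
            row.append(sum(tile) / len(tile))  # mean of the tile becomes one pixel
        result.append(row)
    return result

image = [
    [0, 2, 10, 10],
    [4, 6, 10, 10],
    [1, 1, 8, 0],
    [1, 1, 0, 0],
]
small = downsample(image)  # 4x4 -> 2x2: [[3.0, 10.0], [1.0, 2.0]]
```

The 2x2 result is the corrupted input; the original 4x4 image stays as the ground truth.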

5
Q

Is 8K resolution likely to happen?

A

The prediction is that 8K won’t happen because there isn’t a need for it

6
Q

Image inpainting

A

The task of filling in holes of an image

7
Q

Explain the process for training AI with image inpainting

A

-Train with the actual image as the ground truth; the input is the same image with parts blocked out
-Corrupt the image by adding white blocks
-The model never sees the ground truth, but you compare its prediction against the ground truth
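
A minimal sketch of the corruption step, assuming a grayscale image stored as a list of rows and a single square hole; `block_out` and `WHITE` are names invented for this example.

```python
WHITE = 255

def block_out(image, top, left, size):
    """Return a copy of the image with a size x size white square painted on it."""
    corrupted = [row[:] for row in image]  # copy so the ground truth is untouched
    for y in range(top, min(top + size, len(image))):
        for x in range(left, min(left + size, len(image[0]))):
            corrupted[y][x] = WHITE
    return corrupted

ground_truth = [
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [90, 100, 110, 120],
]
model_input = block_out(ground_truth, top=0, left=1, size=2)
# model_input: [[10, 255, 255, 40], [50, 255, 255, 80], [90, 100, 110, 120]]
```

Only `model_input` is fed to the network; `ground_truth` is kept aside to score the prediction.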

8
Q

Reconstruction error

A

Get the error of each individual pixel, then add all the pixel-wise errors together

If you simply add the signed differences, positive and negative errors can cancel out and the total can come to 0
The way to solve this problem is to square the differences so they’ll never be negative – this new error rate is the message sent back to the neural network so it can adjust its weights
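
A small worked example of why the differences are squared. The pixel values are made up, and the images are flattened to simple lists to keep the sketch short.

```python
def signed_error(prediction, ground_truth):
    """Sum of raw differences; positive and negative errors can cancel."""
    return sum(p - g for p, g in zip(prediction, ground_truth))

def squared_error(prediction, ground_truth):
    """Sum of squared differences; every wrong pixel adds to the total."""
    return sum((p - g) ** 2 for p, g in zip(prediction, ground_truth))

ground_truth = [100, 100, 100, 100]
prediction = [110, 90, 105, 95]  # every pixel is wrong

print(signed_error(prediction, ground_truth))   # 0   (10 - 10 + 5 - 5)
print(squared_error(prediction, ground_truth))  # 250 (100 + 100 + 25 + 25)
```

The signed sum reports no error even though every pixel is off, while the squared sum does not.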

9
Q

How would you corrupt an image for training a neural network to remove watermarks?

A

Collect a non-watermarked image as the ground truth and corrupt it by adding a watermark to create the input image

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Explain the Midjourney zoom out feature and how it is trained.

A

-the model zooms out of an image by adding more to the original image

Training:
-Collect a lot of images for ground truth
-Corrupt the image by zooming in (input)
-Run the input through the neural network
-Compare the output to the ground truth
-Calculate the error rate
-Model takes the error rate to adjust weights
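
The "corrupt by zooming in" step could look like the sketch below. It only follows the card's description; how Midjourney actually prepares its training data is not stated in the source, and `center_crop` is a name chosen for this example.

```python
def center_crop(image, crop_h, crop_w):
    """Keep only the central crop_h x crop_w region, i.e. zoom in on the middle."""
    top = (len(image) - crop_h) // 2
    left = (len(image[0]) - crop_w) // 2
    return [row[left:left + crop_w] for row in image[top:top + crop_h]]

ground_truth = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
zoomed_in = center_crop(ground_truth, 2, 2)  # model input: [[6, 7], [10, 11]]
```

The network receives `zoomed_in` and has to predict the surrounding border that the full `ground_truth` contains.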

11
Q

How does auto-encoding work?

A

-Neural network is wide on the outside and narrow in the middle
-The input and output layer have roughly the same number of neurons because you want as many pixels as possible in an image, which helps to generate a high resolution image

12
Q

In terms of auto-encoding, what do you need in order to improve image resolution?

A

You’d want more neurons in the output layer

13
Q

What does each neuron represent in auto-encoding?

A

In the input and output layers, each neuron represents a pixel value

14
Q

Why does the middle need to be narrow for autoencoding?

A

When it becomes narrow, you’re forcing the model to learn what the code of an image is

An autoencoder forces the neural network to compress an image into a short code, from which the original image can be regenerated

15
Q

What is an autoencoder?

A

the combination of a discriminative network (encoding) and a generative network (decoding)

16
Q

What does the code of an autoencoder include?

A

Latent attributes

E.g. encoding a face; the latent attributes may be the smile, skin tone, gender, beard, glasses, and hair color

17
Q

How might you edit a person’s smile (or something else) in an image?

A

You tamper with the code - specifically, in this case, the smile code

Tampering with the code is how you get a deepfake
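
A toy illustration of tampering with one part of the code. It assumes the code can be read as named attributes, which is a simplification: real autoencoder codes are unlabeled vectors of numbers, and the attribute names and values below are made up.

```python
# Hypothetical latent code for one face; attribute names and values are made up.
face_code = {"smile": 0.1, "skin_tone": 0.6, "glasses": 0.0, "hair_color": 0.3}

def edit_code(code, attribute, value):
    """Return a copy of the code with one latent attribute changed."""
    edited = dict(code)
    edited[attribute] = value
    return edited

smiling_code = edit_code(face_code, "smile", 0.9)
# A trained decoder (not shown) would turn smiling_code back into an image.
```

Only the smile entry changes; the rest of the code, and so the rest of the face, is left as it was.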

18
Q

What is the key to training an autoencoder?

A

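To learn a disentangled representation, where each part of the code controls only one attribute

For example, age may be entangled with gender, so changing one would also change the other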

19
Q

Where do you put the encoder for neural networks?

A

From the input to the code you put the encoder, from the code to the output you put the decoder
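
The layout can be sketched with two small fully connected layers. This is only a shape illustration under simplifying assumptions: the weights are random and untrained, there are no biases or activation functions, and the sizes (16 pixels, a code of 3 numbers) and names `make_layer` and `forward` are arbitrary choices for this example.

```python
import random

def make_layer(n_in, n_out, rng):
    """Random weights for a fully connected layer with n_in inputs and n_out outputs."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]

def forward(layer, values):
    """Weighted sum per output neuron (no bias or activation, to keep the sketch short)."""
    return [sum(w * v for w, v in zip(weights, values)) for weights in layer]

rng = random.Random(0)
N_PIXELS, CODE_SIZE = 16, 3                      # wide on the outside, narrow in the middle

encoder = make_layer(N_PIXELS, CODE_SIZE, rng)   # input -> code
decoder = make_layer(CODE_SIZE, N_PIXELS, rng)   # code -> output

pixels = [rng.random() for _ in range(N_PIXELS)]
code = forward(encoder, pixels)                  # compressed representation
reconstruction = forward(decoder, code)          # same size as the input
```

Training would adjust both sets of weights until `reconstruction` is close to `pixels`.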

20
Q

How has deepfake been used in Hollywood?

A

De-aging in The Irishman (2019): used deepfake tech to make the actor appear younger (altering the age code)

Creating deepfakes by swapping the codes of different faces (facial identity is merely a combination of codes)

21
Q

How can AI be used to reconstruct or digitize voice?

A

-Can reconstruct voice or remove noise

-Every time you speak, you change the air pressure; the pressure wave hits the sensor, which records it as numbers that represent air pressure
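
To make "recording numbers that represent air pressure" concrete, the sketch below samples a pure tone instead of a real voice. The sample rate, the 440 Hz tone, and the function name `sample_tone` are assumptions made for this example.

```python
import math

def sample_tone(frequency_hz, duration_s, sample_rate=8000):
    """Sample a pure tone: one number per time step, standing in for air pressure."""
    n_samples = round(duration_s * sample_rate)
    return [
        math.sin(2 * math.pi * frequency_hz * i / sample_rate)
        for i in range(n_samples)
    ]

samples = sample_tone(frequency_hz=440, duration_s=0.01)  # 80 numbers between -1 and 1
```

A real recording is the same kind of list, just longer and with a more complicated shape.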

22
Q

How might you build a neural network to remove noise in audio?

A

The ground truth is the actual recording. Distort it by adding noise and use that as the input. By comparing the prediction with the ground truth, the biases and weights attached to the numbers representing air pressure are adjusted
The key elements of the voice are condensed into a code
Components might be pitch, frequency, accent, etc.

23
Q

Why do AI voices fail to connect with the listener?

A

-Lack of emotionality
-Monotone (This is an inaccurate stereotype; all things considered “human” can be based on statistics)

24
Q

How is AI image synchronized with voice?

A

A neural network where some neurons take the text script and some neurons handle the image, combined into one input layer
The output layer has a bunch of neurons corresponding to both the image and the voice
Take the original voice, extract the code, and change the voice or mouth code

E.g. removing a swear word from a movie scene, or changing the language

This is a very hard task. There has been significant progress, and the human image is now very realistic, though still a bit rigid

25
Q

Why has deepfake’s impact on society been limited?

A

When things get close to real humans but aren’t quite there, people hate it.

On the graph, the Y axis represents how much people enjoy the AI and the X axis represents how realistic it is. The more realistic it becomes, the more people enjoy it. But when it looks human but feels unnatural, people don’t like it.

26
Q

Uncanny valley meaning

A

When AI looks human but feels unnatural