08 - AutoEncoders Flashcards

1
Q

What is an Autoencoder?

A

An autoencoder has two parts:

  1. Encoder f: compresses the input x into a code/latent representation h = f(x)
  2. Decoder g: tries to reproduce x from h (so optimally r ≈ x)
     r = g(h) = g(f(x))

x --f--> h --g--> r

The loss is generally L(x, g(f(x))) = L(x, r)
- use log loss (cross-entropy) to predict binary/black-and-white pixels (sigmoid outputs)
- use mean squared error to predict real-valued pixels (e.g. RGB, linear outputs)
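
A minimal sketch of this (assuming PyTorch; the layer sizes and dimensions are illustrative, not from the lecture):

```python
import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    """Minimal autoencoder sketch: encoder f and decoder g."""
    def __init__(self, input_dim=784, latent_dim=32):
        super().__init__()
        # Encoder f: compresses x into the latent code h
        self.encoder = nn.Sequential(nn.Linear(input_dim, 128), nn.ReLU(),
                                     nn.Linear(128, latent_dim))
        # Decoder g: tries to reproduce x from h
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                                     nn.Linear(128, input_dim))

    def forward(self, x):
        h = self.encoder(x)   # h = f(x)
        r = self.decoder(h)   # r = g(h) = g(f(x))
        return r

model = AutoEncoder()
x = torch.rand(16, 784)               # dummy batch
r = model(x)
loss = nn.functional.mse_loss(r, x)   # MSE for real-valued pixels (linear outputs)
# For binary pixels, use sigmoid outputs + binary cross-entropy instead:
# loss = nn.functional.binary_cross_entropy_with_logits(r, x)
```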

2
Q

What are Overcomplete and Undercomplete autoencoders?

A

Overcomplete Autoencoder:

  • more components in the latent code than in the input
  • this can be difficult to train and g(f(x)) can end up just becoming the identity function (which we do not want)
    • that’s why we need the “special tricks” like regularization and others that we’ll get to later

Undercomplete Autoencoder:

  • generally what we want/more interesting
  • compression/classification without labels, no need for annotating data
  • a non-linear PCA machine
  • the algorithm is forced to learn the best features in the data
  • this could e.g. make sense for storage: you store the encoded versions and decode/decompress when you need them (saves memory space)
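
In code terms, the distinction is only the size of the latent code relative to the input; a short sketch reusing the hypothetical AutoEncoder class from the previous card:

```python
# Undercomplete: latent code smaller than the input (forced compression)
under = AutoEncoder(input_dim=784, latent_dim=32)

# Overcomplete: latent code larger than the input (risks learning the identity
# function unless regularized, e.g. with sparsity or noise)
over = AutoEncoder(input_dim=784, latent_dim=2048)
```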
3
Q

SAE - Sparse Autoencoder

A

SAEs can be viewed as MAP inference where we put a prior on the hidden units (the latent code) instead of the usual prior on the weights.

$L(x,r)+\Omega(h)$, where $\Omega(h)$ is a sparsity penalty on the code, e.g. the L1 norm $\lambda\sum_i |h_i|$
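
A minimal sketch of such a loss (assuming PyTorch and an L1 penalty $\Omega(h)=\lambda\sum_i|h_i|$; the weight lam is illustrative):

```python
import torch
import torch.nn as nn

def sparse_ae_loss(x, r, h, lam=1e-3):
    """Reconstruction loss plus a sparsity penalty Omega(h) on the hidden code."""
    recon = nn.functional.mse_loss(r, x)
    omega = lam * h.abs().sum(dim=1).mean()   # L1 prior on the hiddens
    return recon + omega

# Usage with the AutoEncoder sketch from card 1:
# h = model.encoder(x); r = model.decoder(h)
# loss = sparse_ae_loss(x, r, h)
```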

4
Q

DAE - Denoising Autoencoder

A

For neural nets it is often good to add noise to the input while keeping the labels, because it makes the net much more robust to noise.

DAEs are trained by corrupting the input to $\tilde{x}$ while the reconstruction target stays the clean $x$: the loss is $L(x, g(f(\tilde{x})))$.

An example from class showed that without noise the learned filters are badly overfit, whereas with e.g. 50% corruption the filters become edge detectors for specific orientations.

There was also the manifold example: plain AEs do not know what to do with data off the manifold, whereas DAEs learn how to map such points back onto the manifold.
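
A minimal training-step sketch (assuming PyTorch; additive Gaussian corruption is just one choice, masking noise works as well):

```python
import torch
import torch.nn as nn

def dae_loss(model, x, noise_std=0.5):
    """Corrupt the input, but compare the reconstruction to the clean x."""
    x_tilde = x + noise_std * torch.randn_like(x)   # corrupted input
    r = model(x_tilde)                              # r = g(f(x_tilde))
    return nn.functional.mse_loss(r, x)             # target is the clean x

# loss = dae_loss(model, x)   # with the AutoEncoder sketch from card 1
```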

5
Q

CAE - Contractive Auto Encoder

A

Contractive AEs try to achieve insensitivity to input perturbations (small changes in the input) directly on the hidden code. A new penalty term is added to the autoencoder's loss function.

→ Squared (Frobenius) norm of the Jacobian of the hidden code with respect to the input, $\Omega(h) = \lambda \left\| \frac{\partial f(x)}{\partial x} \right\|_F^2$; it is called a penalty on the speed of change of h.

So CAEs try to learn denoising-like robustness without actually adding noise to the images, though it can be argued that simply adding noise (as in DAEs) is easier.

Differences between CAE and DAE:
- CAE encourages robustness of the representation f(x), while DAE encourages robustness of the reconstruction (which only partially increases the robustness of the representation)
- DAE gains its robustness stochastically, by training on corrupted inputs, whereas CAE gains it analytically, by penalizing the Jacobian (first derivative) of the encoder
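
A minimal sketch of the penalty (assuming PyTorch and a single sigmoid encoder layer, for which the Jacobian norm has a closed form; lam is illustrative):

```python
import torch
import torch.nn as nn

class ContractiveEncoder(nn.Module):
    """Single sigmoid encoder layer h = sigmoid(Wx + b)."""
    def __init__(self, input_dim=784, latent_dim=32):
        super().__init__()
        self.lin = nn.Linear(input_dim, latent_dim)

    def forward(self, x):
        return torch.sigmoid(self.lin(x))

def contractive_penalty(encoder, h, lam=1e-3):
    # For h = sigmoid(Wx + b): dh_j/dx_i = h_j (1 - h_j) * W_ji, so
    # ||J||_F^2 = sum_j (h_j (1 - h_j))^2 * sum_i W_ji^2
    W = encoder.lin.weight                    # shape (latent_dim, input_dim)
    dh = (h * (1.0 - h)) ** 2                 # (batch, latent_dim)
    w_sq = W.pow(2).sum(dim=1)                # (latent_dim,)
    return lam * (dh * w_sq).sum(dim=1).mean()

# total loss = reconstruction loss + contractive_penalty(encoder, h)
```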

6
Q

VAE - Variational Auto Encoder

A

VAEs are trained to produce a latent mean and variance.

The encoder does not output a single value for each latent attribute; instead, it describes a probability distribution for each latent attribute.

Application examples:
- data compression
- synthetic data creation

When trained properly, the reconstruction reproduces the input well and the latent means/variances are close to a (standard) Gaussian.
VAEs interpolate to make the space in between plausible (otherwise it is just undefined, because we do not have any data for that region).
L1 penalty - V-shaped, pulls values exactly to 0
L2 penalty - U-shaped, pulls values close to 0

Audio data example
- Two data samples were given as the “endpoints” of the Gaussians; decoding points between them gives the interpolation between the two endpoints
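
A minimal sketch (assuming PyTorch; single linear layers and the loss weighting are kept deliberately simple):

```python
import torch
import torch.nn as nn

class VAE(nn.Module):
    """Minimal VAE sketch: the encoder outputs a mean and a log-variance per latent."""
    def __init__(self, input_dim=784, latent_dim=16):
        super().__init__()
        self.enc = nn.Linear(input_dim, 2 * latent_dim)   # -> [mu, logvar]
        self.dec = nn.Linear(latent_dim, input_dim)

    def forward(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=1)
        # Reparameterization trick: sample z from N(mu, sigma^2) differentiably
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        return self.dec(z), mu, logvar

def vae_loss(x, r, mu, logvar):
    recon = nn.functional.mse_loss(r, x, reduction="sum")
    # KL divergence to a standard Gaussian keeps the latent space well-behaved,
    # which is what makes interpolation between encoded points plausible.
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl
```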

7
Q

Convolutional Autoencoder

A

Convolutional AEs (we might also want to use transformers instead of convolutions, but we'll get to that another time).

CNNs are a good choice for autoencoding images.

To be able to decode (upsample), transposed convolution (deconvolution) and unpooling are used.
The upscaling effectively works via large (zero) padding around the input values; since the kernels are learnable, it does not just add random pixels at the border.
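
A minimal sketch (assuming PyTorch and 28x28 grayscale inputs; layer sizes are illustrative):

```python
import torch
import torch.nn as nn

# Convolutional encoder: strided convolutions downsample the image
encoder = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, stride=2, padding=1),   # 28x28 -> 14x14
    nn.ReLU(),
    nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1),  # 14x14 -> 7x7
    nn.ReLU(),
)

# Convolutional decoder: transposed convolutions upsample with learnable kernels
decoder = nn.Sequential(
    nn.ConvTranspose2d(32, 16, kernel_size=3, stride=2, padding=1, output_padding=1),  # 7x7 -> 14x14
    nn.ReLU(),
    nn.ConvTranspose2d(16, 1, kernel_size=3, stride=2, padding=1, output_padding=1),   # 14x14 -> 28x28
)

x = torch.rand(8, 1, 28, 28)
r = decoder(encoder(x))
assert r.shape == x.shape   # reconstruction has the same spatial size as the input
```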
