Quiz 5 - Module 4 Flashcards

1
Q

GANs involve __ density modeling

A

implicit

  • can generate samples from p(x) without writing down an explicit density
2
Q

GAN input

A
  • Generator
    • vector of random numbers z drawn from a simple distribution, e.g. normal(mu, sigma)
  • Discriminator
    • minibatch containing
      • fake images produced by the generator (samples from its implicit p(x))
      • real images from the training data
3
Q

GAN output

A
  • Discriminator
    • real or fake (probability that the input image is real)
  • Generator
    • a generated image, i.e. a sample from the implicit p(x)
4
Q

Generator role

A
  • update weights to improve realism of generated images
5
Q

Discriminator role

A
  • update weights to better discriminate between real and generated (fake) images
6
Q

Game theory problem for GANs

A
  • Mini-max Two Player Game
7
Q

GAN Objective

A
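
A standard way to write it (assuming the usual minimax formulation, in the notation used by the later cards):

min_G max_D E[log D(x)] + E[log (1 - D(G(z)))]

  • the discriminator D maximizes the objective, the generator G minimizes it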
8
Q

GAN Generator Objective

A
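
In the minimax form above (standard formulation assumed), the generator minimizes the only term it influences:

min_G E[log (1 - D(G(z)))]

  • i.e. push D(G(z)) toward 1 so the discriminator is fooled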
9
Q

GAN Discriminator Objective

A
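
In the same notation (standard formulation assumed), the discriminator maximizes:

max_D E[log D(x)] + E[log (1 - D(G(z)))]

  • i.e. push D(x) toward 1 on real images and D(G(z)) toward 0 on generated images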
10
Q

The ___ part of the GAN objective does not have good gradient properties

A

Generator

  • High gradient only when D(G(z)) is high (i.e. the discriminator is already being fooled)
  • We want strong gradients when samples are bad (discriminator is right), but there the gradient is nearly flat
11
Q

Alternate Objective for GAN Max-Max Game

A
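
A common non-saturating variant (assumed here, consistent with the gradient issue described in the previous card): instead of minimizing log (1 - D(G(z))), the generator maximizes log D(G(z)), so both players maximize their own objective:

max_D E[log D(x)] + E[log (1 - D(G(z)))]
max_G E[log D(G(z))]

  • this gives the generator strong gradients exactly when its samples are bad (D(G(z)) near 0)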
12
Q

GAN Drawbacks

A
  • No explicit model for distribution
  • training can be unstable
  • High-fidelity generation is computationally heavy to train
13
Q

VAEs involve __ density modeling

A

explicit

14
Q

VAE input

A
  • Encoder
    • Input is image X
  • Decoder
    • sample Z from simple distribution
15
Q

VAE Output

A
  • Encoder
    • Parameters of a probability distribution (Z)
      • mu and sigma
  • Decoder
    • Parameters of a probability distribution
      • Mu and sigma of Gaussian
      • For multi-dimensional version, output diagonal covariance
16
Q

VAE Optimization

A
  • Maximize the variational lower bound (ELBO), which has two parts (see the form below)
    • KL divergence
    • Reconstruction loss
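
One common way to write the ELBO for a single datapoint x (standard VAE formulation assumed):

log p(x) >= E_q(z|x)[log p(x|z)] - KL( q(z|x) || p(z) )

  • first term: reconstruction; second term: KL divergence from the prior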
17
Q

T/F: Variational AutoEncoders are differentiable

A

True - with caveat

  • The sampling operation is stochastic, so it is not differentiable
  • Use the reparameterization trick to move the stochastic sampling into a separate noise variable (epsilon) that backprop does not flow through (minimal sketch below)
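
A minimal sketch of the trick (illustrative PyTorch-style code; names are hypothetical):

```python
import torch

def reparameterize(mu, log_var):
    """Sample z ~ N(mu, sigma^2) in a differentiable way.

    The randomness lives only in epsilon, which is drawn outside the
    computation graph, so gradients flow through mu and log_var.
    """
    std = torch.exp(0.5 * log_var)   # sigma
    eps = torch.randn_like(std)      # epsilon ~ N(0, I)
    return mu + eps * std            # z = mu + sigma * epsilon
```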
18
Q

VAE Reconstruction Loss

A
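
Assuming the standard VAE setup, the reconstruction term is the expected log-likelihood of the input under the decoder:

E_q(z|x)[log p(x|z)]

  • with a Gaussian decoder this reduces to an MSE-style loss between x and its reconstruction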
19
Q

VAE Distribution Loss

A

The KL-divergence loss that penalizes the encoder's distribution q(z|x) for diverging from the standard normal prior (mu = 0, sigma = 1); closed form below
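
For a diagonal-Gaussian encoder and a standard normal prior (the usual assumption), the KL term has a closed form:

KL( q(z|x) || N(0, I) ) = -1/2 * sum_j ( 1 + log sigma_j^2 - mu_j^2 - sigma_j^2 )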

20
Q

GAN Discriminator wants ___ (minimize/maximize)

E[log D(x)] + E[log (1 - D(G(z)))]

A

maximize

  • Discriminator wants to output 0 for D(G(z)), indicating the generated image is fake (0) not real (1), and 1 for D(x) on real images
21
Q

GAN Generator wants _____ (minimize/maximize)

E[log D(x)] + E[log (1 - D(G(z)))]

A

minimize

  • The generator wants the discriminator to be wrong
  • i.e. wants the discriminator to output D(G(z)) = 1 (classify the fake as real)
22
Q

The ___ part of the GAN objective does not have good gradient properties

A

Generator

  • High gradient only when D(G(z)) is high (discriminator is wrong, i.e. already fooled)
  • We want it to improve when samples are bad (discriminator is right), but the gradient there is nearly flat
23
Q

Semi-supervised learning data type

A
  • Small amount of labeled data
  • Larger amount of unlabeled data
24
Q

Different ideas for training in semi-supervised environment

A
  • simple idea
    • learn a model on the small labeled dataset
    • make predictions on unlabeled data, add them as new training data, repeat
  • co-training
    • predictions across multiple views of the data
25
Q

FixMatch (pseudo-labeling)

A
  • Unlabeled data example
    • Weakly augment
      • Make prediction, generate pseudo-label
      • throw out cases below threshold
    • Strongly augment
      • Make prediction, use pseudo-label as ground truth
26
Q

Pseudo-labeling (in practice)

A
  • Labeled examples (feed directly to model)
  • Unlabeled examples
    • each example gets both
      • a weak augmentation
      • a strong augmentation
  • Losses
    • Cross-entropy on labeled data
    • Cross-entropy on strongly augmented unlabeled data, using the prediction on the weakly augmented view as the pseudo-label ground truth (minimal sketch below)
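
A minimal sketch of the unlabeled-data loss (illustrative code, not the exact course implementation; model, weak_aug, strong_aug, and threshold are assumed names):

```python
import torch
import torch.nn.functional as F

def pseudo_label_loss(model, unlabeled_x, weak_aug, strong_aug, threshold=0.95):
    # Pseudo-label from the weakly augmented view (no gradient through it).
    with torch.no_grad():
        weak_probs = F.softmax(model(weak_aug(unlabeled_x)), dim=1)
        confidence, pseudo_labels = weak_probs.max(dim=1)
        mask = (confidence >= threshold).float()   # keep only confident pseudo-labels

    # Cross-entropy on the strongly augmented view against the pseudo-labels.
    strong_logits = model(strong_aug(unlabeled_x))
    per_example = F.cross_entropy(strong_logits, pseudo_labels, reduction="none")
    return (per_example * mask).mean()
```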
27
Q

Label propagation

A

Learn a feature extractor and apply it to the unlabeled data; unlabeled examples get labeled like the labeled data in their cluster or neighborhood in feature space (similar to KNN)

28
Q

Few shot learning data

A
  • Base set of data
    • lots of labels
  • New set
    • very few labels (1 - 5 examples per category) in new categories
  • transfer learning
29
Q

Approaches to few shot learning

A
  • Fine-tuning
    • train classifier on base classes
    • freeze feature extractor
    • learn classifier for new classes (during “query” time)
  • Simulate (N-Way K-Shot Tasks)
    • Meta-training
    • Better at making train reflect what will happen during test
30
Q

Classifier useful in the few-shot fine-tuning case

A
  • cosine (similarity-based)
    • instead of linear layer
    • unit-norm comparison: cos(A, B) = (A · B) / (||A|| ||B||)
  • normalized (unit-norm) comparison may discriminate a small number of classes better since it focuses on an angular difference
31
Q

Meta-Training

A
  • useful for few-shot learning
  • makes training better reflect test (simulate smaller tasks)
  • N-Way K-Shot Tasks
    • N - number of categories
    • K - examples per category
  • Can pre-train features on held-out base classes
32
Q

Meta-Learner methods

A
  • Meta-Learner LSTM
    • want to learn gradient descent
      • update rules
      • param initialization
      • adaptive LR, weight decay to reduce overfit
    • gradient descent update looks like LSTM update
  • Model-agnostic meta-learning (MAML)
    • wants to learn only the parameter initialization
    • adaptation uses normal gradient descent (minimal sketch below)
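
A minimal sketch of one MAML meta-update (illustrative; a single inner step, and forward_with_params is an assumed helper that runs the model with explicit parameters):

```python
import torch

def maml_step(model, loss_fn, support, query, inner_lr, meta_optimizer):
    """Adapt on the support set, then update the initialization on the query set."""
    x_s, y_s = support
    x_q, y_q = query

    # Inner loop: one gradient step starting from the shared initialization.
    inner_loss = loss_fn(model(x_s), y_s)
    grads = torch.autograd.grad(inner_loss, list(model.parameters()), create_graph=True)
    adapted = [p - inner_lr * g for p, g in zip(model.parameters(), grads)]

    # Outer loop: query loss of the adapted parameters, backpropagated
    # through the inner step into the original initialization.
    outer_loss = loss_fn(model.forward_with_params(x_q, adapted), y_q)  # assumed helper
    meta_optimizer.zero_grad()
    outer_loss.backward()
    meta_optimizer.step()
    return outer_loss.item()
```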
33
Q

self-supervised data

A
  • no labels at all
34
Q

Autoencoders

A
  • Low dimensional embedding between an encoder and a decoder
  • Loss
    • minimize the difference between the input and its reconstruction (e.g. MSE)
35
Q

Surrogate tasks for self-supervised learning

A
  • reconstruction
  • rotate image
  • colorization
  • relative image patch location (jigsaw)
  • video: next frame prediction
  • instance prediction
36
Q

Colorization

A
  • self-supervised task
  • input
    • grayscale
  • output
    • color
  • loss
    • MSE
37
Q

jigsaw puzzle

A
  • self-supervised task
  • input: image patches
  • output: prediction of the discrete image patch location relative to the center patch
  • loss: cross-entropy classification (which position)
38
Q

rotation prediction

A
  • input: image with various rotations
  • output: predicted rotation amount
  • objective: cross-entropy classification
39
Q

Evaluation of self-supervised learning

A
  • train the model with the surrogate task
  • extract the ConvNet (encoder part)
  • transfer to the actual task
    • use it to initialize the model of another supervised learning task
    • use it to extract features for learning a separate classifier (NN, SVM)
    • often the classifier is limited to a linear layer and the features are frozen
40
Q

Instance discrimination

A
  • Positive example
    • 2 augmentations of the same image
  • Negative examples
    • augmentations of other images
  • Feed positive and negative examples through the CNN
  • Loss
    • contrastive loss (minimal sketch below)
    • dot product (similarity) between augmentation 1 and the positive and negative examples
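
A minimal sketch of the contrastive (InfoNCE-style) loss for one anchor (illustrative; embeddings assumed L2-normalized, temperature is an assumed hyperparameter):

```python
import torch
import torch.nn.functional as F

def contrastive_loss(anchor, positive, negatives, temperature=0.07):
    """anchor, positive: (d,) embeddings; negatives: (N, d) embeddings.

    Dot products act as similarities; the loss is a cross-entropy in which
    the positive must be picked out of the 1 + N candidates.
    """
    pos_sim = torch.dot(anchor, positive).unsqueeze(0)    # (1,)
    neg_sim = negatives @ anchor                          # (N,)
    logits = torch.cat([pos_sim, neg_sim]) / temperature  # (1 + N,)
    target = torch.zeros(1, dtype=torch.long)             # positive sits at index 0
    return F.cross_entropy(logits.unsqueeze(0), target)
```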
41
Q

Contrastive loss types

A
  • end-to-end
    • use all other examples in the mini-batch as negatives
  • memory bank
    • store negatives across iterations in a queue
      • from previous mini-batches
    • no extra feature extraction needed (features are stored)
  • momentum encoder
    • exponential moving average of the encoder weights
    • helps avoid the stale-weights issue of the memory bank
42
Q

Reinforcement Learning

A

Sequential decision making in an environment with evaluative feedback

43
Q

Signature challenges in reinforcement learning

A
  • evaluative feedback
    • need trial/error to find the right action
  • delayed feedback
    • actions may not lead to immediate reward
  • non-stationary
    • data distribution of visited states changes when the policy changes
  • fleeting nature of time and online data
44
Q

Markov decision process

A

(S, A, R, T, gamma)

  • S: states
  • A: actions
  • R(s, a, s'): distribution of rewards
  • T(s, a, s'): transition probability
  • gamma: discount factor
45
Q

Markov property

A

Current state completely characterizes state of the environment. Assume most recent observation is a sufficient statistic of history

46
Q

What do we assume is unknown about an MDP in RL?

A
  • Transition probability distribution
  • Reward distribution
47
Q

Value Iteration

A
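
Assuming the standard formulation, value iteration repeatedly applies the Bellman optimality backup to every state until convergence:

V_{k+1}(s) = max_a sum_{s'} T(s, a, s') [ R(s, a, s') + gamma * V_k(s') ]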
48
Q

Bellman Optimality Equation (value)

A
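
In the same notation (standard form assumed), the optimal value function is the fixed point of:

V*(s) = max_a sum_{s'} T(s, a, s') [ R(s, a, s') + gamma * V*(s') ]

  • value iteration (previous card) is the fixed-point iteration for this equation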
49
Q

Q-Iteration is the same as value iteration except ___

A

it loops over actions as well as states

50
Q

Parts of policy iteration

A
  • Policy Evaluation
    • Compute V_pi (similar to value iteration)
  • Policy Refinement
    • Greedily change actions as per V_pi at next steps
51
Q

Why choose policy iteration over value iteration?

A

The policy pi often converges to the optimal policy pi* much sooner than V converges to V*

52
Q

Deep Q-Learning

A
  • Parameterized Q-function learned from data {(s, a, s', r)} for N data points
  • Linear function approximator
    • Q(s, a; w, b) = w_a^T s + b_a
  • Loss
    • MSE
    • ( Q_new(s, a) - (r + gamma * max_a Q_old(s', a)) )^2
    • Q_new: predicted Q-value
    • Q_old: target Q-value
  • For stability (minimal sketch below)
    • Freeze Q_old and update only Q_new's parameters
    • Set Q_old to Q_new at regular intervals
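
A minimal sketch of the loss with a frozen target network (illustrative; q_new and q_old are assumed to be the two copies of the Q-network, and terminal-state masking is omitted):

```python
import torch
import torch.nn.functional as F

def dqn_loss(q_new, q_old, states, actions, rewards, next_states, gamma=0.99):
    # Predicted Q-values for the actions actually taken.
    q_pred = q_new(states).gather(1, actions.unsqueeze(1)).squeeze(1)

    # Target uses the frozen network; no gradients flow through it.
    with torch.no_grad():
        q_target = rewards + gamma * q_old(next_states).max(dim=1).values

    return F.mse_loss(q_pred, q_target)
```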
53
Q

Deep Q-Learning - Correlated Data Problem

A
  • Samples are correlated -> high-variance gradients -> inefficient learning
  • Current Q-network parameters determine the next training sample -> can lead to bad feedback loops
  • Resolution?
    • replay buffer that stores transitions
    • update the replay buffer as game (experience) episodes are played, discarding older samples
    • train the Q-network on random minibatches of transitions from the replay memory, instead of consecutive samples
    • the larger the buffer, the lower the correlation
54
Q

What are the key steps of the Deep Q-Learning Algorithm

A
  • Epsilon greedy action selection
    • select random action with probability epsilon or max Q* action
  • Experience replay
    • store transition in replay buffer
    • sample random minibatch of transitions from buffer
  • Q update
    • perform gradient descent
55
Q

Derive the policy gradient

A
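
A sketch of the usual derivation (REINFORCE / log-derivative trick, standard notation assumed), with J(theta) = E_{tau ~ p_theta}[R(tau)]:

grad_theta J = grad_theta integral p_theta(tau) R(tau) dtau
             = integral p_theta(tau) grad_theta log p_theta(tau) R(tau) dtau   (since grad p = p * grad log p)
             = E_tau[ grad_theta log p_theta(tau) * R(tau) ]
             = E_tau[ ( sum_t grad_theta log pi_theta(a_t | s_t) ) * R(tau) ]

  • the transition probabilities T(s, a, s') drop out of grad_theta log p_theta(tau) because they do not depend on theta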