Impact of depth and design choice Flashcards

1
Q

What does the Universal approximation theorem say?

A

A single hidden layer neural network with any “squashing” activation function and with a linear output unit can approximate any continuous function arbitrarily well, given enough hidden units.

In the worst case, an exponential number of hidden units may be required, possibly one hidden unit for each input configuration that needs to be distinguished (on the order of 2^n parameters for n binary inputs).
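
The "enough hidden units" idea can be made concrete with a hand-built sketch: one steep sigmoid hidden unit plus a linear output unit already approximates a step function. All weights below are hand-picked for illustration, not trained:

```python
import math

def sigmoid(z):
    # "squashing" activation: maps any real z into (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

def one_hidden_layer(x, W, b, v, c):
    # a hidden layer of sigmoid units, followed by a linear output unit
    h = [sigmoid(w * x + b_i) for w, b_i in zip(W, b)]
    return sum(v_i * h_i for v_i, h_i in zip(v, h)) + c

# one very steep sigmoid unit approximates a step function at x = 0.5
step = lambda x: one_hidden_layer(x, W=[100.0], b=[-50.0], v=[1.0], c=0.0)
```

Summing many such shifted "steps" is how the network builds up an arbitrary continuous function, which is why the worst case can need so many units.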

2
Q

What is needed for supervised training?

A
  • labeled training set
  • vector of model parameters
  • loss function L(fθ(x), y)
  • Training = find θ that minimizes the total loss on the training set
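The bullet points above can be sketched as a minimal training loop, assuming a toy labeled set, a linear model fθ(x) = θx, and squared loss (all hypothetical choices for illustration):

```python
# labeled training set of (x, y) pairs; the true relationship here is y = 2x
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

theta = 0.0   # vector of model parameters (a single scalar in this sketch)
lr = 0.05     # learning rate

# training = gradient descent on the mean squared loss L(f_theta(x), y)
for _ in range(200):
    grad = sum(2 * (theta * x - y) * x for x, y in data) / len(data)
    theta -= lr * grad
```

After training, theta has converged very close to the true value 2.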
3
Q

What is needed for unsupervised training?

A
  • unlabeled training set
  • vector of model parameters
  • loss function L(fθ(x))
  • Training = find θ that minimizes the total loss
4
Q

What is a detection task? What output activation function is used?

Give an example

A

Only two possible classes, 0 and 1: the thing is either detected or not.
Sigmoid

A CAPTCHA (e.g., deciding whether a traffic light is present in an image).
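
A minimal sketch of a detection head, assuming the rest of the network has already produced a real-valued logit:

```python
import math

def detect(logit, threshold=0.5):
    # sigmoid squashes the logit into a probability in (0, 1);
    # the detection is positive when that probability clears the threshold
    p = 1.0 / (1.0 + math.exp(-logit))
    return p, p >= threshold
```

For example, detect(3.0) yields a high probability (positive detection), while detect(-3.0) yields a low one (nothing detected).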

5
Q

What is a classification task? What output activation function is used?

Give an example

A

Three or more possible classes; otherwise similar to detection.
Softmax activation function

What kind of animal is in the picture: leopard, Egyptian cat, jaguar, ..
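
A sketch of the softmax output, with hypothetical logits for the three animal classes above:

```python
import math

def softmax(logits):
    # subtract the max logit for numerical stability; the outputs are
    # positive and sum to 1, so they can be read as class probabilities
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

classes = ["leopard", "Egyptian cat", "jaguar"]
probs = softmax([2.0, 0.5, 1.0])               # hypothetical logits
prediction = classes[probs.index(max(probs))]  # class with highest probability
```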

6
Q

What is a regression task? What output activation function is used?

Give an example

A

The model is trained to learn the relationship between input variables and a continuous target variable.
Linear activation

Estimating a house's price from its location, size, etc.
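
A sketch of a regression head: the linear (identity) output activation just returns the weighted sum, so the network can output any real value. The features and weights below are hypothetical:

```python
def linear_output(features, weights, bias):
    # linear/identity activation: no squashing, the output is unbounded
    return sum(w * f for w, f in zip(weights, features)) + bias

# hypothetical trained weights for price as a function of (size in m^2, rooms)
price = linear_output([80.0, 3.0], weights=[2000.0, 15000.0], bias=10000.0)
```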

7
Q

Define what an autoencoder is

A

A neural network trained to predict (reconstruct) its own input.
It is unsupervised learning.

8
Q

How does the autoencoder work?

A

Consists of two parts:
* an encoder function h = f(x)
* a decoder function x̂ = r(h), such that x̂ ≈ x
The hidden activations h provide a nonlinear representation of the input, called an embedding.
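
A minimal sketch with hand-picked linear maps (a real autoencoder would learn f and r by minimizing the reconstruction error between x̂ and x):

```python
def f(x):
    # encoder: embed the 2-D input into a 1-D code h
    return 0.5 * (x[0] + x[1])

def r(h):
    # decoder: map the embedding back to input space
    return [h, h]

x = [3.0, 3.0]
h = f(x)        # the embedding
x_hat = r(h)    # the reconstruction; here x_hat ≈ x for inputs near the diagonal
```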

9
Q

Define what an undercomplete autoencoder is

A

An autoencoder whose embedding h
has fewer dimensions than the input x.

10
Q

What is the benefit of a denoising autoencoder?

A

It forces the autoencoder to learn to undo the corruption, and therefore to learn salient features of the data.
It also makes it possible to use embeddings h whose dimension is bigger than that of the data x, without the network simply copying its input.
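
The corruption step can be sketched as follows (Gaussian noise is one common, assumed choice of corruption):

```python
import random

def corrupt(x, noise=0.5, seed=0):
    # add Gaussian noise to each component of the input
    rng = random.Random(seed)
    return [xi + rng.gauss(0.0, noise) for xi in x]

x = [1.0, 2.0, 3.0]    # clean input
x_noisy = corrupt(x)   # corrupted copy that is fed to the encoder
# denoising objective: minimize L(r(f(x_noisy)), x), i.e. reconstruct the clean x
```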

11
Q

Explain synthetic data generation

A

Learn the data distribution p(x) and draw samples from it.
For discrete data, treat generation as a series of classification tasks:
p(x) = p(x1) × p(x2 | x1) × · · · × p(xn | x1, . . . , xn−1).
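
The chain-rule factorization can be sketched with a hypothetical two-symbol model: draw x1 from p(x1), then x2 from p(x2 | x1):

```python
import random

# hypothetical probability tables over the alphabet {0, 1}
p_x1 = {0: 0.5, 1: 0.5}
p_x2_given_x1 = {0: {0: 0.9, 1: 0.1},
                 1: {0: 0.1, 1: 0.9}}

def sample(seed=0):
    # one classification-style draw per position, conditioned on the prefix
    rng = random.Random(seed)
    x1 = 0 if rng.random() < p_x1[0] else 1
    x2 = 0 if rng.random() < p_x2_given_x1[x1][0] else 1
    return (x1, x2)

# the factored joint p(x1, x2) = p(x1) * p(x2 | x1) is a valid distribution
joint = {(a, b): p_x1[a] * p_x2_given_x1[a][b]
         for a in (0, 1) for b in (0, 1)}
```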

12
Q

What can you use Synthetic data generation for?

A
  • Language modeling and text generation
  • Augment existing data
  • Testing and validation