Week 5 Flashcards
What is a deep NN? What is shallow NN?
Deep is > 3, typically ≫ 3 layers
Shallow is ≤ 3 layers
Why question the use of deep NNs? Why should we use them?
A shallow NN can approximate any continuous nonlinear function arbitrarily well (the universal approximation theorem)
Depth helps capture complexity more efficiently
Unnecessarily confusing example of RNN
The output of the last hidden layer feeds back into the input of the first hidden layer; in this example there is only one hidden layer, so it feeds into itself
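A minimal sketch of that self-loop (hypothetical weight matrices: W_xh maps input to hidden, W_hh maps the hidden layer back into itself):

```python
import numpy as np

def rnn_step(x, h_prev, W_xh, W_hh, b):
    """One recurrent step: the hidden layer's previous output h_prev
    feeds back into the same hidden layer alongside the new input x."""
    return np.tanh(x @ W_xh + h_prev @ W_hh + b)
```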
Difficulties with backprop
Vanishing and exploding gradients
Mitigate vanishing gradients
Use activation functions with non-vanishing gradients (e.g. ReLU)
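A quick sketch of why this helps: the sigmoid's derivative shrinks towards zero for large inputs, while ReLU's stays at 1 wherever the unit is active:

```python
import numpy as np

x = np.array([-4.0, 0.0, 4.0])

sig = 1 / (1 + np.exp(-x))
print(sig * (1 - sig))        # sigmoid derivative: ~[0.018, 0.25, 0.018] -- vanishes for large |x|
print((x > 0).astype(float))  # ReLU derivative: [0., 0., 1.] -- stays 1 where the unit fires
```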
Mitigate exploding gradients
Batch normalisation
Activation functions with non-vanishing derivatives
Better ways to initialise weights
How does backprop struggle with gradients of the cost function?
Variation in magnitude of gradient may occur between:
- different layers
- different parts of the cost function for a single neuron
- different directions of a multi-dimensional cost function
Momentum in backprop? What is it useful for?
Adds a moving average of previous gradients to the current gradient (helps cross plateaus and escape shallow local minima)
Movement = negative of gradient + momentum
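A minimal sketch of one such update, assuming grad is the current gradient and velocity the running average:

```python
import numpy as np

def momentum_step(theta, velocity, grad, lr=0.01, beta=0.9):
    """One gradient step with momentum: velocity is an exponential
    moving average of past (negative) gradient steps."""
    velocity = beta * velocity - lr * grad  # momentum + negative gradient
    return theta + velocity, velocity       # movement = momentum + gradient step
```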
Adaptive learning rate and examples
Vary learning rate (for individual parameters) during training:
- increasing learning rate if cost is decreasing
- decreasing learning rate if cost is increasing
E.g. AdaGrad or RMSProp
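A sketch of one RMSProp-style update (hypothetical parameter names; AdaGrad is similar but accumulates squared gradients without the decay factor rho):

```python
import numpy as np

def rmsprop_step(theta, sq_avg, grad, lr=0.001, rho=0.9, eps=1e-8):
    """One RMSProp step: each parameter's step is divided by a running
    average of its squared gradients, so parameters with consistently
    large gradients get a smaller effective learning rate."""
    sq_avg = rho * sq_avg + (1 - rho) * grad**2
    return theta - lr * grad / (np.sqrt(sq_avg) + eps), sq_avg
```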
Examples of backprop algorithms with both adaptive learning rate and momentum
Adam
Nadam
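A sketch of one Adam step, combining the momentum and per-parameter scaling ideas above (Adam also adds bias correction for its zero-initialised averages):

```python
import numpy as np

def adam_step(theta, m, v, grad, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam step (t is the 1-based step count): momentum via the
    first-moment average m, an adaptive per-parameter rate via the
    second-moment average v, plus bias correction because m and v
    start at zero."""
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad**2
    m_hat = m / (1 - beta1**t)
    v_hat = v / (1 - beta2**t)
    return theta - lr * m_hat / (np.sqrt(v_hat) + eps), m, v
```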
Motivate batch normalisation
Different inputs to a single neuron can have very different scales (depending on which neuron each input comes from)
Batch normalisation
Normalise each neuron's inputs over the mini-batch to zero mean and unit variance, then rescale and shift with learned parameters γ and β
Benefit of batch normalisation
Keeps the inputs to each layer on a consistent scale, which stabilises and speeds up training and helps against vanishing/exploding gradients
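A minimal batch-norm sketch for a (batch_size, num_features) input, assuming learned per-feature gamma and beta:

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalise each feature of x (shape: batch_size x num_features)
    to zero mean and unit variance over the mini-batch, then rescale
    and shift with the learned per-feature gamma and beta."""
    x_hat = (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)
    return gamma * x_hat + beta
```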
Skip connections
Connections that bypass one or more layers, adding a layer's input directly to its output; they give gradients a shorter path back and so mitigate vanishing gradients
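A one-line sketch of the idea, assuming layer is some transformation:

```python
def residual_block(x, layer):
    """Skip connection: x bypasses the layer and is added to its
    output, giving gradients a direct path around the layer."""
    return x + layer(x)
```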
Define CNN and motivate
Any NN in which at least one layer has a transfer function implemented using convolution/cross-correlation
Motivated by the desire to recognise patterns with tolerance to where they appear
Weight sharing in CNNs, how?
All neurons in one feature map use the same mask (kernel) weights, each applied at a different position of the input, so far fewer parameters are needed and the same pattern is detected anywhere
Transfer function for CNN
The layer's input is cross-correlated (or convolved) with a mask of shared weights, plus a bias, rather than fully connected to every input
Output of one neuron of a CNN layer
An array of numbers, sometimes called a feature map
Cross correlation formula
(I ⋆ K)(i, j) = Σ_m Σ_n I(i + m, j + n) · K(m, n)
Convolution formula (the operation)
(I ∗ K)(i, j) = Σ_m Σ_n I(i − m, j − n) · K(m, n), i.e. cross correlation with the mask flipped
Mask for CNN
The small array of shared weights (also called a kernel or filter) that is slid across the input to produce the feature map
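A sketch tying these together: the feature map from sliding an (unflipped) mask over an image, with convolution as the mask-flipped variant:

```python
import numpy as np

def cross_correlate2d(image, mask):
    """'Valid' 2-D cross-correlation: slide the unflipped mask over
    the image, summing elementwise products at each position.
    The result is the feature map."""
    H, W = image.shape
    h, w = mask.shape
    out = np.zeros((H - h + 1, W - w + 1))
    for i in range(H - h + 1):
        for j in range(W - w + 1):
            out[i, j] = np.sum(image[i:i+h, j:j+w] * mask)
    return out

def convolve2d(image, mask):
    """Convolution = cross-correlation with the mask flipped on both axes."""
    return cross_correlate2d(image, mask[::-1, ::-1])
```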