Week 6 - Deep Learning & CNNs Flashcards
(17 cards)
What are the differences between MLPs and CNNs?
MLP:
- Wide application scenario, not just images
- Neurons are fully connected, so they don't scale well to large inputs such as images
CNN:
- Neurons are arranged in ‘3D’, each neuron is only connected to a small region of a previous layer
- Typical CNN structure: Input-Convolution Layer-Activation Layer-Pool Layer-Fully Connected layer-Output Layer
What are the four parameters in a Convolution Layer?
- Input size (W)
- Filter Size (F)
- Zero Padding (P)
- Stride (S)
What is the equation used to calculate the output volume size from a Convolution Layer?
(W - F + 2P) / S + 1
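The formula can be checked with a small helper (a minimal sketch; the function name is illustrative, and integer division is used since output sizes must be whole numbers):

```python
def conv_output_size(W, F, P, S):
    """Output spatial size of a convolution layer: (W - F + 2P) / S + 1."""
    return (W - F + 2 * P) // S + 1

# A 32x32 input with a 5x5 filter, padding 2, stride 1 keeps the size at 32.
print(conv_output_size(32, 5, 2, 1))  # -> 32
# A 7x7 input with a 3x3 filter, no padding, stride 2 gives a 3x3 output.
print(conv_output_size(7, 3, 0, 2))   # -> 3
```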
What is the primary aim of a Convolution Layer?
Convolution layers are filter banks performing convolutions with the learned kernels/masks.
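A minimal pure-Python sketch of what one filter in such a bank computes (in practice CNN "convolution" layers compute a cross-correlation, as here; lists stand in for tensors):

```python
def conv2d(image, kernel):
    """Slide a kernel over an image ('valid' mode, no padding, stride 1)."""
    H, W = len(image), len(image[0])
    F = len(kernel)
    out = []
    for i in range(H - F + 1):
        row = []
        for j in range(W - F + 1):
            # Weighted sum of the FxF region under the kernel
            row.append(sum(image[i + a][j + b] * kernel[a][b]
                           for a in range(F) for b in range(F)))
        out.append(row)
    return out

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 1],
          [1, 1]]          # a simple 2x2 summing filter
print(conv2d(image, kernel))  # -> [[12, 16], [24, 28]]
```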
What is the main purpose of a Pooling layer?
A Pooling layer pools nearby responses within each feature map, which reduces the resolution of the filter outputs. Subsequent convolution layers therefore access larger areas of the image.
What is the effect of the Pooling Layer?
It reduces the spatial size of the representation and reduces the amount of parameters, effectively down-sampling the input to increase the receptive field size.
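A minimal sketch of 2x2 max pooling, the most common variant (pure Python, lists standing in for feature maps):

```python
def max_pool2d(x, size=2, stride=2):
    """Keep only the largest value in each size x size window."""
    H, W = len(x), len(x[0])
    out = []
    for i in range(0, H - size + 1, stride):
        row = []
        for j in range(0, W - size + 1, stride):
            row.append(max(x[i + a][j + b]
                           for a in range(size) for b in range(size)))
        out.append(row)
    return out

fmap = [[1, 2, 5, 6],
        [3, 4, 7, 8],
        [9, 10, 13, 14],
        [11, 12, 15, 16]]
print(max_pool2d(fmap))  # -> [[4, 8], [12, 16]]  (4x4 down-sampled to 2x2)
```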
What are the four types of activation functions?
- Sigmoid
- Tanh
- ReLU
- Leaky ReLU
What are the properties of Sigmoid?
- Range of values is [0, 1]
- Saturates and kills gradients, with near-zero gradients where the output approaches 0 or 1
What are the properties of Tanh?
- Has a zero-centred range of [-1, 1]
- Saturates and kills gradients, with near-zero gradients where the output approaches -1 or 1
What are the properties of ReLU?
- Mitigates the vanishing gradient problem (the gradient is 1 for positive inputs)
- Simple to calculate
- Some neurons can be ‘dead’ with negative input
What are the properties of Leaky ReLU?
- Overcomes the dying neuron problem
- Performance is not consistent
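The four activation functions above can be sketched in a few lines of pure Python (scalar versions for clarity; the `alpha` default of 0.01 is a common but not universal choice):

```python
import math

def sigmoid(x):
    # Range (0, 1); gradient vanishes for large |x|
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    # Zero-centred range (-1, 1); also saturates for large |x|
    return math.tanh(x)

def relu(x):
    # max(0, x): cheap, but gives zero gradient for negative inputs ("dead" neurons)
    return max(0.0, x)

def leaky_relu(x, alpha=0.01):
    # Small slope alpha for negative inputs avoids the dying-neuron problem
    return x if x > 0 else alpha * x

print(sigmoid(0.0))      # -> 0.5
print(relu(-3.0))        # -> 0.0
print(leaky_relu(-3.0))  # small negative value instead of 0
```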
What is the function of the Fully Connected Layer with Softmax?
The Fully Connected Layer is connected to all activations in the previous layer and produces a raw score for each class
The Softmax function then converts these scores into the range [0, 1] for each class
How does a Softmax function work?
A softmax function takes a vector of k real-valued scores as input and outputs a probability distribution over the k classes, giving each class a probability in the range [0, 1]
What is a key property of the Softmax function?
The probabilities produced all sum to 1 across the classes. Example:
Class 1 = 0.4, Classes 2 and 3 = 0.3 each
Therefore 0.4 + 0.3 + 0.3 = 1
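A minimal sketch of softmax in pure Python (the max-subtraction step is a standard numerical-stability trick; the input scores are illustrative):

```python
import math

def softmax(scores):
    """Map raw scores to probabilities in [0, 1] that sum to 1."""
    m = max(scores)                          # subtract the max for stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
print(probs)       # three probabilities, largest for the largest score
print(sum(probs))  # sums to 1 (up to floating-point rounding)
```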
What is the typical architecture for CNNs in Classification problems?
Typical block:
- Normalisation Layer
- Filter Bank
- Non-linear
- Feature Pooling
After multiple blocks of the above:
- Classifier
What makes a CNN effective?
- Increased depth
- Smaller filters at the lower levels
- More convolutional than fully connected layers
What is a hierarchical model in terms of CNNs?
A hierarchical model is a CNN in which each stage builds on the features learned by the previous one, moving from fine, local detail to larger, more abstract structure. Example:
The first stage responds to pixel-level details, the second stage to edges, the third stage to object parts i.e. combinations of edges, and the final stage to whole object models