Lecture 12 - Deep Learning Flashcards
(34 cards)
What are the three layers in CNNs?
Input layer
Hidden layer
Output layer
Do we need all the edges to be connected?
No. A small region can be represented with few parameters, and that information can be shared, e.g. the beak example
What is a CNN?
A neural network with some convolutional layers.
What is a Convolutional Layer?
- A convolutional layer has a number of filters that perform convolution operations
- The filter values are the parameters to be learned
- Here the filters are 3x3, each with a specific pattern
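A minimal sketch of one filter sliding over an image, assuming stride 1 and no padding. The 6x6 image and the diagonal 3x3 filter are hypothetical values chosen for illustration; in a real CNN the filter values are learned.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (stride 1, no padding).

    Each output value is the sum of element-wise products between the
    kernel and the image patch underneath it.
    """
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            for di in range(k):
                for dj in range(k):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        feature_map.append(row)
    return feature_map


# Hypothetical 6x6 binary image containing diagonal strokes
image = [
    [1, 0, 0, 0, 0, 1],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 1, 0, 0],
    [1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 0, 1, 0],
]

# Hypothetical 3x3 filter that responds to a top-left to bottom-right diagonal
diagonal_filter = [
    [1, -1, -1],
    [-1, 1, -1],
    [-1, -1, 1],
]

feature_map = convolve2d(image, diagonal_filter)  # 4x4 feature map
```

The largest values in the feature map sit where the image patch matches the filter's pattern exactly, which is how a filter "detects" its pattern.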
How does the convolutional layer (filter) work?
REFER TO SLIDES
What is the understanding of convolution using a line?
A filter whose pattern is a line gives its maximum value where that line appears in the image, so it detects the line
How does convolution extract and learn features?
REFER TO SLIDES
When is padding necessary in convolution?
Padding is used when you want the output to be the same size as the input. For example, a 6x6 matrix convolved with a 3x3 filter gives a 4x4 output, so to get a 6x6 output we pad the original matrix first and then apply the same procedure
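A small sketch of the size arithmetic, assuming stride 1, a 3x3 filter and zero padding (zero is a common choice, not necessarily the one on the slides). The helper names `zero_pad` and `output_size` are made up for this example.

```python
def zero_pad(image, pad):
    """Surround the image with `pad` rows and columns of zeros on every side."""
    width = len(image[0]) + 2 * pad
    padded = [[0] * width for _ in range(pad)]
    for row in image:
        padded.append([0] * pad + list(row) + [0] * pad)
    padded.extend([0] * width for _ in range(pad))
    return padded


def output_size(input_size, kernel_size, pad=0):
    """Side length of the feature map for a stride-1 convolution."""
    return input_size + 2 * pad - kernel_size + 1


image = [[1] * 6 for _ in range(6)]   # hypothetical 6x6 input
padded = zero_pad(image, 1)           # becomes 8x8

no_padding = output_size(6, 3)            # 6 - 3 + 1 = 4
with_padding = output_size(6, 3, pad=1)   # 6 + 2 - 3 + 1 = 6
```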
How does convolution work for RGB images?
It’s the same for colour images, except the image has 3 layers (channels), one per colour, and each filter is applied 3 times (once per colour)
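A sketch of the colour case, assuming the usual formulation in which the filter has one 3x3 slice per colour channel and the three per-channel results are summed into a single feature map. The image and filter values are hypothetical.

```python
def convolve_rgb(image, kernel):
    """image and kernel are indexed as [channel][row][col].

    Each channel of the image is convolved with the matching slice of the
    kernel, and the per-channel results are added together.
    """
    channels = len(image)
    k = len(kernel[0])
    out_h = len(image[0]) - k + 1
    out_w = len(image[0][0]) - k + 1
    feature_map = [[0] * out_w for _ in range(out_h)]
    for c in range(channels):
        for i in range(out_h):
            for j in range(out_w):
                for di in range(k):
                    for dj in range(k):
                        feature_map[i][j] += image[c][i + di][j + dj] * kernel[c][di][dj]
    return feature_map


# Hypothetical 4x4 RGB image: red channel all 1s, green all 2s, blue all 3s
image = [[[value] * 4 for _ in range(4)] for value in (1, 2, 3)]

# Hypothetical 3x3x3 filter of all ones
kernel = [[[1] * 3 for _ in range(3)] for _ in range(3)]

feature_map = convolve_rgb(image, kernel)  # 2x2, each entry 9*1 + 9*2 + 9*3 = 54
```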
What is the difference between Convolution and Fully Connected Layer?
- Convolution: we don’t take all the values; each neuron only focuses on a small area
- Fully connected layer: all the features are connected to each neuron in the next layer, so in the example the first neuron receives all 36 features (shown by X36)
What are shared weights?
REFER TO SLIDES
What is the entire CNN process?
- Start with the input image, perform convolution and max pooling as many times as required, then perform flattening and feed the result into the fully connected network; this gives the full CNN
What is Max Pooling?
In max pooling the feature map is split into even sections using a filter (window), and the max value in each section is taken
- In other words, max pooling keeps the most prominent features found through convolution
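A minimal sketch of max pooling, assuming a 2x2 window that moves in steps of 2 so the sections do not overlap. The feature map values are hypothetical.

```python
def max_pool(feature_map, size=2):
    """Split the feature map into size x size sections and keep each section's max."""
    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            window = [
                feature_map[i + di][j + dj]
                for di in range(size)
                for dj in range(size)
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled


# Hypothetical 4x4 feature map
feature_map = [
    [3, -1, -3, -1],
    [-3, 1, 0, -3],
    [-3, -3, 0, 1],
    [3, -2, -2, -1],
]

pooled = max_pool(feature_map)  # [[3, 0], [3, 1]]
```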
What is average pooling?
Average pooling takes all the values in a given section, adds them up and divides by the number of values
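The same idea with the mean instead of the max, again assuming non-overlapping 2x2 sections and hypothetical values.

```python
def average_pool(feature_map, size=2):
    """Split the feature map into size x size sections and average each one."""
    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            window = [
                feature_map[i + di][j + dj]
                for di in range(size)
                for dj in range(size)
            ]
            row.append(sum(window) / len(window))
        pooled.append(row)
    return pooled


# Hypothetical 4x4 feature map
feature_map = [
    [1, 3, 2, 4],
    [5, 7, 6, 8],
    [0, 2, 1, 1],
    [4, 2, 3, 3],
]

pooled = average_pool(feature_map)  # [[4.0, 5.0], [2.0, 2.0]]
```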
Why do we use pooling (max pooling)?
- Max pooling is a form of subsampling (downsampling): as we saw before, the 4x4 feature map is reduced to 2x2 (one value per section)
○ This means fewer parameters in the layers that follow
What are the ways a CNN compresses a fully connected network?
Reduce the number of connections
Having shared weights on the edges
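A rough count of what each of these two ideas saves, assuming a 6x6 input, one 3x3 filter, stride 1, no padding and no bias terms. The function names are made up for this example.

```python
def fully_connected_weights(n_inputs, n_outputs):
    """Every input is connected to every output, each edge with its own weight."""
    return n_inputs * n_outputs


def conv_connections(output_size, kernel_size):
    """Each output neuron is connected only to a kernel-sized patch of the input."""
    return output_size * output_size * kernel_size * kernel_size


def conv_shared_weights(kernel_size):
    """All output neurons reuse the same kernel, so only its values are learned."""
    return kernel_size * kernel_size


input_size, kernel_size = 6, 3
output_size = input_size - kernel_size + 1  # 4

fc = fully_connected_weights(input_size ** 2, output_size ** 2)  # 36 * 16 = 576
local = conv_connections(output_size, kernel_size)               # 16 * 9 = 144
shared = conv_shared_weights(kernel_size)                        # 9
```

Fewer connections takes the count from 576 to 144 edges, and sharing weights across those edges leaves only 9 values to learn.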
What are the steps for max pooling?
These are the full steps leading to a smaller image:
- Take the big image
- Perform convolution
- Perform max pooling
- Get a smaller image, where each filter produces one channel (with 2 filters the result is 2 x 2 x 2)
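The steps above can be sketched end to end, assuming a square 6x6 image, two 3x3 filters, stride 1, no padding and 2x2 max pooling. The image and both filters are hypothetical.

```python
def convolve2d(image, kernel):
    """Stride-1 convolution without padding; assumes a square image and kernel."""
    k = len(kernel)
    n = len(image) - k + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(k) for dj in range(k))
            for j in range(n)
        ]
        for i in range(n)
    ]


def max_pool(feature_map, size=2):
    """Non-overlapping max pooling; assumes the side length is divisible by size."""
    return [
        [
            max(feature_map[i + di][j + dj]
                for di in range(size) for dj in range(size))
            for j in range(0, len(feature_map[0]), size)
        ]
        for i in range(0, len(feature_map), size)
    ]


# Hypothetical 6x6 image and two hypothetical 3x3 filters
image = [
    [1, 0, 0, 0, 0, 1],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 1, 0, 0],
    [1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 0, 1, 0],
]
filters = [
    [[1, -1, -1], [-1, 1, -1], [-1, -1, 1]],   # diagonal pattern
    [[-1, 1, -1], [-1, 1, -1], [-1, 1, -1]],   # vertical pattern
]

# One channel per filter: 6x6 -> 4x4 after convolution -> 2x2 after pooling
channels = [max_pool(convolve2d(image, f)) for f in filters]
```

The result has 2 channels of 2x2 values each, i.e. the 2 x 2 x 2 "smaller image".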
What is ReLU?
- An activation function applied after each convolution
- Only the positive values are kept
- Given by max(0, x): if the value is not positive it becomes 0; if it is positive it is kept
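ReLU applied to every value of a feature map, as a minimal sketch with hypothetical values:

```python
def relu(x):
    """Keep positive values, replace everything else with 0."""
    return max(0, x)


def relu_map(feature_map):
    """Apply ReLU to each entry of a 2D feature map."""
    return [[relu(value) for value in row] for row in feature_map]


activated = relu_map([[3, -1], [-3, 0]])  # [[3, 0], [0, 0]]
```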
What is flattening?
Take the features and convert them into a single column (vector), which is then fed into the fully connected feedforward network
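Flattening can be sketched as unrolling every channel row by row into one list. The 2 x 2 x 2 input below is hypothetical, and the channel-then-row ordering is one common convention, not necessarily the one on the slides.

```python
def flatten(channels):
    """Unroll [channel][row][col] values into a single flat list."""
    return [value for channel in channels for row in channel for value in row]


# Hypothetical 2 x 2 x 2 output of the last pooling step
pooled = [
    [[3, 0], [3, 1]],
    [[-1, -1], [0, 3]],
]

features = flatten(pooled)  # [3, 0, 3, 1, -1, -1, 0, 3]
```

These 8 values are what the first fully connected layer receives as its inputs.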
What is the pipeline for Deep Learning-based Computer vision?
Input -> Deep learning for feature extraction, description, classification/regression/segmentation -> Output
What do CNN Filters Learn?
- They learn full objects i.e. faces, car, etc (high-level features)
- Parts such as nose, eyes, car parts, legs of chair (mid-level features)
- Edges (low-level features)
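Going down this list the features become more fine-grained; in the network, edges are learned by the early layers and full objects by the deeper layers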
What is Normalisation?
Take the value, subtract the mean and divide by the standard deviation
Why do we do batch normalisation?
- If inputs are not centered around 0
- If inputs have different scaling per element
What does batch normalisation do?
- Makes the inputs have zero mean and unit variance
- To do this you apply the equation
○ Take the value, subtract the mean and divide by the standard deviation
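A minimal sketch of the normalisation step for one feature across a batch: subtract the batch mean and divide by the batch standard deviation. The small `eps` term (to avoid dividing by zero) and the batch values are assumptions; full batch normalisation layers also learn a scale and a shift, which are left out here.

```python
import math


def batch_normalise(values, eps=1e-5):
    """Normalise one feature across a batch to roughly zero mean and unit variance."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(variance + eps)  # eps is an assumed small constant for stability
    return [(v - mean) / std for v in values]


batch = [2.0, 4.0, 6.0, 8.0]          # hypothetical values of one feature
normalised = batch_normalise(batch)

new_mean = sum(normalised) / len(normalised)
new_variance = sum((v - new_mean) ** 2 for v in normalised) / len(normalised)
```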