07 - Convolutional Neural Networks Flashcards

1
Q

What is a CNN?

A

Convolutional Neural Network.
A CNN is a network that has one or more convolutional layers.
In a convolutional layer, each filter produces one feature map - one channel response map.
In practice CNNs compute a cross-correlation (the kernel is not flipped): for each filter position, the covered pixels are rearranged into a vector, the kernel is flattened the same way, and the two vectors are multiplied (dot product).
A single output pixel is computed as: S(i,j) = Σ_m Σ_n I(i+m, j+n) · K(m,n)
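A minimal NumPy sketch of this patch-as-vector view (the function name and the edge-detecting example are my own illustration, not from the lecture):

import numpy as np

def cross_correlate_2d(image, kernel):
    # Valid cross-correlation: S(i, j) = sum_m sum_n I(i+m, j+n) * K(m, n)
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    k_vec = kernel.reshape(-1)                             # flatten the kernel once
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw].reshape(-1)  # flatten the covered pixels
            out[i, j] = patch @ k_vec                      # dot product of the two vectors
    return out

# Tiny example: a 2x2 vertical-edge kernel responds where the image jumps from 0 to 1.
img = np.array([[0., 0., 1., 1.],
                [0., 0., 1., 1.],
                [0., 0., 1., 1.],
                [0., 0., 1., 1.]])
k = np.array([[-1., 1.],
              [-1., 1.]])
print(cross_correlate_2d(img, k))   # strongest response in the middle column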

2
Q

Do CNNs have full or sparse connectivity?

A

Sparse. Earlier lectures covered fully connected layers; convolutional nets are sparsely connected, because a filter is only connected to a small patch of the input at any one time.
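A rough parameter count shows why this matters; the layer sizes below are illustrative assumptions, not numbers from the lecture:

# One layer on a 224x224x3 input:
in_h, in_w, in_c = 224, 224, 3

# Fully connected: every output unit is connected to every input pixel.
fc_units = 64
fc_params = in_h * in_w * in_c * fc_units   # ~9.6 million weights

# Convolutional: each output pixel sees only a 3x3 window, and the weights are shared.
k, out_c = 3, 64
conv_params = k * k * in_c * out_c          # 1,728 weights

print(fc_params, conv_params)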

3
Q

Explain the concept of receptive fields.

A

As we go to higher layers, the individual neurons “see” more of the input: each neuron looks at outputs of the layer below, and those in turn depend on their own windows of pixels, so the effective receptive field grows with depth. This is why higher layers learn more abstract concepts.
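A small sketch of how the receptive field grows with depth, using the standard recurrence (the layer stack here is hypothetical, not from the lecture):

# r: receptive field size, j: jump (spacing between adjacent neurons, in input pixels)
# r_out = r_in + (kernel - 1) * j_in,  j_out = j_in * stride
layers = [(3, 1), (3, 1), (3, 2), (3, 1)]   # (kernel_size, stride) per layer
r, j = 1, 1                                  # start from a single input pixel
for k, s in layers:
    r = r + (k - 1) * j
    j = j * s
    print(f"kernel {k}, stride {s} -> receptive field {r}x{r}")
# prints 3x3, 5x5, 7x7, 11x11: each layer "sees" a larger patch of the input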

4
Q

Explain CNN filters.

A

CNN layers consist of filters. Early layers in a CNN extract primitive features like edges.
The AlexNet example (one of the first highly successful CNNs) showed many simple first-layer filters: black-and-white filters that detect edges in different orientations, color-gradient filters, and pattern filters.
At higher levels, the filters respond not just to edges and other basic patterns, but to more complex structures, e.g. numbers and letters, houses, upper bodies/heads, etc.

5
Q

Explain padding and stride

A

Normally, if the convolution is only applied inside the input data, the output shrinks by (kernel size - 1) pixels in each dimension.
To keep the size, padding is used. With padding of size 1, the border pixels are often just copied into the padded border.

If you have very big inputs (images with several megapixels), padding matters less, because it takes many layers before the shrinkage becomes a problem.

Padding shouldn’t hurt classification; we expect the net to cope with the repeated border pixels. With larger kernels (e.g. 5x5) you might want to pad with 2.

Padding methods:
- The easiest are zero padding and average padding (filling all new pixels with zeros or the pixel average).
- Other options are repeated padding or mirror padding.

Stride:
- The spacing between filter locations.
- For quick downsampling a bigger stride (2 or 3) is used; there is still overlap between filter positions, but only a little.
- This can make sense, but be careful with it; the output-size sketch below shows how padding and stride interact.
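The standard output-size formula ties padding and stride together; a minimal sketch (the sizes used are illustrative, not from the lecture):

# out = floor((in + 2 * padding - kernel) / stride) + 1
def conv_out_size(in_size, kernel, padding=0, stride=1):
    return (in_size + 2 * padding - kernel) // stride + 1

print(conv_out_size(32, 3))                       # 30 -> shrinkage without padding
print(conv_out_size(32, 3, padding=1))            # 32 -> same size: pad 1 for a 3x3 kernel
print(conv_out_size(32, 5, padding=2))            # 32 -> pad 2 for a 5x5 kernel
print(conv_out_size(32, 3, padding=1, stride=2))  # 16 -> stride 2 halves the resolution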

6
Q

Pooling

A

Pooling is used for data reduction and is parameter-free.

Max pooling:
- Within each pooling window, take the maximum of the values from the layer before.
- Max pooling in itself does not reduce the data yet.
- No training or weights; it simply takes the maximum every time.
- There are also average pooling and min pooling, but max pooling usually gives the best results.
- Downsampling with max pooling also sharpens the image.

Strided pooling:
- Similar to strided convolutions.
- The stride controls how much to downsample; the pool size controls the locality (see the sketch below).

Pooling introduces a small amount of invariance, because you no longer know exactly where inside the pooling window the maximum value came from. This is good: it gives the net some tolerance to noise and small shifts (we want objects to be classified no matter where exactly they are in the image).
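A minimal NumPy sketch of strided max pooling for the common 2x2/stride-2 case (function name and example are my own illustration, not from the lecture):

import numpy as np

def max_pool_2d(x, pool=2, stride=2):
    # No weights, nothing to train: each output is just the max of one window.
    out_h = (x.shape[0] - pool) // stride + 1
    out_w = (x.shape[1] - pool) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = x[i * stride:i * stride + pool,
                          j * stride:j * stride + pool].max()
    return out

x = np.arange(16, dtype=float).reshape(4, 4)
print(max_pool_2d(x))   # [[ 5.  7.] [13. 15.]] -- 4x4 downsampled to 2x2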
