Week 6 - Deep Learning & CNNs Flashcards

(17 cards)

1
Q

What are the differences between MLPs and CNNs?

A

MLP:
- Wide range of application scenarios, not just images
- Neurons are fully connected, so MLPs do not scale well to large inputs such as images

CNN:
- Neurons are arranged in ‘3D’ (width, height, depth); each neuron is connected only to a small region of the previous layer
- Typical CNN structure: Input - Convolution Layer - Activation Layer - Pooling Layer - Fully Connected Layer - Output Layer

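A minimal sketch of that typical structure, assuming PyTorch (the card names no framework, and the layer sizes below are illustrative only):

import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),  # convolution layer (filter bank)
    nn.ReLU(),                                   # activation layer
    nn.MaxPool2d(2),                             # pooling layer
    nn.Flatten(),
    nn.Linear(16 * 16 * 16, 10),                 # fully connected layer -> 10 class scores
)

x = torch.randn(1, 3, 32, 32)  # a single 32x32 RGB image
print(model(x).shape)          # torch.Size([1, 10])
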
2
Q

What are the four parameters in a Convolution Layer?

A
  • Input size (W)
  • Filter size (F)
  • Zero padding (P)
  • Stride (S)
3
Q

What is the equation used to calculate the output volume size from a Convolution Layer?

A

(W - F + 2P) / S + 1

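A quick worked check of the formula in Python (the numbers are illustrative, not from the card):

def conv_output_size(W, F, P, S):
    # (W - F + 2P) / S + 1, assuming the division is exact
    return (W - F + 2 * P) // S + 1

print(conv_output_size(W=32, F=5, P=2, S=1))  # 32: a 5x5 filter with padding 2 preserves size
print(conv_output_size(W=32, F=3, P=1, S=2))  # 16: stride 2 halves the spatial size
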
4
Q

What is the primary aim of a Convolution Layer?

A

Convolution layers act as filter banks, performing convolutions with their learned kernels (masks).

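A sketch of a single-filter convolution in NumPy, assuming stride 1 and no padding (a real layer applies many learned filters; the edge mask below is hand-picked for illustration):

import numpy as np

def convolve2d(image, kernel):
    # Slide the kernel over the image and sum elementwise
    # products at each position (as CNN layers compute it).
    H, W = image.shape
    F = kernel.shape[0]
    out = np.zeros((H - F + 1, W - F + 1))
    for i in range(H - F + 1):
        for j in range(W - F + 1):
            out[i, j] = np.sum(image[i:i + F, j:j + F] * kernel)
    return out

vertical_edge = np.array([[1, 0, -1],
                          [1, 0, -1],
                          [1, 0, -1]])  # hand-picked mask; a CNN learns its kernels
print(convolve2d(np.random.rand(8, 8), vertical_edge).shape)  # (6, 6)
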
5
Q

What is the main purpose of a Pooling layer?

A

A Pooling layer pools (aggregates) the outputs of the convolution layers, which reduces the resolution of the filter outputs. Subsequent convolution layers therefore access larger areas of the image.

6
Q

What is the effect of the Pooling Layer?

A

It reduces the spatial size of the representation and the number of parameters, effectively down-sampling the input to increase the receptive field size.

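A minimal 2x2 max-pooling sketch in NumPy (NumPy and the window size are assumptions; the card fixes neither):

import numpy as np

def max_pool_2x2(x):
    # Take the max of each non-overlapping 2x2 window, halving the spatial size.
    H, W = x.shape
    return x.reshape(H // 2, 2, W // 2, 2).max(axis=(1, 3))

x = np.arange(16).reshape(4, 4)
print(max_pool_2x2(x))  # [[ 5  7]
                        #  [13 15]]
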
7
Q

What are the four types of activation functions?

A
  • Sigmoid
  • Tanh
  • ReLU
  • Leaky ReLU
8
Q

What are the properties of Sigmoid?

A
  • Range of values is [0, 1]
  • Saturates and kills gradients: the gradient is near zero where the output approaches 0 or 1
9
Q

What are the properties of Tanh?

A
  • Has a zero-centred range of [-1, 1]
  • Saturates and kills gradients: the gradient is near zero where the output approaches -1 or 1
10
Q

What are the properties of ReLU?

A
  • Does not saturate for positive inputs, which mitigates the vanishing gradient problem
  • Simple to compute
  • Some neurons can ‘die’ (permanently output zero) when their input is always negative
11
Q

What are the properties of Leaky ReLU?

A
  • Overcomes the dying-neuron problem by allowing a small gradient for negative inputs
  • Performance gains over ReLU are not consistent
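
For reference, all four activation functions from these cards as NumPy one-liners (NumPy and the 0.01 leak slope are assumptions; the cards fix neither):

import numpy as np

sigmoid    = lambda x: 1 / (1 + np.exp(-x))          # range (0, 1); saturates at both ends
tanh       = np.tanh                                 # zero-centred range (-1, 1); also saturates
relu       = lambda x: np.maximum(0, x)              # cheap; zero gradient for negative inputs
leaky_relu = lambda x: np.where(x > 0, x, 0.01 * x)  # small negative slope avoids dead neurons

x = np.linspace(-2, 2, 5)
print(sigmoid(x), tanh(x), relu(x), leaky_relu(x), sep="\n")
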
12
Q

What is the function of the Fully Connected Layer with Softmax?

A

The Fully Connected Layer is fully connected to all activations in the previous layer.
The Softmax function then converts each class’s raw score into a value in the range [0, 1]

13
Q

How does a Softmax function work?

A

A softmax function takes a set of k numbers as input and outputs a probability distribution over the k classes, giving each class a probability in the range [0, 1]

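A minimal, numerically stable softmax sketch in NumPy (an illustration; the card gives no implementation):

import numpy as np

def softmax(z):
    # Subtracting the max is a standard trick for numerical stability.
    e = np.exp(z - np.max(z))
    return e / e.sum()

probs = softmax(np.array([2.0, 1.0, 0.1]))
print(probs)                  # approx [0.659 0.242 0.099] -- each in [0, 1]
print(round(probs.sum(), 6))  # 1.0 -- they always sum to 1
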
14
Q

What is a key property of the Softmax function?

A

The probabilities produced always sum to 1 across the classes. Example:
Class 1 = 0.4, Class 2 = 0.3, Class 3 = 0.3
Therefore 0.4 + 0.3 + 0.3 = 1

15
Q

What is the typical architecture for CNNs in Classification problems?

A

Typical block:
- Normalisation Layer
- Filter Bank
- Non-linearity
- Feature Pooling

After multiple blocks of the above:
- Classifier

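A sketch of this block pattern, again assuming PyTorch (the channel counts and 32x32 input size are made up for illustration):

import torch.nn as nn

def block(c_in, c_out):
    # One block: normalisation -> filter bank -> non-linearity -> feature pooling
    return nn.Sequential(
        nn.BatchNorm2d(c_in),
        nn.Conv2d(c_in, c_out, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.MaxPool2d(2),
    )

# Multiple blocks, then the classifier (sized here for 32x32 inputs):
net = nn.Sequential(
    block(3, 16),
    block(16, 32),
    nn.Flatten(),
    nn.Linear(32 * 8 * 8, 10),
)
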
16
Q

What makes a CNN effective?

A
  • Increased depth
  • Smaller filters at the lower levels
  • More convolutional than fully connected layers
17
Q

What is a hierarchical model in terms of CNNs?

A

A hierarchical model is a CNN that learns fine details at the pixel level and then, with each successive block, ‘zooms out’ to capture larger-scale structure. Example:
The first stage looks at pixel-level details, the second stage at edges, the third stage at object parts (i.e. combinations of edges), and the final stage at object models.