CNN Flashcards

1
Q

Name four components that are typically contained in a block of CNN.

A

Convolutional Layer: Performs feature extraction using filters (kernels).
Activation Function: Introduces non-linearity (e.g., ReLU).
Pooling Layer: Downsamples the feature maps to reduce spatial dimensions.
Batch Normalization Layer: Normalizes activations to stabilize training.

2
Q

2 advantages of a conv layer compared to a fully connected layer

A

Parameter sharing: the same filter is reused across all spatial locations, so far fewer parameters must be learned. Scales to high-dimensional inputs such as image data, where a fully connected layer would need one weight per input value for every neuron.
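The parameter-sharing advantage can be made concrete with a quick count (a sketch; the layer sizes here are illustrative, not from the card):

```python
# Compare parameter counts: fully connected vs. convolutional layer
# on a hypothetical 224x224x3 input producing 64 output channels/units.

H, W, C_in, C_out = 224, 224, 3, 64

# Fully connected: every output unit connects to every input value.
fc_params = (H * W * C_in) * C_out  # ignoring biases
print(fc_params)    # 9,633,792 weights

# Convolutional: one shared 3x3 filter per output channel.
k = 3
conv_params = k * k * C_in * C_out  # ignoring biases
print(conv_params)  # 1,728 weights
```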

3
Q

Explain what a feature map is.

A

A feature map is the 2D output produced by sliding one convolutional filter over the input data in a CNN; each element indicates how strongly that learned feature is present at the corresponding location.
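As a sketch (the input and filter values are chosen purely for illustration), one feature map is produced by sliding a single filter over the input:

```python
import numpy as np

def conv2d_single(image, kernel):
    """Valid 2D cross-correlation of one filter over one channel:
    the result is a single 2D feature map."""
    H, W = image.shape
    k = kernel.shape[0]
    out = np.zeros((H - k + 1, W - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + k, j:j + k] * kernel)
    return out

image = np.arange(25, dtype=float).reshape(5, 5)
edge_filter = np.array([[1., 0., -1.]] * 3)  # simple vertical-edge detector
fmap = conv2d_single(image, edge_filter)
print(fmap.shape)  # (3, 3): one 2D feature map
```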

4
Q

Name two advantages of deep networks compared to a network with one wide layer

A

Deep networks build hierarchical representations: early layers learn simple features that later layers compose into complex ones. They are also more parameter-efficient, since some functions a deep network represents compactly would require a single wide layer with exponentially many units.

5
Q

Why are pooling layers used in neural networks? Please name at least two reasons. How many learnable parameters does a max-pooling layer have?

A

Pooling layers reduce the spatial dimensions of the feature maps (downsampling), which lowers computation and memory cost, and they create a degree of spatial invariance by summarizing local features, enabling the network to recognize patterns irrespective of their exact location in the input. A max-pooling layer has no learnable parameters.
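A minimal sketch of 2x2 max-pooling (example values are illustrative) makes both points visible: the spatial dimensions are halved, and the function contains no weights at all:

```python
import numpy as np

def max_pool2d(x, size=2, stride=2):
    """Max-pooling: pick the maximum of each window. Note there are
    no weights anywhere here -- zero learnable parameters."""
    H, W = x.shape
    out_h, out_w = (H - size) // stride + 1, (W - size) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = x[i * stride:i * stride + size,
                          j * stride:j * stride + size].max()
    return out

x = np.array([[1., 3., 2., 4.],
              [5., 6., 1., 2.],
              [7., 2., 9., 0.],
              [4., 8., 3., 5.]])
print(max_pool2d(x))  # [[6. 4.] [8. 9.]] -- 4x4 downsampled to 2x2
```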

6
Q

Name two pooling methods

A

Max-pooling (take the maximum value in the kernel window)
Average pooling (take the mean of the values in the kernel window)

7
Q

Name one use-case of 1x1 convolutions

A

One use-case of 1x1 convolutions is as a "bottleneck layer" that reduces the number of channels before an expensive convolution (e.g. in the Inception module of GoogLeNet).
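A quick parameter count (with illustrative channel sizes, not taken from the card) shows why the bottleneck helps:

```python
# Cost of a 5x5 convolution on 256 input channels and 256 output channels,
# with and without a 1x1 bottleneck down to 64 channels (sizes are
# illustrative).

direct = 5 * 5 * 256 * 256                         # 1,638,400 weights
bottleneck = 1 * 1 * 256 * 64 + 5 * 5 * 64 * 256   # 16,384 + 409,600 = 425,984
print(direct, bottleneck)  # roughly 3.8x fewer parameters with the bottleneck
```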

8
Q

Draw and explain the Google Inception Module

A

An architecture block that applies parallel convolutions of different sizes (1x1, 3x3, 5x5) plus max-pooling to the same input and concatenates the results, capturing multi-scale features efficiently. 1x1 "bottleneck" convolutions reduce the number of channels, and thus the computational cost, before the larger convolutions.

9
Q

Draw and label the naive inception module and the inception module with dimension reduction from the GoogleNet architecture.

A

Naive: Input -> (1x1 conv; 3x3 conv; 5x5 conv; max-pooling) -> concatenate -> Output

With bottleneck: Input -> (1x1 conv; 1x1 + 3x3 conv; 1x1 + 5x5 conv; max-pooling + 1x1 conv) -> concatenate -> Output

10
Q

Name a network architecture or building block which leverages 1x1 convolutions

A

The Inception module from GoogLeNet

11
Q

Explain how the inception module is modified for the I3D architecture. Name one advantage that emerges from this

A

In the I3D architecture, the Inception module is modified by inflating the 2D convolution and pooling kernels into 3D, so it captures spatio-temporal information from stacks of video frames. One advantage is that this allows the network to recognize actions and activities in videos more effectively.

12
Q

What were the changes in VGG compared to AlexNet?

A

VGG used deeper stacks of convolutional layers with smaller 3x3 filters and a more uniform architecture (repeated blocks of stacked 3x3 convolutions), leading to improved accuracy and better generalization.
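The benefit of stacking small filters can be checked with simple arithmetic (the channel count is illustrative):

```python
C = 64  # channels in and out (illustrative)

# Two stacked 3x3 convolutions cover a 5x5 receptive field...
receptive_field = 3 + (3 - 1)  # 5
# ...but use fewer parameters than a single 5x5 convolution,
# and insert an extra non-linearity between them.
two_3x3 = 2 * (3 * 3 * C * C)  # 73,728 weights
one_5x5 = 5 * 5 * C * C        # 102,400 weights
print(receptive_field, two_3x3, one_5x5)
```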

13
Q

Draw and explain a basic ResNet building block. Name two advantages of ResNet blocks over serial conv blocks

A

A ResNet block consists of a stack of convolutional layers plus an identity shortcut connection that adds the original input to the stack's output, so the layers learn the residual (the difference between the desired output and the input). Two advantages: (1) gradients flow directly through the shortcut, mitigating vanishing gradients and easing training; (2) a block can easily fall back to the identity mapping, so much deeper networks can be trained without degradation.
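A minimal numpy sketch of the idea (linear "layers" stand in for the convolutions; shapes and initialization are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, W1, W2):
    """y = F(x) + x: the stacked layers learn only the residual F(x)."""
    F = relu(x @ W1) @ W2  # residual branch (stands in for the conv stack)
    return relu(F + x)     # identity shortcut added before the activation

d = 8
x = rng.normal(size=(1, d))
W1 = np.zeros((d, d))  # residual branch initialized to zero
W2 = np.zeros((d, d))
y = residual_block(x, W1, W2)
# With F(x) = 0 the block reduces to ReLU(x) -- a near-identity, so
# adding such blocks does not have to degrade the network.
print(np.allclose(y, relu(x)))  # True
```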

14
Q

What are residual blocks? What are they good for?

A

Residual blocks are building blocks that use shortcut connections so the stacked layers learn the residual (difference) between input and output. This mechanism facilitates training and enables much deeper neural networks.

15
Q

Name three properties of the ResNet architecture that were not present in the AlexNet architecture.

A

Shortcut Connections, Deeper Architecture, Residual Learning
