CNN Flashcards
(25 cards)
What is the main advantage of CNNs over traditional ML for image tasks?
They learn features directly from raw images, removing the need for manual feature engineering.
What are the three main types of layers in a CNN?
Convolutional layers, pooling layers, and fully connected layers.
What does a convolutional layer do?
Applies filters that extract local patterns from the input data.
What is a filter (or kernel) in a CNN?
A small matrix that slides across the input to detect specific patterns.
What operation does a filter perform on the input?
A dot product between the filter and a local region of the input.
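Example (a minimal sketch, not a library implementation): the sliding dot product described in the cards above, written in plain Python. The function name `conv2d_valid` and the sample values are made up for illustration. Like most deep learning libraries, it does not flip the kernel, so strictly speaking it computes a cross-correlation.

```python
def conv2d_valid(image, kernel, stride=1):
    """Slide `kernel` over `image` (both lists of lists) with no padding."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = (len(image) - k_h) // stride + 1
    out_w = (len(image[0]) - k_w) // stride + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            top, left = i * stride, j * stride
            # Dot product between the kernel and the patch it currently covers.
            total = sum(
                image[top + a][left + b] * kernel[a][b]
                for a in range(k_h)
                for b in range(k_w)
            )
            row.append(total)
        output.append(row)
    return output


# Illustrative 4x4 input and 2x2 filter (hypothetical values).
image = [
    [1, 2, 3, 0],
    [4, 5, 6, 1],
    [7, 8, 9, 2],
    [1, 0, 1, 3],
]
kernel = [[1, 0], [0, -1]]

feature_map = conv2d_valid(image, kernel)               # 3x3 output
strided_map = conv2d_valid(image, kernel, stride=2)     # 2x2 output
```

With stride 1 the 4x4 input gives a 3x3 feature map; with stride 2 the same input gives a 2x2 feature map.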
What does the stride hyperparameter control in a convolution?
How many pixels the filter moves at each step.
What is the effect of increasing stride in a CNN?
Reduces the spatial size of the output feature map.
What does padding do in convolutional layers?
Adds extra border values (typically zeros) around the input so the filter can cover edge pixels and the output dimensions can be controlled.
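Example (a sketch under the usual assumptions of a square input, a square filter, and the same padding on every side): how stride and padding together set the output size along one dimension. The helper name `conv_output_size` is hypothetical.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Output length along one spatial dimension of a convolution."""
    # Standard formula: floor((n + 2p - k) / s) + 1
    return (input_size + 2 * padding - kernel_size) // stride + 1


no_padding = conv_output_size(32, 3)                        # output shrinks
same_size = conv_output_size(32, 3, stride=1, padding=1)    # size preserved
downsampled = conv_output_size(32, 3, stride=2, padding=1)  # roughly halved
```

For a 32-pixel input and a 3-pixel filter, no padding shrinks the output to 30, padding of 1 keeps it at 32, and adding stride 2 reduces it to 16.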
What is a feature map in a CNN?
The output produced by applying one filter across the input (an image or the previous layer's output).
What kind of features do early CNN layers learn?
Low-level features like edges and textures.
What kind of features do deeper CNN layers learn?
High-level features like object parts or full shapes.
What is the purpose of pooling layers?
To reduce spatial dimensions and make features more robust to translation.
What is max pooling?
A pooling method that takes the maximum value in each region.
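Example (a minimal sketch in plain Python): max pooling with a 2x2 window and stride 2, a common default choice. The function name `max_pool2d` and the sample feature map are illustrative only.

```python
def max_pool2d(feature_map, size=2, stride=2):
    """Keep the maximum of each size x size window of a 2D feature map."""
    out_h = (len(feature_map) - size) // stride + 1
    out_w = (len(feature_map[0]) - size) // stride + 1
    return [
        [
            max(
                feature_map[i * stride + a][j * stride + b]
                for a in range(size)
                for b in range(size)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Illustrative 4x4 feature map (hypothetical values).
fm = [
    [1, 3, 2, 0],
    [4, 6, 5, 1],
    [7, 2, 9, 8],
    [0, 1, 3, 4],
]
pooled = max_pool2d(fm)  # 2x2 result, one value per window
```

Each 2x2 window collapses to its largest value, so the 4x4 map becomes 2x2.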
What are the downsides of excessive pooling?
Fine spatial detail is discarded, which can remove information the task needs, such as the precise location of small objects.
What are the benefits of pooling?
Smaller feature maps (less computation in later layers) and a degree of invariance to small translations.
What does a fully connected layer in a CNN do?
Connects all activations to final outputs, used for classification or regression.
What happens before data reaches the fully connected layer?
The stack of 2D feature maps is flattened into a single 1D vector.
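Example (a sketch with made-up values): flattening a stack of feature maps and passing the result through one fully connected layer. The names `flatten` and `dense`, and the weights and biases, are hypothetical; in a real network the weights are learned.

```python
def flatten(feature_maps):
    """Turn a list of 2D feature maps into one flat list of values."""
    return [value for fmap in feature_maps for row in fmap for value in row]


def dense(inputs, weights, biases):
    """Fully connected layer: every output sees every input."""
    return [
        sum(w * x for w, x in zip(weight_row, inputs)) + b
        for weight_row, b in zip(weights, biases)
    ]


# Two 2x2 feature maps (illustrative values).
feature_maps = [
    [[1, 2], [3, 4]],
    [[5, 6], [7, 8]],
]
vector = flatten(feature_maps)  # 8 values

# Two output units, each with one weight per input (hypothetical weights).
weights = [[1] * 8, [0.5] * 8]
biases = [0, 1]
scores = dense(vector, weights, biases)
```

Two 2x2 maps become an 8-element vector, and each output unit computes a weighted sum over all 8 values plus a bias.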
What kind of activation function is typically used in CNNs?
ReLU (Rectified Linear Unit).
What determines the size of the receptive field in a CNN?
The filter sizes, the strides (including those of pooling layers), any dilation, and the number of stacked layers.
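Example (a sketch assuming no dilation): how the receptive field grows as layers are stacked. Each layer is given as a hypothetical `(kernel_size, stride)` pair, and pooling layers are treated the same way as convolutions.

```python
def receptive_field(layers):
    """Receptive field (in input pixels) of one unit after the given layers.

    `layers` is a list of (kernel_size, stride) pairs, first layer first.
    """
    field = 1  # a single input pixel sees only itself
    jump = 1   # distance in input pixels between neighbouring units
    for kernel_size, stride in layers:
        # Each layer widens the field by (k - 1) steps of the current jump.
        field += (kernel_size - 1) * jump
        jump *= stride
    return field


two_convs = receptive_field([(3, 1), (3, 1)])
three_convs = receptive_field([(3, 1), (3, 1), (3, 1)])
conv_pool_conv = receptive_field([(3, 1), (2, 2), (3, 1)])
```

Two stacked 3x3 convolutions see a 5x5 region and three see 7x7, while inserting a 2x2 pooling layer with stride 2 between two 3x3 convolutions gives an 8x8 region.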
What does a stride of 2 mean in a convolutional layer?
The filter moves 2 pixels at a time, reducing resolution more quickly.
What is one use of CNNs beyond image classification?
They can be used on time series (1D) and video data (3D convolutions).
What does ‘transfer learning’ typically modify in a CNN?
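The final fully connected layers, which are replaced or retrained for the new task while the pretrained convolutional layers are usually frozen or only lightly fine-tuned.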
What is the typical input to a CNN?
An image represented as an array of pixel values, usually height x width x channels (for example, 3 channels for RGB).
What is the role of filters in learning?
Each filter detects a specific visual pattern relevant to the task.