Chapter 14 Flashcards

(25 cards)

1
Q

What is the primary motivation for using CNNs instead of dense neural networks for vision tasks?

A

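Dense networks need a separate weight for every pixel-neuron pair, so the parameter count explodes on images; CNNs use partially connected layers with shared weights, which cuts parameters while preserving spatial structure.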
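
To make the parameter gap concrete, here is a minimal sketch that counts weights and biases for one dense layer and one convolutional layer. The function names and the 100 x 100 grayscale example are illustrative assumptions, not taken from the chapter.

```python
def dense_params(height, width, channels, units):
    """Weights plus biases for a dense layer fed a flattened image."""
    inputs = height * width * channels
    return inputs * units + units


def conv_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus biases for a conv layer; independent of image size."""
    return (kernel_h * kernel_w * in_channels + 1) * filters


# Hypothetical comparison on a 100 x 100 grayscale image.
print(dense_params(100, 100, 1, 1000))  # dense layer with 1,000 units
print(conv_params(3, 3, 1, 32))         # 32 filters of size 3 x 3
```

Because the filter weights are shared across every position, the convolutional count does not grow with the image size.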

2
Q

What biological structure inspired CNNs?

A

The architecture of the brain's visual cortex, as studied in cats and monkeys by Hubel and Wiesel.

3
Q

What is a local receptive field in the visual cortex?

A

A small region of the visual field that a neuron responds to.

4
Q

What are convolutional layers used for in CNNs?

A

To detect local patterns in input data, such as edges or textures.

5
Q

What is the function of pooling layers in CNNs?

A

To reduce spatial dimensions and promote translation invariance.
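
A minimal sketch of max pooling on a single feature map, assuming a 2D list of numbers as input. The name `max_pool_2d` and its defaults are hypothetical choices for illustration.

```python
def max_pool_2d(feature_map, pool=2, stride=2):
    """Keep the largest value in each pool x pool window (no padding)."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - pool + 1, stride):
        pooled_row = []
        for c in range(0, cols - pool + 1, stride):
            window = [feature_map[r + i][c + j]
                      for i in range(pool) for j in range(pool)]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


fm = [[1, 3, 2, 0],
      [4, 6, 5, 1],
      [7, 2, 9, 8],
      [0, 1, 3, 4]]
print(max_pool_2d(fm))  # a 4 x 4 map shrinks to 2 x 2
```

Since only the maximum of each window survives, a strong activation that shifts by one pixel inside its window leaves the output unchanged, which is where the small amount of translation invariance comes from.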

6
Q

What is zero-padding and why is it used?

A

Adding zeros around an image to maintain the output size after convolution.
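
As a rough illustration, zero-padding a single-channel image can be sketched as follows. The helper name `zero_pad` and the 2D-list representation are assumptions made for this example.

```python
def zero_pad(image, pad):
    """Surround a 2D image with `pad` rows and columns of zeros."""
    padded_width = len(image[0]) + 2 * pad
    padded = [[0] * padded_width for _ in range(pad)]            # top border
    for row in image:
        padded.append([0] * pad + list(row) + [0] * pad)         # side borders
    padded.extend([0] * padded_width for _ in range(pad))        # bottom border
    return padded


for row in zero_pad([[1, 2], [3, 4]], pad=1):
    print(row)
```

With a 3 x 3 filter and stride 1, one ring of zeros is enough for the output to keep the input's height and width.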

7
Q

What does stride mean in the context of CNNs?

A

The number of pixels the filter moves across the input during convolution.

8
Q

What happens when stride > 1?

A

The output feature map becomes smaller.
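
How much smaller the output gets can be computed along one dimension. The sketch below follows the usual TensorFlow/Keras conventions for "valid" and "same" padding; the function name is a hypothetical helper.

```python
import math


def conv_output_size(input_size, kernel_size, stride=1, padding="valid"):
    """Output length along one spatial dimension."""
    if padding == "same":
        # Zero-padding is added as needed, so only the stride matters.
        return math.ceil(input_size / stride)
    if padding == "valid":
        # No padding: the kernel must fit entirely inside the input.
        return (input_size - kernel_size) // stride + 1
    raise ValueError("padding must be 'same' or 'valid'")


print(conv_output_size(28, 3, stride=1, padding="valid"))
print(conv_output_size(28, 3, stride=2, padding="valid"))
print(conv_output_size(28, 3, stride=2, padding="same"))
```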

9
Q

What is a filter (or kernel) in a CNN?

A

A small matrix of weights used to scan and extract features from the input.

10
Q

What is a feature map?

A

The output produced by applying a filter across the input.

11
Q

Why are multiple filters used in convolutional layers?

A

To detect different features in the input image.

12
Q

How is an image represented in TensorFlow for CNN input?

A

As a 3D tensor of shape [height, width, channels]; a mini-batch of images is a 4D tensor of shape [batch size, height, width, channels].

13
Q

What are the types of padding in CNNs?

A

“Same” (with zero-padding) and “Valid” (no padding).

14
Q

What is the memory challenge with convolutional layers?

A

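Convolutional layers have relatively few parameters, but they need a lot of RAM: during training, every feature map computed in the forward pass must be kept in memory for the backward pass.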
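
A rough back-of-the-envelope sketch of where the memory goes in one convolutional layer. The helper name and the example layer (200 filters of size 5 x 5 over a 150 x 100 RGB image, "same" padding, 32-bit floats) are assumptions chosen for illustration.

```python
def conv_layer_footprint(out_h, out_w, filters, kernel_h, kernel_w,
                         in_channels, bytes_per_value=4):
    """Return (parameter count, activation count, activation bytes)."""
    params = (kernel_h * kernel_w * in_channels + 1) * filters
    activations = out_h * out_w * filters   # one value per output position
    return params, activations, activations * bytes_per_value


params, activations, act_bytes = conv_layer_footprint(
    out_h=150, out_w=100, filters=200, kernel_h=5, kernel_w=5, in_channels=3)
print(params)       # weights and biases
print(activations)  # values in the output feature maps, per image
print(act_bytes)    # bytes for one image's activations at 4 bytes each
```

Under these assumptions the layer has about fifteen thousand parameters but produces three million activations per image, and that second figure is multiplied by the batch size during training.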

15
Q

What are the benefits of pooling layers?

A

Reduced computation and memory use, fewer parameters in subsequent layers (which limits overfitting), and some invariance to small translations.

16
Q

What is depthwise pooling?

A

Pooling across feature maps (depth dimension) to summarize feature activations.
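
A minimal sketch of depthwise max pooling, assuming the input uses a [height][width][channels] nested-list layout and that all channels are pooled into one. The function name is hypothetical.

```python
def depthwise_max_pool(volume):
    """For each pixel, keep the largest activation across all feature maps."""
    return [[max(channel_values) for channel_values in row] for row in volume]


# One row of two pixels, each with three feature-map activations.
volume = [[[1, 5, 2], [0, 0, 3]]]
print(depthwise_max_pool(volume))
```

The spatial size is unchanged; only the depth dimension is collapsed.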

17
Q

What is a typical CNN architecture structure?

A

Multiple convolution-ReLU-pooling blocks followed by fully connected layers.

18
Q

What is the trend of kernel sizes and feature maps in deep CNNs?

A

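A larger kernel (e.g., 5×5 or 7×7) is sometimes used in the first layer; later layers typically use small kernels (e.g., 3×3), and the number of feature maps grows as the spatial resolution shrinks.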

19
Q

What is transfer learning in CNNs?

A

Reusing the layers of a pretrained CNN for a new task, typically by freezing the lower layers and training new final layers, optionally fine-tuning more layers afterward.

20
Q

What is ResNet and how does it improve CNN performance?

A

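A very deep CNN built from residual units, whose skip connections add each unit's input to its output. This lets signals and gradients flow across many layers, which mitigates vanishing gradients and makes very deep networks trainable.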
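
The skip connection itself is just an addition. The sketch below works on plain vectors rather than feature maps, and `layer_fn` stands in for the unit's convolutional layers; both simplifications, and all names, are assumptions for illustration.

```python
def relu(values):
    """Elementwise max(0, x)."""
    return [max(0.0, v) for v in values]


def residual_block(x, layer_fn):
    """Compute relu(F(x) + x), where F is the block's learned transformation."""
    fx = layer_fn(x)
    return relu([f + xi for f, xi in zip(fx, x)])


# If the layers inside the block output zeros (roughly the situation at
# initialization), the block simply passes its input through the ReLU.
print(residual_block([1.0, -2.0, 3.0], lambda v: [0.0] * len(v)))
```

Because the block only has to learn the difference between its input and the desired output, stacking many such blocks does not stall training the way stacking plain layers can.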

21
Q

What is YOLO used for?

A

Real-time object detection by predicting multiple bounding boxes and class probabilities in one forward pass.

22
Q

What metric is used for measuring bounding box accuracy?

A

Intersection over Union (IoU).
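
A minimal sketch of IoU for axis-aligned boxes, assuming each box is given as (x_min, y_min, x_max, y_max). The function name and box format are illustrative choices.

```python
def iou(box_a, box_b):
    """Intersection area divided by union area of two boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # partial overlap
print(iou((0, 0, 2, 2), (0, 0, 2, 2)))  # identical boxes
```

The score ranges from 0 for boxes that do not overlap to 1 for a perfect match.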

23
Q

What is semantic segmentation?

A

Classifying each pixel in an image to determine the class of the object it belongs to.
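
The final step of a segmentation model can be sketched as a per-pixel argmax over class scores. The [height][width][classes] layout and the function name are assumptions for this example.

```python
def segment(scores):
    """Pick the highest-scoring class index for every pixel."""
    return [[max(range(len(pixel)), key=pixel.__getitem__) for pixel in row]
            for row in scores]


# One row of two pixels with scores for two classes.
scores = [[[0.1, 0.9], [0.8, 0.2]]]
print(segment(scores))
```

The result is a label map with the same height and width as the score volume.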

24
Q

What is the difference between object detection and semantic segmentation?

A

Detection localizes each object with a bounding box and a class label; segmentation assigns a class to every pixel.

25
Q

What is the purpose of upsampling layers in semantic segmentation?

A

To recover spatial resolution lost during pooling and convolution.
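
One simple way to upsample is nearest-neighbor repetition, sketched below for a single 2D feature map. This is only one option (learned alternatives such as transposed convolutions also exist), and the function name is hypothetical.

```python
def upsample_nearest(feature_map, factor=2):
    """Repeat every value `factor` times along both height and width."""
    upsampled = []
    for row in feature_map:
        wide_row = [value for value in row for _ in range(factor)]
        upsampled.extend(list(wide_row) for _ in range(factor))
    return upsampled


for row in upsample_nearest([[1, 2], [3, 4]]):
    print(row)
```

A 2 x 2 map becomes 4 x 4, restoring the resolution lost to one stride-2 layer, though not the fine detail.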