Chapter 7 Flashcards
(12 cards)
AlexNet
is a convolutional neural network for image classification. It consists of five convolutional layers followed by three fully connected layers.
CIFAR-10 dataset
consists of 60,000 color images, split into 50,000 training images and 10,000 test images. Each image belongs to one of ten categories: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck.
Translation Invariance
is one of the most important characteristics of a CNN: the network can recognize a target regardless of where it appears in the image, that is, even if it has been shifted. (Rotation is a different transformation and is not covered by this property.) It is achieved by sharing weights between neurons and by making the neurons sparsely connected.
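As a rough illustration of the shift case, the sketch below slides one shared kernel over a 1D signal and compares the responses when the same pattern is placed at two different offsets. The kernel values, the pattern, and the helper names `correlate1d` and `place` are made up for this example, and taking the maximum over positions stands in for a pooling step that the card itself does not mention.

```python
def correlate1d(signal, kernel):
    """Slide one shared kernel over the signal (valid positions only)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]

def place(pattern, offset, length=12):
    """Return a zero signal with the pattern written at the given offset."""
    signal = [0] * length
    signal[offset:offset + len(pattern)] = pattern
    return signal

kernel = [1, 2, 1]            # hypothetical shared weights
pattern = [0, 3, 5, 3, 0]     # hypothetical target to detect

early = correlate1d(place(pattern, 0), kernel)
late = correlate1d(place(pattern, 6), kernel)

# The strongest response is the same; only its position moves with the pattern.
peak_early = max(early)
peak_late = max(late)
shift = late.index(peak_late) - early.index(peak_early)
print(peak_early, peak_late, shift)
```

Because every position uses the same weights, the response to the pattern is identical wherever it appears; the feature map simply shifts along with the input.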
Translation
is a geometric transformation that changes the location of an object without changing its shape.
The topology of a convolutional layer
- The neurons are arranged in three dimensions.
- Two of the dimensions correspond to the width and height.
- The neurons are grouped into channels or feature maps in a third dimension.
- There are no connections between the neurons within the same convolutional layer.
- All the neurons within a single channel have identical weights (weight sharing), but they receive different input values.
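The topology above can be sketched in plain Python. This is a simplified illustration, not a full layer: it assumes a single-channel input, no bias, no activation function, stride 1, and no padding. The function name `conv_layer` and the kernel values are hypothetical.

```python
def conv_layer(image, kernels):
    """Apply one 2D kernel per output channel to a single-channel image.

    Returns the neuron outputs indexed as [channel][row][col].
    """
    k = len(kernels[0])
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    return [
        [
            [
                # Same kernel for every (r, c) in this channel: weight sharing.
                sum(image[r + i][c + j] * kernel[i][j]
                    for i in range(k) for j in range(k))
                for c in range(out_w)
            ]
            for r in range(out_h)
        ]
        for kernel in kernels
    ]

image = [
    [1, 2, 3, 0],
    [4, 5, 6, 0],
    [7, 8, 9, 0],
    [0, 0, 0, 0],
]
kernels = [
    [[1, 0], [0, 1]],   # channel 0: sums the main diagonal of each patch
    [[0, 1], [1, 0]],   # channel 1: sums the anti-diagonal of each patch
]

feature_maps = conv_layer(image, kernels)
shape = (len(feature_maps), len(feature_maps[0]), len(feature_maps[0][0]))
print(shape)
```

Each output value depends only on the input image, never on another neuron in the same layer, which matches the "no connections within the layer" point.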
Convolutional kernel
The operation that each neuron in a convolutional layer implements: a weighted sum of the inputs in its receptive field, using the weights shared across the channel.
Receptive field
The region of pixels from which a neuron receives inputs.
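A minimal sketch of how a receptive field could be enumerated, assuming a square kernel, no padding, and output neurons indexed by (row, col). The helper name `receptive_field` is hypothetical.

```python
def receptive_field(row, col, kernel_size, stride=1):
    """Input pixel coordinates seen by the output neuron at (row, col)."""
    top, left = row * stride, col * stride
    return [
        (top + i, left + j)
        for i in range(kernel_size)
        for j in range(kernel_size)
    ]

# Neuron at output position (1, 2) with a 3x3 kernel and stride 2.
field = receptive_field(1, 2, kernel_size=3, stride=2)
print(field[0], field[-1], len(field))
```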
Stride
is the number of pixels by which the kernel shifts over the input matrix (image) from one neuron to the next.
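To make the effect of stride concrete, the sketch below lists where the kernel starts along one axis of the input. It assumes no padding; the helper name `kernel_positions` and the sizes used are chosen only for illustration.

```python
def kernel_positions(input_size, kernel_size, stride=1):
    """Starting indices of the kernel along one axis, without padding."""
    return list(range(0, input_size - kernel_size + 1, stride))

stride_1 = kernel_positions(7, 3, stride=1)
stride_2 = kernel_positions(7, 3, stride=2)
print(stride_1)
print(stride_2)
```

A larger stride visits fewer positions, so the resulting feature map is smaller.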
Padding
The process of adding extra pixels, set to zero, around the border of the image.
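A small sketch of zero padding, together with the commonly used formula for the size of the output along one axis, floor((input + 2 * padding - kernel) / stride) + 1. The helper names `zero_pad` and `conv_output_size` are hypothetical, and the formula assumes the same padding on both sides.

```python
def zero_pad(image, padding):
    """Surround a 2D image (list of rows) with `padding` rows/columns of zeros."""
    width = len(image[0]) + 2 * padding
    top = [[0] * width for _ in range(padding)]
    body = [[0] * padding + list(row) + [0] * padding for row in image]
    bottom = [[0] * width for _ in range(padding)]
    return top + body + bottom

def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Number of kernel positions along one axis."""
    return (input_size + 2 * padding - kernel_size) // stride + 1

padded = zero_pad([[1, 2], [3, 4]], padding=1)
for row in padded:
    print(row)

print(conv_output_size(32, 5))                        # no padding
print(conv_output_size(32, 5, padding=2))             # padding keeps the size
print(conv_output_size(32, 3, stride=2, padding=1))   # stride 2 halves it
```

With a 5x5 kernel and stride 1, a padding of 2 keeps the output the same width and height as the input.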
A convolutional layer
consists of multiple channels, or feature maps. All neurons within the same channel share weights.
Sparse connections
- reduce the total number of weights.
- reduce the number of computations.
- reduce the number of weights to store.
- reduce the number of weights to learn.
Weight sharing
reduces the number of unique weights and thereby reduces the number of weights to store and to learn.
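The savings from sparse connections and weight sharing can be sketched with simple arithmetic. The sizes below (a single-channel 32x32 input, one 28x28 output channel, a 5x5 kernel) are assumptions chosen for illustration, bias weights are ignored, and the function name `weight_counts` is hypothetical.

```python
def weight_counts(in_h, in_w, out_h, out_w, kernel_size):
    """Compare weight counts for one output channel under three schemes."""
    inputs = in_h * in_w
    neurons = out_h * out_w
    per_neuron_sparse = kernel_size * kernel_size
    return {
        # Every neuron connected to every input pixel.
        "fully_connected": neurons * inputs,
        # Every neuron connected only to its receptive field, own weights.
        "sparse": neurons * per_neuron_sparse,
        # Sparse connections plus one kernel shared by all neurons.
        "sparse_shared": per_neuron_sparse,
    }

counts = weight_counts(32, 32, 28, 28, kernel_size=5)
for scheme, count in counts.items():
    print(f"{scheme}: {count}")
```

Sparse connections cut the number of weights each neuron uses, which also cuts computation; weight sharing does not change the number of computations but collapses the stored and learned weights to a single kernel per channel.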