CNN and RNN Flashcards
(24 cards)
What is deep learning?
A subset of machine learning using multi-layered neural networks to learn hierarchical representations from data.
What are the three historical waves of neural networks?
Cybernetics (1940s–1960s), Connectionism (1980s–1990s), and Deep Learning (2006–present).
What was the major contribution of the 1986 backpropagation paper?
An efficient algorithm to compute gradients in multi-layer neural networks, enabling end-to-end training.
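A minimal illustration of the idea, sketched with PyTorch's autograd (the 1986 paper of course predates such libraries): one backward pass computes the gradient of the loss with respect to every weight via the chain rule.

```python
import torch

# A tiny two-layer network: y = W2 @ tanh(W1 @ x).
x = torch.randn(4)
W1 = torch.randn(3, 4, requires_grad=True)
W2 = torch.randn(1, 3, requires_grad=True)

y = W2 @ torch.tanh(W1 @ x)
loss = (y - 1.0).pow(2).sum()

# One call propagates gradients back through both layers.
loss.backward()
print(W1.grad.shape, W2.grad.shape)  # torch.Size([3, 4]) torch.Size([1, 3])
```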
What is the purpose of LSTM networks?
To handle long-term dependencies in sequences by mitigating vanishing gradients through gated memory units.
What problem did LSTM solve?
The vanishing gradient problem of standard RNNs when learning over long sequences.
What is LeNet-5?
An early convolutional neural network developed in 1998 for handwritten digit recognition.
What were the key features of LeNet-5?
Convolutional layers with shared weights, subsampling (pooling) layers, and fully connected layers, giving a degree of translation invariance.
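A sketch of the LeNet-5 stack in modern PyTorch terms; the original used tanh-style squashing units, trainable subsampling rather than plain average pooling, and sparse C3 connectivity, all simplified here.

```python
import torch
import torch.nn as nn

# LeNet-5-style network for 32x32 grayscale digit images.
lenet5 = nn.Sequential(
    nn.Conv2d(1, 6, kernel_size=5),   # C1: 6 feature maps, 28x28
    nn.Tanh(),
    nn.AvgPool2d(2),                  # S2: subsample to 14x14
    nn.Conv2d(6, 16, kernel_size=5),  # C3: 16 feature maps, 10x10
    nn.Tanh(),
    nn.AvgPool2d(2),                  # S4: subsample to 5x5
    nn.Flatten(),
    nn.Linear(16 * 5 * 5, 120),       # C5
    nn.Tanh(),
    nn.Linear(120, 84),               # F6
    nn.Tanh(),
    nn.Linear(84, 10),                # 10 digit classes
)

print(lenet5(torch.randn(1, 1, 32, 32)).shape)  # torch.Size([1, 10])
```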
What was AlexNet?
A deep CNN that won the 2012 ImageNet competition and popularized deep learning.
What innovations did AlexNet introduce?
ReLU activation, GPU training, data augmentation, and dropout regularization.
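Two of those innovations reduce to one-liners in today's frameworks; a hypothetical AlexNet-style classifier head combining ReLU and dropout might look like this (layer sizes follow AlexNet's fully connected stage):

```python
import torch.nn as nn

# Hypothetical AlexNet-style classifier head. Dropout randomly zeroes
# activations during training (AlexNet used p=0.5); ReLU replaced the
# saturating tanh/sigmoid units of earlier networks.
head = nn.Sequential(
    nn.Dropout(p=0.5),
    nn.Linear(4096, 4096),
    nn.ReLU(),
    nn.Dropout(p=0.5),
    nn.Linear(4096, 1000),  # 1000 ImageNet classes
)
```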
What is supervised learning in the context of deep learning?
Learning from labeled input-output pairs to minimize a loss function.
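A minimal supervised training loop, sketched in PyTorch with synthetic labeled pairs (model, sizes, and learning rate are placeholder choices):

```python
import torch
import torch.nn as nn

# Synthetic labeled data: 100 inputs, each with one of 3 class labels.
inputs = torch.randn(100, 8)
labels = torch.randint(0, 3, (100,))

model = nn.Linear(8, 3)
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for epoch in range(10):
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), labels)  # compare predictions to labels
    loss.backward()                        # gradients via backpropagation
    optimizer.step()                       # update weights to reduce loss
```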
What are CNNs used for?
Visual recognition tasks like image classification, object detection, and segmentation.
What are the key ideas behind CNNs?
Local receptive fields, weight sharing, and hierarchical feature extraction.
What makes CNNs efficient for images?
They reuse the same filters across space and reduce spatial dimensions via pooling.
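Weight sharing is what keeps the parameter count independent of image size; a quick comparison (the 224x224 input size is just an illustrative assumption):

```python
import torch.nn as nn

# A 3x3 conv mapping 3 channels to 16 uses the same 448 weights at
# every spatial position, no matter how large the image is.
conv = nn.Conv2d(3, 16, kernel_size=3)
print(sum(p.numel() for p in conv.parameters()))  # 448

# A fully connected layer over a 224x224 RGB image has no such sharing:
# 224*224*3 inputs to even a single unit already needs 150,529 weights.
fc = nn.Linear(224 * 224 * 3, 1)
print(sum(p.numel() for p in fc.parameters()))  # 150529
```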
What does a typical CNN architecture consist of?
Convolutional layers, ReLU activation, pooling layers, and fully connected layers.
What is a recurrent neural network (RNN)?
A neural network designed to process sequences by maintaining hidden state across time steps.
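The recurrence is one update applied at every step; a minimal sketch of a vanilla RNN cell (shapes chosen arbitrarily for illustration):

```python
import torch

def rnn_step(x_t, h_prev, W_x, W_h, b):
    # New hidden state mixes the current input with the previous state.
    return torch.tanh(x_t @ W_x + h_prev @ W_h + b)

# Run a length-5 sequence of 4-dim inputs through an 8-dim hidden state.
W_x, W_h, b = torch.randn(4, 8), torch.randn(8, 8), torch.zeros(8)
h = torch.zeros(8)
for x_t in torch.randn(5, 4):
    h = rnn_step(x_t, h, W_x, W_h, b)  # h carries context across steps
print(h.shape)  # torch.Size([8])
```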
What are common applications of RNNs?
Text generation, speech recognition, image captioning, and video analysis.
Why do RNNs struggle with long sequences?
Because backpropagation through time multiplies gradients across many steps, so they can vanish or explode, making long-range dependencies hard to learn.
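The effect is easy to see numerically: backpropagating through T steps multiplies roughly T per-step factors together, so factors slightly below or above 1 shrink or blow up geometrically.

```python
# Toy stand-in for the product of per-step gradient factors over
# 100 time steps (real Jacobians behave similarly in norm).
print(0.9 ** 100)  # ~2.7e-05: the gradient effectively vanishes
print(1.1 ** 100)  # ~1.4e+04: the gradient explodes
```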
What are LSTM and GRU used for?
They are variants of RNNs that mitigate the vanishing gradient problem with gating mechanisms.
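Both come ready-made in PyTorch; a minimal usage sketch:

```python
import torch
import torch.nn as nn

# Gated recurrent layers: drop-in replacements for a vanilla RNN.
lstm = nn.LSTM(input_size=4, hidden_size=8, batch_first=True)
gru = nn.GRU(input_size=4, hidden_size=8, batch_first=True)

x = torch.randn(1, 20, 4)   # batch of 1, sequence of 20 steps
out, (h_n, c_n) = lstm(x)   # the LSTM also carries a cell state c
out, h_n = gru(x)           # the GRU folds its gates into one state
print(out.shape)            # torch.Size([1, 20, 8])
```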
What is image captioning?
A task where a model generates a natural language description of an image.
How does image captioning work?
A CNN encodes the image into a feature vector, which an RNN decodes into a sentence one word at a time.
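A schematic of that encoder-decoder wiring; all names and sizes here are illustrative assumptions, not a specific published model:

```python
import torch
import torch.nn as nn

vocab_size, embed_dim, hidden_dim = 1000, 256, 256

# Encoder: a CNN reduces the image to a single feature vector.
cnn = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(16, hidden_dim),
)

# Decoder: an LSTM conditioned on the image feature emits word scores.
embed = nn.Embedding(vocab_size, embed_dim)
decoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
to_vocab = nn.Linear(hidden_dim, vocab_size)

image = torch.randn(1, 3, 64, 64)
h0 = cnn(image).unsqueeze(0)                  # image feature as initial state
words = torch.randint(0, vocab_size, (1, 7))  # 7 previously generated tokens
out, _ = decoder(embed(words), (h0, torch.zeros_like(h0)))
print(to_vocab(out).shape)  # torch.Size([1, 7, 1000]): scores per word slot
```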
What are some failure modes of image captioning systems?
Incorrect object identification, hallucinated relationships, or vague descriptions.
What is the function of softmax in a classification network?
It converts the final layer’s raw scores (logits) into a probability distribution over the classes.
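The formula is softmax(z)_i = exp(z_i) / sum_j exp(z_j); a numerically stable sketch:

```python
import torch

logits = torch.tensor([2.0, 1.0, 0.1])  # raw scores from the final layer

# Subtracting the max before exponentiating avoids overflow without
# changing the result, since the shift cancels in the ratio.
z = logits - logits.max()
probs = z.exp() / z.exp().sum()
print(probs, probs.sum())  # tensor([0.6590, 0.2424, 0.0986]), sums to 1.0
```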
What role do benchmarks like ImageNet and COCO play in deep learning?
They provide standardized datasets and evaluation metrics for comparing models.
What is the significance of the ImageNet 2012 competition?
It demonstrated the power of deep CNNs: AlexNet’s top-5 error of roughly 16% beat the runner-up’s roughly 26% by an unprecedented margin.