Chapter 15 Flashcards
(23 cards)
What type of data are RNNs designed to process?
Sequential or time-series data.
What is a recurrent neuron?
A neuron that, at each time step, receives the current input plus its own output from the previous time step, so past outputs influence future ones.
How is an RNN trained?
Using backpropagation through time (BPTT), unrolling the network across time steps.
What is the “unrolling” of an RNN?
Representing the same RNN layer at multiple time steps to visualize flow across time.
What are the two main weight matrices in a recurrent neuron?
One for the current input x(t) (Wx) and one for the previous time step's output y(t−1) (Wy).
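A minimal NumPy sketch of this recurrence (the layer sizes and the tanh activation here are illustrative assumptions, not prescribed by the card):

```python
import numpy as np

# Toy recurrent layer: 3 input features, 2 recurrent units (sizes are arbitrary).
rng = np.random.default_rng(42)
Wx = rng.normal(size=(3, 2))   # weights for the current input x(t)
Wy = rng.normal(size=(2, 2))   # weights for the previous output y(t-1)
b = np.zeros(2)

def step(x_t, y_prev):
    # y(t) = tanh(x(t) @ Wx + y(t-1) @ Wy + b)
    return np.tanh(x_t @ Wx + y_prev @ Wy + b)

y = np.zeros(2)                        # initial state
for x_t in rng.normal(size=(5, 3)):    # 5 time steps of made-up inputs
    y = step(x_t, y)
print(y)
```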
What is a memory cell in RNNs?
A structure that preserves state over time, helping the network retain information.
What is the sequence-to-sequence architecture?
A model that takes a sequence as input and produces a sequence as output (e.g., time-series forecasting).
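A hedged Keras sketch of a sequence-to-sequence forecaster; layer sizes and the SimpleRNN/TimeDistributed choices are assumptions for illustration:

```python
import tensorflow as tf

# Sequence-to-sequence: one prediction per time step.
# Windows of any length, 1 input feature, 1 forecast value per step.
model = tf.keras.Sequential([
    tf.keras.layers.SimpleRNN(20, return_sequences=True, input_shape=[None, 1]),
    tf.keras.layers.SimpleRNN(20, return_sequences=True),
    tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(1)),
])
model.compile(loss="mse", optimizer="adam")
```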
What is a sequence-to-vector model?
A model that takes a sequence input and produces a single output (e.g., sentiment analysis).
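A minimal Keras sketch, assuming a single-feature input sequence and one output value:

```python
import tensorflow as tf

# Sequence-to-vector: read the whole sequence, output a single value
# (return_sequences defaults to False, so only the last output is kept).
model = tf.keras.Sequential([
    tf.keras.layers.SimpleRNN(20, return_sequences=True, input_shape=[None, 1]),
    tf.keras.layers.SimpleRNN(20),   # only the final time step's output
    tf.keras.layers.Dense(1),        # e.g., a sentiment score or the next value
])
```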
What is a vector-to-sequence model?
A model that takes a single input and generates a sequence (e.g., image captioning).
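One possible Keras sketch, assuming the single input is a 50-dimensional vector (say, an image embedding) repeated over 10 time steps; all sizes are made up:

```python
import tensorflow as tf

# Vector-to-sequence: repeat one input vector across time steps,
# then let an RNN emit one output per step.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=[50]),           # one input vector
    tf.keras.layers.RepeatVector(10),            # feed it at each of 10 time steps
    tf.keras.layers.LSTM(32, return_sequences=True),
    tf.keras.layers.TimeDistributed(
        tf.keras.layers.Dense(1000, activation="softmax")),  # e.g., word probabilities
])
```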
What is an encoder-decoder model in NLP?
A model that encodes an input sequence to a vector and decodes it into an output sequence (e.g., translation).
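A rough functional-API sketch of an encoder-decoder; vocabulary sizes and dimensions are made up, and real translation models typically add attention on top:

```python
import tensorflow as tf

encoder_in = tf.keras.layers.Input(shape=[None], dtype=tf.int32)  # source token IDs
decoder_in = tf.keras.layers.Input(shape=[None], dtype=tf.int32)  # target token IDs

# Encoder: compress the input sequence into its final LSTM states.
enc_emb = tf.keras.layers.Embedding(10000, 128)(encoder_in)
_, state_h, state_c = tf.keras.layers.LSTM(256, return_state=True)(enc_emb)

# Decoder: generate the output sequence starting from the encoder's states.
dec_emb = tf.keras.layers.Embedding(8000, 128)(decoder_in)
dec_out = tf.keras.layers.LSTM(256, return_sequences=True)(
    dec_emb, initial_state=[state_h, state_c])
probas = tf.keras.layers.Dense(8000, activation="softmax")(dec_out)

model = tf.keras.Model(inputs=[encoder_in, decoder_in], outputs=[probas])
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")
```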
What is the difference between naive forecasting and deep learning for time series?
Naive forecasting simply repeats the last observed value (or the value from one period earlier); deep models learn temporal patterns and usually forecast more accurately.
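A tiny NumPy illustration of the naive baseline on a synthetic series (the series itself is made up):

```python
import numpy as np

# Naive forecast: the prediction for t+1 is simply the value at t.
series = np.sin(np.arange(100) / 5) + np.random.randn(100) * 0.1
naive_pred = series[:-1]
targets = series[1:]
naive_mae = np.mean(np.abs(targets - naive_pred))
print(f"naive MAE: {naive_mae:.3f}")   # any useful deep model should beat this baseline
```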
What is MSE and why is it used?
Mean Squared Error, the average of the squared differences between predictions and targets; it is the usual loss function for time-series forecasting and other regression tasks.
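A one-line sketch of the formula in NumPy:

```python
import numpy as np

def mse(y_true, y_pred):
    # Mean Squared Error: average of squared prediction errors.
    return np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)

print(mse([1.0, 2.0, 3.0], [1.1, 1.9, 3.2]))  # ≈ 0.02
```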
What is the drawback of predicting one step at a time in time series forecasting?
Error accumulation over successive steps.
What is the advantage of predicting multiple steps at once in time series forecasting?
Errors do not accumulate across successive predictions, and the extra output terms provide more gradient signal, making training more stable.
What is the main issue when handling long sequences in RNNs?
Unstable gradients and memory loss of earlier inputs.
What is batch normalization and how does it help RNNs?
Normalizing activations using statistics computed across the batch; it can stabilize and speed up training, but it is hard to apply effectively across time steps in RNNs.
What is layer normalization in RNNs?
Normalizing across the feature dimension of each sample; it works the same way at every time step, making it better suited to RNNs than batch normalization.
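A minimal sketch of a custom RNN cell that applies layer normalization after each step, in the spirit of the chapter's custom-cell approach (the class name and sizes are illustrative):

```python
import tensorflow as tf

class LNSimpleRNNCell(tf.keras.layers.Layer):
    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.state_size = units
        self.output_size = units
        self.cell = tf.keras.layers.SimpleRNNCell(units, activation=None)
        self.layer_norm = tf.keras.layers.LayerNormalization()
        self.activation = tf.keras.activations.get("tanh")

    def call(self, inputs, states):
        # Run the plain recurrent step, then normalize before the activation.
        outputs, new_states = self.cell(inputs, states)
        norm_outputs = self.activation(self.layer_norm(outputs))
        return norm_outputs, [norm_outputs]

model = tf.keras.Sequential([
    tf.keras.layers.RNN(LNSimpleRNNCell(20), return_sequences=True,
                        input_shape=[None, 1]),
    tf.keras.layers.Dense(1),
])
```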
What is the short-term memory problem in RNNs?
RNNs forget inputs after many time steps due to vanishing gradients.
What is an LSTM cell?
A memory cell with gates (forget, input, output) that maintains long-term memory in sequences.
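A hedged Keras sketch stacking LSTM layers for forecasting (sizes are arbitrary):

```python
import tensorflow as tf

# LSTM layers handle long sequences better than SimpleRNN thanks to their gates.
model = tf.keras.Sequential([
    tf.keras.layers.LSTM(32, return_sequences=True, input_shape=[None, 1]),
    tf.keras.layers.LSTM(32),
    tf.keras.layers.Dense(1),
])
model.compile(loss="mse", optimizer="adam")
# A tf.keras.layers.GRU layer can be dropped in the same way (see the next card).
```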
What is a GRU cell?
A simplified variant of the LSTM that merges the cell state and hidden state and uses fewer gates (an update gate and a reset gate).
How do 1-D convolutional layers work in sequence modeling?
They apply filters across time steps to detect temporal patterns.
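A short sketch using Conv1D to detect local patterns and shorten the sequence before recurrent layers; kernel size, stride, and filter counts are assumptions:

```python
import tensorflow as tf

# The convolution slides 20 filters of width 4 over the time axis and
# downsamples by a factor of 2 (so targets must be downsampled to match).
model = tf.keras.Sequential([
    tf.keras.layers.Conv1D(filters=20, kernel_size=4, strides=2, padding="valid",
                           input_shape=[None, 1]),
    tf.keras.layers.GRU(20, return_sequences=True),
    tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(1)),
])
```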
What is WaveNet and what does it do?
A deep neural network for sequence generation using dilated convolutions to capture long-range dependencies.
What is the role of dilation in WaveNet?
It allows the network to look back further in time without increasing the number of layers.
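A simplified WaveNet-style stack in Keras, with causal convolutions and doubling dilation rates; filter counts and depth are illustrative:

```python
import tensorflow as tf

# Doubling dilation rates (1, 2, 4, 8, repeated) let deeper layers see
# exponentially further back in time with the same small kernel.
model = tf.keras.Sequential()
model.add(tf.keras.layers.Input(shape=[None, 1]))
for rate in (1, 2, 4, 8) * 2:
    model.add(tf.keras.layers.Conv1D(filters=20, kernel_size=2, padding="causal",
                                     activation="relu", dilation_rate=rate))
model.add(tf.keras.layers.Conv1D(filters=1, kernel_size=1))  # one output per time step
model.compile(loss="mse", optimizer="adam")
```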