Chapter 16 Flashcards
(21 cards)
Why are RNNs useful for Natural Language Processing (NLP)?
They process sequences, making them suitable for tasks involving word or character order.
What are common NLP tasks?
Text generation, sentiment analysis, and machine translation.
What is a character-level RNN?
An RNN that predicts the next character in a sequence based on previous characters.
What dataset was used for training in the character RNN example?
The complete works of William Shakespeare.
How are characters encoded for training in RNNs?
Using one-hot encoding or integer encoding.
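Both encodings can be sketched in a few lines of NumPy. The mini-corpus below is a placeholder; the chapter's example uses the full Shakespeare text.

```python
import numpy as np

text = "to be or not to be"  # hypothetical mini-corpus for illustration

# Integer encoding: map each distinct character to an ID.
chars = sorted(set(text))
char_to_id = {c: i for i, c in enumerate(chars)}
encoded = np.array([char_to_id[c] for c in text])

# One-hot encoding: each ID becomes a vector with a single 1.
one_hot = np.eye(len(chars))[encoded]

print(encoded[:5])    # first five character IDs
print(one_hot.shape)  # (sequence length, vocabulary size)
```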
What is ‘truncated backpropagation through time’?
A technique to train RNNs on shorter sequences instead of full-length texts.
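In practice this means slicing the long text into fixed-length windows, with the target shifted one step ahead of the input. A minimal sketch (the function name is illustrative, not a library API):

```python
def make_windows(seq, window):
    # Split a long sequence into (input, target) pairs of length `window`,
    # where the target is the input shifted one step ahead.
    pairs = []
    for i in range(0, len(seq) - window):
        pairs.append((seq[i:i + window], seq[i + 1:i + 1 + window]))
    return pairs

pairs = make_windows(list("hello world"), window=4)
print(pairs[0])  # (['h', 'e', 'l', 'l'], ['e', 'l', 'l', 'o'])
```

Gradients then flow back only through each window, not the full text.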
What does the ‘temperature’ parameter control in text generation?
The randomness of the generated text; lower values make it more deterministic.
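The mechanism is simple: divide the logits by the temperature before the softmax. A NumPy sketch (function name and logits are illustrative):

```python
import numpy as np

def sample_with_temperature(logits, temperature, rng):
    # Low temperature sharpens the distribution (more deterministic);
    # high temperature flattens it (more random).
    scaled = np.asarray(logits, dtype=float) / temperature
    scaled -= scaled.max()  # numerical stability
    probs = np.exp(scaled) / np.exp(scaled).sum()
    return rng.choice(len(probs), p=probs)

rng = np.random.default_rng(0)
logits = [2.0, 1.0, 0.1]
# With a very low temperature, the most likely index wins almost always.
print(sample_with_temperature(logits, 0.01, rng))  # → 0
```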
What is the difference between stateless and stateful RNNs?
Stateless resets state after each batch; stateful preserves state between batches.
Why use a stateful RNN?
To learn patterns longer than one training window, by carrying the hidden state from each batch into the next (which requires batches to contain contiguous, non-shuffled sequences).
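The difference can be illustrated without Keras, using a toy recurrent update with fixed weights (a real RNN layer learns its weights; this only shows the state handling):

```python
import numpy as np

def run_batches(batches, stateful):
    # Toy recurrent update h = tanh(0.5*h + x), fixed weights for illustration.
    h = 0.0
    outputs = []
    for batch in batches:
        if not stateful:
            h = 0.0  # stateless: reset hidden state at every batch
        for x in batch:
            h = np.tanh(0.5 * h + x)
        outputs.append(h)
    return outputs

batches = [[1.0, 1.0], [1.0, 1.0]]
print(run_batches(batches, stateful=False))  # same value twice: no memory across batches
print(run_batches(batches, stateful=True))   # second value differs: state carried over
```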
What is sentiment analysis?
Classifying text based on emotional tone, e.g., positive or negative reviews.
What dataset is used for sentiment analysis in the lecture?
The IMDb movie review dataset.
What is an embedding layer?
A layer that maps each word ID to a dense vector capturing semantic similarity.
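At its core, an embedding layer is just a trainable lookup table with one row per word ID. A NumPy sketch (random, untrained vectors for illustration):

```python
import numpy as np

vocab_size, embed_dim = 10, 4
rng = np.random.default_rng(42)
# One trainable row of `embed_dim` floats per word ID.
embedding_matrix = rng.normal(size=(vocab_size, embed_dim))

word_ids = np.array([3, 1, 3])        # a toy sentence of word IDs
vectors = embedding_matrix[word_ids]  # lookup: shape (3, embed_dim)

print(vectors.shape)                        # (3, 4)
print(np.allclose(vectors[0], vectors[2]))  # True: same ID, same vector
```

During training, the rows are adjusted so that words used in similar contexts end up with nearby vectors.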
What does ‘mask_zero=True’ do in an embedding layer?
It makes the layer emit a mask so that downstream layers ignore padding tokens (ID 0).
What is an encoder-decoder architecture used for?
Translation and sequence generation tasks.
Why is the input reversed in encoder-decoder models?
So the beginning of the sentence is fed to the encoder last, placing it closest to the first words the decoder must produce and shortening that dependency path.
What is a GRU cell?
A simplified version of LSTM with fewer gates and a single hidden state.
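A single GRU step can be written out in NumPy. This is a sketch with random weights; gate conventions vary slightly between formulations (here: update gate `z` blends old state and candidate, reset gate `r` gates the old state inside the candidate):

```python
import numpy as np

def gru_step(x, h, Wz, Wr, Wh, Uz, Ur, Uh):
    sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))
    z = sigmoid(Wz @ x + Uz @ h)              # update gate
    r = sigmoid(Wr @ x + Ur @ h)              # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h))  # candidate state
    return (1 - z) * h + z * h_tilde          # single hidden state (no separate cell state)

rng = np.random.default_rng(0)
n = 3  # hidden size == input size, for simplicity
Wz, Wr, Wh, Uz, Ur, Uh = (rng.normal(size=(n, n)) for _ in range(6))

h = np.zeros(n)
h = gru_step(rng.normal(size=n), h, Wz, Wr, Wh, Uz, Ur, Uh)
print(h.shape)  # (3,)
```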
How does an RNN generate long text?
By generating one character at a time and feeding the output back as input.
What is the TimeDistributed layer used for?
To apply the same Dense layer across each time step.
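The effect is easy to show in NumPy: one shared weight matrix applied at every time step, which is what `TimeDistributed(Dense(units))` does (weights below are random, for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
batch, steps, features, units = 2, 5, 8, 3
x = rng.normal(size=(batch, steps, features))

# A single (features, units) weight matrix shared across all time steps.
W = rng.normal(size=(features, units))
b = np.zeros(units)
y = x @ W + b  # broadcasting applies the same Dense to each step

print(y.shape)  # (2, 5, 3)
```

Note that in recent Keras versions a `Dense` layer already operates on the last axis of a 3D input, so it behaves like `TimeDistributed(Dense)` there.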
Why does character-level RNN have limited context?
Because it is trained on short windows (e.g., 100 characters), so it cannot learn patterns longer than its training window.
How can you improve a Char-RNN?
Use deeper networks, tune temperature, or increase training data.
Why is dropout used in RNNs?
To prevent overfitting by randomly deactivating neurons during training.
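The mechanics of (inverted) dropout in a NumPy sketch; note that Keras recurrent layers expose both `dropout` (on inputs) and `recurrent_dropout` (on the hidden state):

```python
import numpy as np

def dropout(x, rate, rng):
    # Training time: zero each activation with probability `rate`, and scale
    # survivors by 1/(1-rate) so the expected activation is unchanged,
    # meaning nothing needs rescaling at inference time.
    mask = rng.random(x.shape) >= rate
    return x * mask / (1.0 - rate)

rng = np.random.default_rng(0)
x = np.ones(10000)
y = dropout(x, rate=0.5, rng=rng)
print(y.mean())  # close to 1.0: expectation preserved
```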