LLM Flashcards
(23 cards)
What is subword tokenization?
Splitting words into smaller meaningful units (subwords)
Common in modern tokenizers such as BPE and WordPiece
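A minimal sketch of subword splitting, assuming the Hugging Face transformers library; bert-base-uncased is an illustrative model choice, not one named in these cards:

```python
from transformers import AutoTokenizer

# bert-base-uncased uses WordPiece subword tokenization.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# A longer word is split into meaningful sub-parts,
# e.g. ['token', '##ization'].
print(tokenizer.tokenize("tokenization"))
```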
What is the purpose of TrainingArguments() in fine-tuning?
Customize training settings
See documentation for all parameters
What does the output_dir parameter in TrainingArguments represent?
Directory where checkpoints and the fine-tuned model are saved
What is the significance of num_train_epochs in TrainingArguments?
Number of training epochs
What does learning_rate control in training?
The initial learning rate for the optimizer (step size of weight updates)
What do per_device_train_batch_size and per_device_eval_batch_size define?
The batch size for training and evaluation respectively
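A sketch tying the previous cards together; the parameter names are real TrainingArguments options, but the values and the output path are hypothetical examples:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./finetuned_model",   # where checkpoints and outputs are saved
    num_train_epochs=3,               # number of passes over the training data
    learning_rate=2e-5,               # initial learning rate for the optimizer
    per_device_train_batch_size=8,    # batch size for training
    per_device_eval_batch_size=8,     # batch size for evaluation
)
```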
What is the role of the Trainer class?
Orchestrates the training loop: batching, loss computation, weight updates, and evaluation
What is the eval_dataset used for in the Trainer class?
The data used for evaluation during training
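A sketch of wiring the Trainer together, assuming training_args from the example above; tokenized_train and tokenized_eval are assumed pre-tokenized datasets, and bert-base-uncased is again an illustrative choice:

```python
from transformers import AutoModelForSequenceClassification, Trainer

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_train,  # data the model learns from
    eval_dataset=tokenized_eval,    # held-out data scored during training
)
trainer.train()
```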
In the training output, what does eval_loss indicate?
The loss computed on the evaluation dataset during training
How are predicted labels derived from model outputs?
Using torch.argmax on outputs.logits
What does the label_map dictionary represent?
A dictionary mapping predicted label indices to human-readable sentiment categories
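A runnable sketch of both steps; in practice the logits come from outputs.logits after a forward pass, and this label_map is a hypothetical example (the real label order depends on the model):

```python
import torch

# Stand-in logits so the snippet runs on its own; normally
# these are outputs.logits from the model's forward pass.
logits = torch.tensor([[0.1, 2.3], [1.8, -0.4]])
predicted = torch.argmax(logits, dim=-1)  # highest-scoring class per row

label_map = {0: "negative", 1: "positive"}  # hypothetical mapping
print([label_map[i.item()] for i in predicted])  # ['positive', 'negative']
```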
What is the function of the pipeline() method?
Streamlines tasks with automatic model and tokenizer selection
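A minimal example; "sentiment-analysis" is an illustrative task, and pipeline() will pick a default model and tokenizer for it:

```python
from transformers import pipeline

classifier = pipeline("sentiment-analysis")
print(classifier("I loved this movie!"))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```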
What is full fine-tuning?
All of the model's weights are updated
Computationally expensive
What is partial fine-tuning?
Only task-specific layers are updated; some layers are fixed
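One common way to get partial fine-tuning (as opposed to full fine-tuning, where every weight is updated) is to freeze the pre-trained encoder and train only the task-specific head; a sketch, again assuming bert-base-uncased:

```python
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

# Freeze the pre-trained base model: these layers stay fixed.
for param in model.base_model.parameters():
    param.requires_grad = False

# Only the task-specific classification head remains trainable.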
Define transfer learning in the context of fine-tuning.
Adapting a pre-trained model to a different but related task
What is zero-shot learning?
No examples are provided for the model to learn from
Fill in the blank: In one-shot learning, the model is provided with _______.
one example
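A hedged sketch of the difference in prompt form; the wording is illustrative, not taken from these cards:

```python
# Zero-shot: no examples, just the task.
zero_shot = "Classify the sentiment of this review: 'Great film!'"

# One-shot: exactly one solved example before the real input.
one_shot = (
    "Review: 'Terrible plot.' Sentiment: negative\n"  # the single example
    "Review: 'Great film!' Sentiment:"                # model completes this
)
```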
What does load_dataset() do?
Loads a dataset for fine-tuning
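A minimal example, assuming the Hugging Face datasets library; "imdb" is an illustrative dataset, not one named in these cards:

```python
from datasets import load_dataset

dataset = load_dataset("imdb")
print(dataset["train"][0])  # a dict with "text" and "label" fields
```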
What is the purpose of tokenization in machine learning?
Converts text into a format suitable for model training
What does the tokenizer function do to the input data?
Converts raw text into numeric tensors (token IDs, attention masks) for model input
What is the output of the tokenization process?
A dictionary containing input_ids, attention_mask, etc.
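A quick look at that output, using the same illustrative bert-base-uncased tokenizer (the exact keys vary by model):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoded = tokenizer("Great film!", return_tensors="pt")

# For this model: input_ids, token_type_ids, attention_mask.
print(encoded.keys())
```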
What does the tokenize_function do?
Applies tokenization to text data
What is the difference between tokenizing in batches and tokenizing row by row?
Batch tokenization processes multiple rows per call (typically much faster), while row-by-row tokenization processes one example at a time
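A sketch combining the last few cards, with the same illustrative bert-base-uncased and imdb choices as above:

```python
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
dataset = load_dataset("imdb")

def tokenize_function(examples):
    # Pad/truncate so every row yields equal-length tensors.
    return tokenizer(examples["text"], padding="max_length", truncation=True)

# batched=True passes many rows per call (fast); batched=False
# would call tokenize_function once per row.
tokenized = dataset.map(tokenize_function, batched=True)
```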