LLM Concepts Flashcards
(49 cards)
What is a Large Language Model (LLM)?
A deep learning model trained on large corpora of text to understand and generate human language.
What architecture do most LLMs use?
The transformer architecture.
What is the transformer model?
A neural network architecture that uses self-attention to process all positions of a sequence in parallel rather than step by step.
What is self-attention?
A mechanism by which each token in a sequence learns how much to attend to every other token when computing its own representation.
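To make this concrete, here is a minimal single-head scaled dot-product self-attention sketch in NumPy; the projection matrices (Wq, Wk, Wv) and the toy dimensions are illustrative assumptions, not anything specified by the cards.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention (illustrative sketch)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv                 # project tokens to queries/keys/values
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # pairwise relevance of every token pair
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over each row
    return weights @ V                               # weighted sum of value vectors

# Toy usage: 4 tokens with 8-dim embeddings (made-up sizes).
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)   # shape (4, 8)
```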
What is positional encoding in transformers?
Information added to token embeddings so the model can use word order, since self-attention alone is order-invariant.
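A sketch of the sinusoidal positional encoding from the original transformer paper; the seq_len and d_model values are assumed for illustration.

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len, d_model):
    """PE[pos, 2i] = sin(pos / 10000^(2i/d_model)); PE[pos, 2i+1] uses cos."""
    pos = np.arange(seq_len)[:, None]            # (seq_len, 1)
    i = np.arange(d_model // 2)[None, :]         # (1, d_model/2)
    angles = pos / np.power(10000.0, 2 * i / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)                 # even dimensions
    pe[:, 1::2] = np.cos(angles)                 # odd dimensions
    return pe

# Added to the token embeddings before the first transformer layer.
pe = sinusoidal_positional_encoding(seq_len=16, d_model=64)
```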
What is a token?
A unit of text, often a word or subword, processed by the model.
What is tokenization?
The process of converting raw text into a sequence of tokens, usually mapped to integer IDs.
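A toy sketch only: real tokenizers (BPE, WordPiece) learn subword merges from data, whereas this hypothetical tokenize helper just maps whitespace-split words to integer IDs to show the text-to-IDs step.

```python
# Toy illustration; real subword tokenizers are far more sophisticated.
def tokenize(text, vocab):
    tokens = text.lower().split()
    return [vocab.setdefault(t, len(vocab)) for t in tokens]

vocab = {}
ids = tokenize("LLMs process text as tokens", vocab)
print(ids)    # [0, 1, 2, 3, 4]
print(vocab)  # {'llms': 0, 'process': 1, 'text': 2, 'as': 3, 'tokens': 4}
```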
What is pretraining in LLMs?
Training the model on large unlabeled text data to learn general language patterns.
What is fine-tuning?
Adapting a pretrained model to a specific task with additional labeled data.
What is masked language modeling?
A training task where some input tokens are hidden and the model must predict them.
What is causal language modeling?
A training task where the model predicts the next token given only the preceding tokens.
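A side-by-side sketch of the two objectives; the token IDs and MASK_ID are made-up values chosen purely for illustration.

```python
# Illustrative token IDs; the sequences and MASK_ID are assumptions for the sketch.
tokens = [5, 17, 42, 8, 99]
MASK_ID = 0

# Masked LM (BERT-style): hide some tokens, predict the originals.
mlm_input  = [5, MASK_ID, 42, 8, MASK_ID]
mlm_target = {1: 17, 4: 99}          # position -> token to recover

# Causal LM (GPT-style): predict each next token from its prefix.
clm_input  = tokens[:-1]             # [5, 17, 42, 8]
clm_target = tokens[1:]              # [17, 42, 8, 99]
```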
What is GPT?
Generative Pre-trained Transformer: a causal language model trained to predict the next token.
What is BERT?
Bidirectional Encoder Representations from Transformers: an encoder-only model pretrained with masked language modeling.
How is GPT different from BERT?
GPT is unidirectional and generative; BERT is bidirectional and better suited to understanding tasks such as classification.
What is zero-shot learning?
Performing a task without any task-specific examples, relying only on the instruction and what the model learned during pretraining.
What is few-shot learning?
Learning a task from only a handful of examples, often supplied directly in the prompt (in-context learning).
What is instruction tuning?
Fine-tuning an LLM on instruction-response pairs so it learns to follow natural-language instructions.
What is prompt engineering?
The craft of designing effective input prompts to guide LLM behavior.
What is a system prompt?
A prompt, set before any user input, that defines the model's role, tone, and constraints for the session.
What is a context window?
The maximum number of tokens an LLM can process at once, counting both the prompt and any generated output.
What is an attention mechanism?
A method that lets models focus on different parts of the input when making predictions.
What is temperature in text generation?
A parameter that controls randomness in sampling; higher values yield more diverse outputs, lower values more deterministic ones.
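A minimal sketch of applying temperature before the softmax, assuming NumPy; the logits values are made up.

```python
import numpy as np

def apply_temperature(logits, temperature):
    """Scale logits before softmax; T < 1 sharpens, T > 1 flattens the distribution."""
    scaled = np.asarray(logits, dtype=float) / temperature
    probs = np.exp(scaled - scaled.max())
    return probs / probs.sum()

logits = [2.0, 1.0, 0.5]
print(apply_temperature(logits, 0.5))  # peakier: mass concentrates on the top token
print(apply_temperature(logits, 2.0))  # flatter: more diverse sampling
```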
What is top-k sampling?
A decoding method that samples from the top k most likely next tokens.
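A sketch of top-k in NumPy; the hypothetical top_k_sample helper and the logits values are illustrative assumptions.

```python
import numpy as np

def top_k_sample(logits, k, rng=np.random.default_rng()):
    """Keep the k highest-logit tokens, renormalize, and sample among them."""
    logits = np.asarray(logits, dtype=float)
    top = np.argsort(logits)[-k:]                  # indices of the k most likely tokens
    probs = np.exp(logits[top] - logits[top].max())
    probs /= probs.sum()
    return rng.choice(top, p=probs)

next_token = top_k_sample([2.0, 1.0, 0.5, -1.0], k=2)  # samples from the 2 best
```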
What is top-p (nucleus) sampling?
A method that samples from the smallest set of tokens with a cumulative probability > p.
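A sketch of nucleus sampling under the same assumptions; top_p_sample is a hypothetical helper, and the cutoff finds the smallest set of most-likely tokens whose cumulative probability reaches p.

```python
import numpy as np

def top_p_sample(logits, p, rng=np.random.default_rng()):
    """Sample from the smallest set of tokens whose cumulative probability exceeds p."""
    logits = np.asarray(logits, dtype=float)
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    order = np.argsort(probs)[::-1]               # tokens from most to least likely
    cumulative = np.cumsum(probs[order])
    cutoff = np.searchsorted(cumulative, p) + 1   # smallest nucleus covering >= p
    nucleus = order[:cutoff]
    nucleus_probs = probs[nucleus] / probs[nucleus].sum()
    return rng.choice(nucleus, p=nucleus_probs)

next_token = top_p_sample([2.0, 1.0, 0.5, -1.0], p=0.9)
```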