Prompt Engineering Flashcards

(120 cards)

1
What is Access Logging?
The recording of all document access events, including who viewed, edited, or downloaded a file, for auditing and security.
2
What are Adapters in neural networks?
Lightweight neural network components inserted into a pre-trained model to enable task-specific fine-tuning without modifying the base model weights.
3
Define API Integration.
The ability to programmatically interact with a model or service using an Application Programming Interface.
4
What is an Artifact ID?
A unique identifier used to reference and retrieve generated or uploaded documents within systems like Grok.
5
Explain Audit Log / Traceability.
A system that records all interactions, prompts, and model outputs for accountability, debugging, and compliance.
6
What are Base Weights?
The original parameters learned during the pretraining of a model, prior to any fine-tuning or customization.
7
What do BLEU / ROUGE Metrics measure?
Evaluation scores that measure the quality of generated text by comparing it to reference outputs; BLEU is commonly used in translation, ROUGE in summarization.
8
What is Branching Generation?
A model generation mode where multiple parallel possibilities are explored rather than following a linear chain of logic.
9
What is the purpose of a Caching Layer in LLM deployment?
A system that stores previously computed model responses to speed up response times and reduce compute costs.
10
Define Chained Reasoning.
A multi-step logical process where the model connects individual inferences across multiple parts of the prompt.
11
What does Cloud Drive Integration refer to?
The capability of an AI service to connect with platforms like Google Drive, Dropbox, or OneDrive for document access and management.
12
What is a Command-Line Interface (CLI)?
A text-based method to interact with a model or document service through scripts or shell commands.
13
What is a Context Window?
The maximum number of tokens (word and subword units) that a model can consider at once while generating a response.
14
What is a Contextual Memory Agent?
An LLM-based system designed to retain, recall, and use session-specific information across multiple interactions.
15
What is Cross-document Linking?
The ability to reference, connect, or embed information between multiple documents in a system.
16
What is a Custom Metadata Schema?
A user-defined structure used to attach properties like author, topic, or status to documents for classification or querying.
17
What is Data Augmentation?
Techniques for expanding training datasets using modified, paraphrased, or synthesized examples to improve model robustness.
18
Define Dialogue Bias.
A setting or characteristic that influences the model’s tone, persona, or role adherence during conversation.
19
What is Document Chunking?
The process of breaking a large document into smaller, manageable segments for input into an LLM with context window constraints.
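A minimal chunking sketch, using whitespace-split words as a stand-in for model tokens (a real pipeline would count tokenizer tokens; the sizes below are illustrative):

```python
def chunk_document(text: str, chunk_size: int = 300, overlap: int = 30) -> list[str]:
    """Split text into overlapping word-count chunks that fit a context window."""
    assert overlap < chunk_size
    words = text.split()
    chunks = []
    for start in range(0, len(words), chunk_size - overlap):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the document
    return chunks

segments = chunk_document("lorem ipsum " * 500)  # a stand-in long document
print(len(segments), "chunks")  # each small enough to prompt individually
```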
20
What is Document Indexing?
The process of organizing and storing documents in a way that allows for semantic search and retrieval.
21
What does Domain Adaptation involve?
Adjusting a pre-trained model to perform better on a specific domain or type of data by fine-tuning or modifying prompts.
22
What are Embeddings?
Vector representations of words, sentences, or documents that capture semantic meaning for tasks like similarity search or clustering.
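A toy illustration of embedding-based similarity; the 3-dimensional vectors are hand-made stand-ins for real model embeddings:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; closer to 1.0 means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Hypothetical embeddings; a real system would obtain these from an embedding model.
docs = {
    "cat care guide": [0.9, 0.1, 0.0],
    "feline nutrition": [0.8, 0.2, 0.1],
    "tax law summary": [0.0, 0.1, 0.9],
}
query = [0.85, 0.15, 0.05]  # pretend embedding of "how do I feed my cat?"
best = max(docs, key=lambda name: cosine_similarity(query, docs[name]))
print(best)  # the semantically closest document wins, even with no shared keywords
```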
23
What is Emotion Bias?
A model tuning configuration that encourages responses to align with a specific emotional tone (e.g., optimistic, neutral).
24
What is an Evaluation Benchmark?
A standardized dataset and task used to compare the performance of different LLMs or configurations.
25
What does Fallback Behavior refer to?
The model's default response when encountering input outside its training distribution or intended domain.
26
What is the F1 Score?
The harmonic mean of precision and recall; a standard accuracy measure for classification tasks that balances the two.
27
Define Factual Calibration.
Adjusting model outputs toward better factual accuracy using postprocessing or model-level weighting.
28
What is File Path Abstraction?
A logical representation of file location, often used in LLM systems to simulate file directories (e.g., /mnt/data/file.txt).
29
What is a Fine-tuning Objective?
The specific goal or use case for which a model is being fine-tuned, such as classification, summarization, or translation.
30
Define Flat Namespace.
A storage scheme where all files exist at a single level, without folders or hierarchy.
31
What does Frequency Penalty do?
A decoding parameter that lowers a token's probability in proportion to how often it has already appeared in the generated text, discouraging repetition.
32
What is Function Calling / Tool Use?
The ability of a model to interact with external tools, APIs, or code execution environments to augment its responses.
33
What is Ground Truth Comparison?
The process of evaluating model outputs against known correct answers to measure performance or accuracy.
34
What is a Hierarchical File System?
A file storage structure where documents are organized in nested directories or folders.
35
What is HTML Output?
A format for displaying structured content using HyperText Markup Language; useful for formatting complex output like tables.
36
Define Human-in-the-Loop (HITL).
A workflow in which humans review or correct model outputs, often used in sensitive or critical applications.
37
What is Incremental Fine-tuning?
A refinement process where new data is continually added to a model's training to improve its performance over time.
38
What does Input Document Parsing involve?
The method by which input documents are analyzed or broken down into parts such as sections, headers, or tables.
39
What is an Instruction Prefix?
A predefined directive prepended to the prompt to shape or control model behavior (e.g., 'You are a helpful assistant...').
40
Define Instruction-tuning.
A fine-tuning method where the model is trained on pairs of instructions and desired outputs to improve alignment with user intent.
41
What is Latency in model responses?
The delay between submitting a prompt and receiving a model response; a critical performance metric in real-time applications.
42
What is a Latent Representation?
A dense internal vector state that captures the semantic features of the input data in compressed form.
43
What does Linearity refer to in model generation?
A generation pattern where the model adheres strictly to a single line of reasoning or narrative progression.
44
What is LoRA (Low-Rank Adaptation)?
A parameter-efficient fine-tuning technique that freezes the base weights and trains small low-rank matrices inserted into existing layers of a model.
45
Define Markdown Output.
A plain-text markup format that supports structured formatting, often used for documents generated by LLMs.
46
What does Max Tokens refer to?
A generation constraint capping how many tokens the model may produce in a single response; in many APIs the limit applies to output tokens, while input and output together must still fit the context window.
47
What is Memory Persistence?
The ability of an LLM system to retain session-specific or user-specific context between interactions.
48
What is Meta-prompting?
The practice of prompting a model to generate or modify other prompts, typically for optimization or automation.
49
What is Modular Prompt Orchestration?
The use of composable prompt segments or chains to control model reasoning and output structure.
50
What does Model Bias Calibration involve?
Adjusting a model’s responses to reduce or rebalance systemic biases introduced during pretraining.
51
Define Multimodal Bias.
A tuning mechanism to guide output preference toward certain input types such as images or text.
52
What does Multimodal Input/Output refer to?
Support for interacting with multiple types of media (e.g., text, images, audio) as input and/or output.
53
What is Multi-agent Collaboration?
The coordination of multiple LLM instances (agents), each with a specialized role, to complete a task collaboratively.
54
What is Nucleus Sampling (Top-p)?
A token selection strategy that samples only from the smallest set of tokens whose cumulative probability exceeds a threshold p.
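A minimal sketch of selecting the nucleus from a toy, already-normalized next-token distribution; a real decoder would then renormalize over this set and sample:

```python
def nucleus(probs: dict[str, float], p: float = 0.9) -> list[str]:
    """Smallest set of highest-probability tokens whose cumulative mass reaches p."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, total = [], 0.0
    for token, prob in ranked:
        kept.append(token)
        total += prob
        if total >= p:
            break
    return kept

next_token = {"the": 0.50, "a": 0.30, "an": 0.15, "zebra": 0.05}
print(nucleus(next_token, p=0.9))  # ['the', 'a', 'an'] -- the long tail is cut off
```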
55
What are Optional Parameters?
User-controlled or experimental settings (e.g., Emotion Bias, Logical Coherence) that adjust generation style or content bias.
56
What does Output Determinism mean?
The consistency of model output when given the same input and configuration; influenced by seed and temperature.
57
What is Overfitting?
A condition where a model performs well on its training data but poorly on new, unseen inputs due to excessive memorization.
58
Define Parameter-efficient Fine-tuning.
Techniques like LoRA or adapters that fine-tune only a subset of model parameters for faster adaptation.
59
What is a Presence Penalty?
A parameter that penalizes tokens that have already appeared in the text so far, regardless of how often, encouraging the model to introduce new words and topics.
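A rough sketch of how frequency (card 31) and presence penalties can be applied to token scores before sampling; exact formulas vary by implementation, and this follows the commonly documented subtract-from-the-logits scheme with made-up numbers:

```python
from collections import Counter

def apply_penalties(logits: dict[str, float], generated: list[str],
                    presence: float = 0.5, frequency: float = 0.3) -> dict[str, float]:
    """Penalize tokens already in the text: a flat presence hit once a token has
    appeared at all, plus a frequency hit scaled by how often it appeared."""
    counts = Counter(generated)
    return {
        tok: logit - (presence + frequency * counts[tok] if counts[tok] else 0.0)
        for tok, logit in logits.items()
    }

logits = {"cat": 2.0, "dog": 1.5, "fish": 1.0}
print(apply_penalties(logits, generated=["cat", "cat", "dog"]))
# cat: 2.0 - 0.5 - 0.6 = 0.9; dog: 1.5 - 0.5 - 0.3 = 0.7; fish is untouched
```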
60
What does Prompt Chaining involve?
The method of linking multiple prompts together so that the output of one becomes the input of the next.
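A minimal two-step chain, assuming a hypothetical generate() helper that stands in for a real model call:

```python
def generate(prompt: str) -> str:
    """Placeholder for a real model call (e.g., an HTTP request to an LLM API)."""
    return f"<model output for: {prompt[:45]}...>"

def summarize_then_translate(document: str) -> str:
    # Step 1: the first prompt produces a summary.
    summary = generate(f"Summarize this document in three sentences:\n{document}")
    # Step 2: the output of step 1 becomes the input of the next prompt.
    return generate(f"Translate this summary into French:\n{summary}")

print(summarize_then_translate("Quarterly revenue grew in every region..."))
```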
61
Define Prompt Engineering.
The practice of designing and refining prompts to elicit better or more accurate responses from a model.
62
What is a Prompt Formatting Schema?
A structured format expected by the model to guide response generation (e.g., Input: ___ | Instruction: ___).
63
What does Prompt Orchestration entail?
The coordination and management of prompt flows, structures, and dependencies in a multi-step or multi-agent system.
64
What is a Prompt Template?
A reusable prompt framework with placeholders for dynamic content, used to standardize interactions.
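A small sketch using Python's standard string.Template as the placeholder mechanism; the template text and field names are illustrative:

```python
from string import Template

REVIEW_TEMPLATE = Template(
    "You are a senior $language engineer.\n"
    "Review the code below and list up to $max_issues issues.\n\n$code"
)

prompt = REVIEW_TEMPLATE.substitute(
    language="Python",
    max_issues=3,
    code="def add(a, b): return a - b",
)
print(prompt)  # the same template standardizes every code-review request
```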
65
What is a Random Seed?
A value used to initialize the random number generator in language models so that, given identical inputs and settings, sampling is reproducible.
66
What is a Repetition Penalty?
A tuning parameter that reduces the chance of repeated phrases or tokens in generated text.
67
What is Retrieval-Augmented Generation (RAG)?
A technique where the model retrieves relevant documents before generating responses to ground its output.
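A schematic sketch of the retrieve-then-prompt pattern; retrieval here is naive keyword overlap rather than real embedding search, and the corpus is made up:

```python
import re

def words(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Rank documents by word overlap with the query; real systems use embeddings."""
    return sorted(corpus, key=lambda d: len(words(query) & words(d)), reverse=True)[:k]

def build_rag_prompt(query: str, corpus: list[str]) -> str:
    context = "\n".join(retrieve(query, corpus))
    # Retrieved passages go into the prompt so the answer is grounded in them.
    return f"Answer using only the context below.\nContext:\n{context}\n\nQuestion: {query}"

corpus = [
    "The warranty lasts 12 months.",
    "Returns are accepted within 30 days.",
    "Support is available by email.",
]
print(build_rag_prompt("How long is the warranty?", corpus))
```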
68
What does Role-weighted Prompting refer to?
Assigning different roles to sections of a prompt or to multiple agents, influencing tone and responsibility.
69
Define ROUGE Score.
A metric that measures the overlap between generated and reference summaries, often used in NLP evaluation.
70
What is Sandboxing?
Isolating an LLM or its runtime to prevent interaction with external systems or to safely test prompts and responses.
71
What is Semantic Search?
A method of retrieving documents or data based on meaning and context, often using embeddings.
72
What is a Session Artifact?
A document or output object tied to a specific user interaction session within an LLM platform.
73
What is Session Contextualization?
The use of prior turns or session memory to ground the current response in historical interaction.
74
What does Session Linking refer to?
The ability to reference or build upon interactions from earlier sessions in new conversations.
75
What are Stop Sequences?
Designated tokens or characters used to signal where the model should stop generating text.
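Hosted APIs enforce stop sequences during decoding, but the trimming logic can be sketched client-side; the ### delimiter below is illustrative:

```python
def apply_stop_sequences(text: str, stops: list[str]) -> str:
    """Cut the output at the earliest occurrence of any stop sequence."""
    cut = len(text)
    for stop in stops:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]

raw = '{"answer": 42}\n###\nUnwanted rambling after the delimiter...'
print(apply_stop_sequences(raw, stops=["###"]))  # keeps only the JSON part
```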
76
What is Structured Data Handling?
The model’s capability to accurately process and generate data formats like JSON, XML, or tabular formats.
77
What is a Summarization Objective?
A fine-tuning goal where the model is trained to reduce content while preserving key information.
78
Define Symbolic Reasoning.
Logic-based reasoning involving symbols and rules, as opposed to statistical associations.
79
What is a System Prompt?
An instruction injected at the beginning of a session or prompt sequence to define the assistant’s behavior.
80
What is a Tagging System?
A metadata scheme used to classify, label, or search documents based on their properties.
81
What does Task-Specific Agent Design involve?
Building specialized LLM agents optimized for narrow tasks like bug triage, legal review, or translation.
82
What is Temperature Sampling?
A method for controlling randomness in language generation; lower values yield more deterministic outputs.
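A minimal sketch of temperature-scaled softmax over toy logits, showing why low temperature is near-deterministic and high temperature is flatter:

```python
import math

def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Divide logits by the temperature before exponentiating and normalizing."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]
print(softmax_with_temperature(logits, 0.2))  # ~[0.99, 0.01, 0.00]: sharp
print(softmax_with_temperature(logits, 2.0))  # ~[0.50, 0.30, 0.19]: flat
```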
83
What are Throughput Constraints?
The limits on how many requests or tokens can be processed per second or per batch.
84
What is Token Biasing?
A generation control setting that promotes or suppresses specific tokens during output.
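A sketch in the spirit of the logit_bias maps some APIs expose; the token names and bias values here are made up:

```python
def bias_logits(logits: dict[str, float], bias: dict[str, float]) -> dict[str, float]:
    """Add a per-token offset before sampling; a large negative value effectively
    bans a token, and a large positive one strongly promotes it."""
    return {tok: logit + bias.get(tok, 0.0) for tok, logit in logits.items()}

logits = {"yes": 1.2, "no": 1.1, "maybe": 0.9}
biased = bias_logits(logits, bias={"maybe": -100.0})  # suppress the hedge token
print(max(biased, key=biased.get))  # 'yes'
```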
85
Define Token Billing Granularity.
The level of detail at which token usage is measured for cost calculation.
86
What is a Token Limit?
The maximum number of tokens the model can process in a single interaction (input + output).
87
What does Tool Use / Plugin Execution refer to?
The capacity of the model to invoke external services (like code execution, database queries, or calculators).
88
What is Top-k Sampling?
A sampling method where the model chooses the next token from the k most probable candidates.
89
What is Transfer Learning?
Leveraging knowledge from a model trained on one task/domain and applying it to a related one through fine-tuning.
90
What is a Vector Database?
A storage system that indexes embeddings (vector representations) for fast similarity search and retrieval.
91
What is Version Control?
A feature that allows users to track changes, retrieve prior versions, or revert documents to earlier states.
93
What is Format Anchoring?
A prompt engineering technique where the desired output format is specified within the prompt to ensure consistent and structured responses; examples include requesting JSON, Markdown, or tables.
94
Define Front-Matter Prompting.
A prompting strategy where metadata is embedded at the beginning of a document (commonly as YAML front matter) to influence model behavior or classification.
95
What is Few-shot Prompting?
Providing the model with a small number of input-output examples in the prompt to demonstrate the task and guide output formatting or logic.
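A minimal sketch of assembling a few-shot prompt; the sentiment-labeling examples are hypothetical:

```python
# Curated demonstrations; a real deployment would draw these from labeled data.
EXAMPLES = [
    ("The package arrived broken.", "negative"),
    ("Fast shipping and great quality!", "positive"),
]

def few_shot_prompt(new_review: str) -> str:
    shots = "\n".join(f"Review: {t}\nSentiment: {s}" for t, s in EXAMPLES)
    # The examples demonstrate both the task and the expected output format.
    return f"{shots}\nReview: {new_review}\nSentiment:"

print(few_shot_prompt("Not worth the price."))
```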
96
Explain Low-Temperature Precision Prompting.
Using low randomness settings (typically temperatures around 0.2 to 0.4) to generate consistent and accurate model outputs, especially useful for technical content.
97
What does Pre-injection of Examples entail?
Including one or more complete worked examples before posing a new input so the model's response follows the demonstrated pattern.
98
Define Prompt Templates.
Predefined prompt structures with variable placeholders that can be reused across contexts to standardize task instructions and improve response consistency.
99
What is Prompt Chaining?
A technique where the output of one prompt becomes the input of the next, enabling multi-step reasoning workflows and more complex interactions.
100
What does Instruction Prefixing involve?
The practice of explicitly starting prompts with an instructional clause to control tone and response structure, e.g., 'Explain this as a lawyer...'.
101
Define System Prompts.
Persistent hidden prompts used to define the model’s role, tone, or behavior across an entire session, keeping responses consistent.
102
What is Meta-Prompting?
Prompting the model to write or optimize other prompts, used in adaptive or agent-based LLM design workflows to improve subsequent prompts.
103
Define Role-weighted Prompting.
Assigning roles or personas (e.g., Critic, Engineer, Advocate) to parts of a prompt or to simulated agents in a multi-agent system.
104
What does Embedding-Aware Retrieval Prompting involve?
Supplying context retrieved via semantic search within the prompt to ground the model’s answer.
105
What is Chunk-aware Prompting?
Indicating document segmentation explicitly (e.g., labeling a section 'Part 2 of 4') to improve model understanding of document structure.
106
Define Self-reflective Prompting.
Asking the model to assess or critique its own response before finalizing it, improving the reliability or completeness of answers.
107
What is Stop Sequence Control?
The use of explicit stop sequences in prompts to limit the model’s output, particularly useful for structured formats like code or JSON.
108
Define Temperature-Aware Prompting.
Designing prompts with an understanding of how randomness affects creativity, verbosity, and stability, since different temperature settings yield different output characteristics.
109
What is Token-efficient Prompting?
Writing prompts that deliver high performance with minimal token usage, typically by compressing instructions and avoiding filler language.
110
Define Zero-shot Prompting.
Asking the model to perform a task without providing any examples, relying on the natural language instruction alone; this tests the model's ability to generalize.
111
What is Ambiguity in Prompting?
A condition where prompts contain overlapping or unclear instructions, leading to inconsistent or inaccurate model responses.
112
Define Context Overflow.
A situation where the input exceeds the model’s context window, causing earlier parts of the prompt to be truncated or ignored and important information to be lost.
113
What is Instruction Overload?
The practice of including too many directives in a single prompt, often reducing clarity and model performance; simplifying instructions usually improves output quality.
114
What does a Prompt Style Guide entail?
A standardized set of formatting rules and structures used to keep prompts consistent across an organization or project.
115
Explain Prompt Token Budgeting.
The strategy of managing the number of tokens used in prompts to optimize for performance and cost efficiency.
116
What is Relevance Filtering?
The process of selectively including only contextually important content (e.g., in RAG) to reduce noise and improve model focus.
117
Define Response Drift.
A tendency for models to produce output that gradually strays from the intended task, often due to unclear structure or poor anchoring; clear structure helps mitigate it.
118
What is Runaway Output?
When a model generates excessively long or irrelevant text due to a lack of defined stop sequences or constraints; defining stop sequences helps control output length.
119
What does Structured Prompting involve?
Organizing prompts using labeled or segmented parts (e.g., Input | Task | Output Format) to improve response reliability and help the model parse the task.
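A minimal sketch that assembles a labeled-section prompt following the Input | Task | Output Format pattern above; the section contents are illustrative:

```python
def structured_prompt(task: str, input_text: str, output_format: str) -> str:
    """Labeled sections make each part of the request unambiguous to the model."""
    return (
        f"### Task\n{task}\n\n"
        f"### Input\n{input_text}\n\n"
        f"### Output Format\n{output_format}\n"
    )

print(structured_prompt(
    task="Extract all dates mentioned in the input.",
    input_text="The contract starts on 2024-01-15 and ends on 2025-01-14.",
    output_format="A JSON array of ISO 8601 date strings.",
))
```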
120
What is Under-specified Prompting?
Prompts that are too vague or general, causing the model to guess at user intent rather than follow clear guidance; specific details improve accuracy.