Amazon Bedrock and Generative AI Flashcards

1
Q

What is Generative AI?

A

A subset of deep learning that generates new data similar to its training data.

2
Q

What types of data can Generative AI be trained on?

A

Text, images, audio, code, video, and more.

3
Q

What is a foundation model?

A

A large, general-purpose AI model trained on massive amounts of data for a variety of tasks.

4
Q

Name a few companies that create foundation models.

A

OpenAI, Meta, Amazon, Google, Anthropic.

5
Q

What is an example of an open-source foundation model?

A

Meta’s LLaMA, Google’s BERT.

6
Q

What is an LLM?

A

A Large Language Model trained to understand and generate human-like text.

7
Q

How are LLMs trained?

A

On massive text datasets such as books, websites, and articles.

8
Q

What does non-deterministic output mean in LLMs?

A

Same prompt can produce different outputs due to probabilistic word generation.

9
Q

Why is LLM output non-deterministic?

A

It selects next words based on probability distributions, not fixed rules.

10
Q

What are some tasks LLMs can perform?

A

Translation, summarization, Q&A, content generation.

11
Q

How do diffusion models generate images?

A

By reversing a process that gradually adds noise to images.

12
Q

What is forward diffusion?

A

A process where noise is added to an image over time until it’s unrecognizable.

13
Q

What is reverse diffusion?

A

The process of removing noise step-by-step to generate an image from random noise.

14
Q

What is Stable Diffusion?

A

A diffusion-based model from the company Stability AI that generates images from text prompts or other images.

15
Q

Can Gen AI generate text from images?

A

Yes, it can analyze an image and generate descriptive text or answer questions.

16
Q

What is Amazon Bedrock?

A

A fully managed AWS service to build and scale generative AI applications using various foundation models.

17
Q

Does your data leave your AWS account when using Bedrock?

A

No, all operations occur within your AWS account; data stays private.

18
Q

What is the pricing model of Amazon Bedrock?

A

Pay-per-use.

19
Q

What is meant by ‘unified API’ in Bedrock?

A

A standardized interface to access all supported foundation models, simplifying integration.

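The unified API can be sketched with Bedrock's Converse operation: the request shape stays the same across providers, and swapping models means changing only the model ID. The model ID below is illustrative; check the model catalog available in your region.

```python
def build_converse_request(model_id, prompt, max_tokens=512):
    """Request body for Bedrock's unified Converse API.

    The same shape works for any supported foundation model;
    switching providers means changing only model_id."""
    return {
        "modelId": model_id,
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
        "inferenceConfig": {"maxTokens": max_tokens},
    }

request = build_converse_request(
    "anthropic.claude-3-haiku-20240307-v1:0",  # illustrative model ID
    "Summarize what Amazon Bedrock does.",
)
# Sending the request requires boto3 and AWS credentials:
# import boto3
# client = boto3.client("bedrock-runtime")
# reply = client.converse(**request)
# print(reply["output"]["message"]["content"][0]["text"])
```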
20
Q

What companies provide models on Amazon Bedrock?

A

AI21 Labs, Cohere, Stability AI, Amazon, Anthropic, Meta, Mistral AI, and more.

21
Q

Can you fine-tune foundation models on Amazon Bedrock?

A

Yes, using your own data, within your own account.

22
Q

Does fine-tuning share your data with the model provider?

A

No, your data is never sent back to the model provider.

23
Q

What is the Amazon Bedrock Playground?

A

An interactive interface to experiment with foundation models by submitting prompts.

24
Q

What advanced features does Amazon Bedrock offer?

A

RAG (Retrieval-Augmented Generation), LLM agents, knowledge bases, security, and responsible AI features.

25
What is RAG in Amazon Bedrock?
A method to enhance model answers by retrieving relevant information from external data sources.
26
What is a knowledge base in Bedrock?
An external data store connected to Bedrock to provide domain-specific context for more accurate responses.
27
How does Amazon Bedrock support application integration?
Through a single unified API, making it easy to interact with different models programmatically.
28
Can you use Bedrock to build a chatbot?
Yes, using LLMs and additional tools like knowledge bases and RAG to create intelligent conversational agents.
29
What factors should you consider when selecting a foundation model on Amazon Bedrock?
Model type, performance, customization options, inference capabilities, licensing, context window, latency, modality support, compliance, and cost.
30
What is a multimodal foundation model?
A model that can accept and produce multiple types of data, such as text, audio, image, and video.
31
What is Amazon Titan?
A high-performing foundation model developed by AWS with support for text and image generation, available via Amazon Bedrock.
32
Can Amazon Titan be customized with your own data?
Yes, it supports fine-tuning using your own data within your AWS account.
33
What is the trade-off between smaller and larger models?
Smaller models are more cost-effective but have limited knowledge; larger models are more capable but expensive.
34
What is Llama-2 and who created it?
A foundation model created by Meta, focused on English text generation and large-scale tasks.
35
What is Claude and who developed it?
A foundation model developed by Anthropic, known for its large context window and strong document analysis capabilities.
36
What is Stability AI known for on Bedrock?
Image generation using the Stable Diffusion model, useful for advertising and media content.
37
Why might a larger context window be useful?
It allows you to input large documents, code bases, or books, enabling the model to reason over more content.
38
What are use cases for Amazon Titan?
Content creation, classification, and educational applications.
39
What are use cases for Claude?
Analysis, forecasting, and document comparison due to its large context window.
40
What are use cases for Stability AI?
Image generation for advertising, media, and creative projects.
41
How does pricing affect foundation model choice?
More capable models may be more expensive; choosing a model that balances cost and performance is crucial.
42
How is pricing typically measured on Amazon Bedrock?
By the number of tokens processed (e.g., cost per 1,000 tokens).
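Token-based pricing can be estimated with simple arithmetic; the per-1,000-token rates below are made-up placeholders, not actual Bedrock prices.

```python
def estimate_cost(input_tokens, output_tokens,
                  in_price_per_1k, out_price_per_1k):
    # On-demand text models bill input and output tokens separately.
    return (input_tokens / 1000) * in_price_per_1k \
         + (output_tokens / 1000) * out_price_per_1k

# Illustrative prices only -- check the current Bedrock pricing page.
cost = estimate_cost(2000, 500, in_price_per_1k=0.003, out_price_per_1k=0.015)
print(f"${cost:.4f}")  # $0.0135
```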
43
What is a potential risk when using foundation models with pay-per-use pricing?
Costs can escalate quickly if usage isn't carefully monitored.
44
What is fine-tuning in Amazon Bedrock?
Adapting a copy of a foundation model by training it with your own data to improve performance on domain-specific tasks.
45
Where must training data be stored for fine-tuning in Amazon Bedrock?
In Amazon S3.
46
Does fine-tuning change the foundation model itself?
It updates the weights of your private copy of the model based on your data; the provider's base model itself is unchanged.
47
What pricing model must you use for a fine-tuned model on Amazon Bedrock?
Provisioned throughput.
48
Are all models on Amazon Bedrock fine-tunable?
No, only some models, typically open-source ones, support fine-tuning.
49
What is instruction-based fine-tuning?
Fine-tuning using labeled data with prompt-response pairs to improve performance on specific tasks.
50
What kind of data is used for instruction-based fine-tuning?
Labeled data with prompt-response pairs.
51
What is continued pre-training in Bedrock?
Fine-tuning using unlabeled data to adapt a foundation model to a specific domain.
52
What is another name for continued pre-training?
Domain-adaptation fine-tuning.
53
When should you use continued pre-training?
When you have large amounts of unlabeled domain-specific data.
54
What is an example use case of continued pre-training?
Feeding the entire AWS documentation to make the model an AWS expert.
55
What are single-turn and multi-turn messaging in fine-tuning?
Fine-tuning approaches that teach a model how to handle one-turn or conversational multi-turn chat interactions.
56
What roles are defined in multi-turn messaging format?
System (optional context), User, and Assistant.
57
Which fine-tuning method is cheaper: instruction-based or continued pre-training?
Instruction-based fine-tuning is generally cheaper and uses less data.
58
What does continued pre-training require?
A large amount of unlabeled data and more computation, thus higher cost.
59
What is transfer learning?
Using a pre-trained model and adapting it to a new but related task—fine-tuning is a form of transfer learning.
60
What is a practical use case for transfer learning in image classification?
Using a pre-trained model for edge detection and adapting it to classify a specific kind of image.
61
What’s the difference between transfer learning and fine-tuning?
Fine-tuning is a specific application of transfer learning tailored to refining model behavior with new data.
62
When is fine-tuning a good idea?
When you need a custom tone/persona, work with proprietary data, or aim to improve accuracy for specific tasks.
63
What kind of data would trigger instruction-based fine-tuning?
Labeled data with prompt-response examples.
64
What kind of data would trigger continued pre-training?
Unlabeled data, such as raw domain-specific documentation.
65
Why is provisioned throughput more expensive?
It provides dedicated infrastructure for consistent performance with fine-tuned models.
66
What type of expert might be needed for fine-tuning a model?
A machine learning engineer, though Bedrock simplifies the process.
67
What is Automatic Evaluation in Amazon Bedrock?
A feature for evaluating a model's quality: you submit tasks from benchmark datasets, and its performance is scored automatically by judge models.
68
What are the built-in task types available for automatic evaluation in Bedrock?
Text summarization, question and answer, text classification, and open-ended text generation.
69
What are benchmark questions and answers used for?
They help test the model by comparing its generated answers to ideal (benchmark) answers to assess accuracy.
70
What is the purpose of a judge model in automatic evaluation?
The judge model compares the model-generated answer to the benchmark answer and assigns a score based on similarity.
71
Can you bring your own benchmark dataset in Amazon Bedrock?
Yes, you can use your own or a curated dataset from AWS.
72
What are the benefits of using benchmark datasets?
They help measure accuracy, speed, scalability, and detect bias in the model.
73
What is the difference between automatic and human evaluation?
Automatic uses judge models and metrics, while human evaluation involves people scoring the outputs based on criteria like relevance or correctness.
74
What kind of metrics are used in human evaluation?
Thumbs up/down, ranking, and other grading scales.
75
What does ROUGE stand for?
Recall-Oriented Understudy for Gisting Evaluation.
76
What is ROUGE used for?
Evaluating summarization and machine translation by comparing n-grams in reference and generated text.
77
What is ROUGE-N?
A ROUGE metric measuring how many n-grams (e.g., 1-gram, 2-gram) match between reference and generated texts.
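ROUGE-N recall can be sketched in a few lines. This is a simplified version that does not clip repeated n-gram counts the way the full metric does:

```python
def rouge_n(reference, candidate, n=1):
    """ROUGE-N recall: fraction of reference n-grams found in the candidate."""
    def ngrams(text, n):
        tokens = text.lower().split()
        return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

    ref, cand = ngrams(reference, n), ngrams(candidate, n)
    if not ref:
        return 0.0
    matches = sum(1 for gram in ref if gram in cand)
    return matches / len(ref)

score = rouge_n("the cat sat on the mat", "the cat lay on the mat", n=1)
print(score)  # 5 of the 6 reference unigrams appear in the candidate
```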
78
What is ROUGE-L?
It computes the longest common subsequence between the reference and generated text.
79
What does BLEU stand for?
Bilingual Evaluation Understudy.
80
What is BLEU used for?
Evaluating the quality of translated text, focusing on precision and penalizing brevity.
81
What does BERTScore evaluate?
Semantic similarity between texts using embeddings and cosine similarity.
82
Why is BERTScore better than ROUGE or BLEU for nuanced text?
Because it compares meanings using embeddings rather than just word overlap.
83
What is perplexity in the context of language models?
A measure of how well the model predicts the next token; lower is better.
84
What does low perplexity indicate?
That the model is confident and accurate in predicting the next token.
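Perplexity can be computed from the probabilities the model assigned to each actual next token, as a short sketch:

```python
import math

def perplexity(token_probs):
    """Perplexity from the model's probability for each observed next token.

    Lower values mean the model was, on average, more confident."""
    avg_log_prob = sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(-avg_log_prob)

confident = perplexity([0.9, 0.8, 0.95])   # close to 1.0 -> good
uncertain = perplexity([0.1, 0.2, 0.05])   # much higher -> poor
print(confident, uncertain)
```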
85
What can be done with evaluation metrics in a feedback loop?
They can be used to retrain and improve model outputs over time.
86
Name some business metrics to evaluate a foundation model.
User satisfaction, average revenue per user, cross-domain performance, conversion rates, efficiency.
87
Why would you create a custom benchmark dataset?
To evaluate the model using criteria specific to your business needs.
88
What does RAG stand for in generative AI?
Retrieval Augmented Generation
89
What is the core idea behind RAG?
It allows a foundation model to reference external data sources without fine-tuning.
90
What AWS service is used to manage the knowledge base in a RAG system?
Amazon Bedrock
91
What storage service is commonly used as the data source for the knowledge base in AWS Bedrock?
Amazon S3
92
What type of database underlies a knowledge base in a RAG system?
Vector database
93
What does a vector database store in the context of RAG?
Vector embeddings of chunks of data for semantic search
94
What are embeddings in the context of RAG?
Numerical representations of text used to measure similarity
95
What happens to a user's query in RAG before being sent to the foundation model?
It is augmented with retrieved information from the knowledge base: the query is first used to search the vector database for related chunks, then "original query + retrieved text" is passed to the foundation model, which generates the final output.
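The retrieve-then-augment step can be sketched as a plain function; `retrieve` stands in for the vector-database similarity search and is stubbed here for illustration.

```python
def augment_query(query, retrieve):
    """RAG sketch: retrieve related chunks, then build
    'original query + retrieved text' for the foundation model."""
    chunks = retrieve(query)
    context = "\n".join(chunks)
    return (f"Use the following context to answer.\n\n"
            f"Context:\n{context}\n\nQuestion: {query}")

# Stub retriever -- a real system queries the vector database instead.
fake_retrieve = lambda q: ["Bedrock is a managed AWS service.",
                           "It exposes foundation models via one API."]
prompt = augment_query("What is Bedrock?", fake_retrieve)
print(prompt)
```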
96
Name two AWS services that can be used as vector databases for RAG.
Amazon OpenSearch Service, Amazon Aurora
97
Name three third-party vector databases supported by AWS Bedrock.
MongoDB, Redis, Pinecone
98
What happens if no vector database is specified in AWS Bedrock?
AWS automatically creates a serverless OpenSearch vector database
99
Which two models can be used for embeddings in AWS Bedrock?
Amazon Titan, Cohere
100
Can the embeddings model and foundation model be different in AWS Bedrock?
Yes
101
What is the purpose of chunking documents in RAG?
To split them into smaller parts for vector embedding and search
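A minimal fixed-size chunker with overlap shows the idea; real implementations often split on sentence or token boundaries instead of raw characters.

```python
def chunk_text(text, chunk_size=300, overlap=50):
    """Split text into overlapping chunks before embedding.

    Overlap keeps sentences that straddle a boundary searchable
    from both neighbouring chunks."""
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

chunks = chunk_text("x" * 1000, chunk_size=300, overlap=50)
print(len(chunks))  # 4
```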
102
What kind of performance does Amazon OpenSearch offer for RAG?
Real-time similarity search with scalable index management and KNN support
103
What is Amazon DocumentDB best known for in RAG use cases?
MongoDB-compatible NoSQL storage with support for real-time vector similarity search.
104
Which two relational databases are supported for vector storage in AWS Bedrock?
Amazon Aurora, Amazon RDS for PostgreSQL
105
What AWS service should you choose for graph-based data in a RAG system?
Amazon Neptune
106
What are the common data sources for AWS Bedrock knowledge bases?
Amazon S3, Confluence, SharePoint, Salesforce, Webpages
107
Give one use case for RAG in customer support.
Building a chatbot that retrieves answers from product documentation and FAQs
108
Give one use case for RAG in legal research.
Chatbot answering legal queries based on case law, regulations, and legal opinions
109
Give one use case for RAG in healthcare.
AI assistant answering medical questions based on treatments and research papers
110
What is tokenization in Gen AI?
It is the process of converting raw text into a sequence of tokens.
111
What are the two main types of tokenization?
Word-based and subword-based tokenization.
112
Why is subword tokenization useful?
It helps handle long or rare words by breaking them into common sub-parts, reducing the number of unique tokens.
113
What is a token in the context of LLMs?
A token is a unit of text such as a word, part of a word, or even punctuation, each mapped to a unique ID.
114
Why are tokens mapped to IDs?
It's easier and more efficient for models to work with numeric IDs than raw text.
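A toy greedy subword tokenizer illustrates both ideas at once: splitting a rare word into common sub-parts and mapping each part to a numeric ID. The five-entry vocabulary is invented for illustration; real tokenizers learn tens of thousands of entries.

```python
vocab = {"un": 0, "break": 1, "able": 2, "the": 3, "cup": 4}

def tokenize(word):
    """Greedy subword split: take the longest vocab entry at each position."""
    ids, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):
            piece = word[i:j]
            if piece in vocab:
                ids.append(vocab[piece])
                i = j
                break
        else:
            raise ValueError(f"no token for {word[i:]!r}")
    return ids

print(tokenize("unbreakable"))  # [0, 1, 2]
```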
115
What is a context window in LLMs?
It is the number of tokens a model can consider at one time for understanding and generating text.
116
Why is a larger context window important?
It allows the model to handle more information, increasing coherence and performance.
117
Which model supports the largest context window currently in research?
Google Gemini 1.5 Pro – up to 10 million tokens.
118
What are embeddings in Gen AI?
They are vectors (arrays of numbers) that represent the meaning and context of tokens, words, images, or audio.
119
What is the use of embedding vectors?
They allow models to understand semantic meaning and similarity between inputs.
120
How are embeddings created?
Text is tokenized, then each token is passed through an embeddings model to generate a numerical vector.
121
What kind of database stores embeddings?
Vector databases store and index embeddings for similarity search.
122
Why are high-dimensional vectors used in embeddings?
They can capture multiple features like meaning, sentiment, and syntax of the token.
123
What is dimensionality reduction used for in embeddings?
To visualize high-dimensional embeddings in 2D or 3D for interpretation.
124
How can you visualize semantic similarity between tokens?
By plotting reduced-dimension vectors or comparing color-coded embedding patterns.
125
How do similar words appear in embedding space?
They are close together in the vector space, allowing similarity search.
126
What is a key benefit of using embeddings in search?
They enable semantic search by finding contextually related results, not just keyword matches.
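Closeness in embedding space is typically measured with cosine similarity. The 3-dimensional vectors below are toys; real embeddings have hundreds of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Similarity between two embedding vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy embeddings: related words point in similar directions.
king, queen, banana = [0.9, 0.8, 0.1], [0.85, 0.82, 0.12], [0.1, 0.05, 0.95]
print(cosine_similarity(king, queen) > cosine_similarity(king, banana))  # True
```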
127
What are Guardrails in Amazon Bedrock?
They are controls that help manage interactions between users and Foundation Models by filtering undesirable or harmful content.
128
What can Guardrails block?
They can block specific topics (e.g., food recipes), PII, or other harmful content.
129
What is an example of Guardrails in action?
If a Guardrail is set to block food recipes and the user asks for a cooking suggestion, the model will respond that it's a restricted topic.
130
How do Guardrails help with privacy?
They can automatically detect and remove personally identifiable information (PII).
131
What other benefit do Guardrails provide aside from filtering?
They help reduce hallucinations by ensuring answers are factual and controlled.
132
Can you create multiple Guardrails in Amazon Bedrock?
Yes, you can configure multiple Guardrails and apply different levels of control.
133
How can Guardrails be monitored?
You can analyze user inputs that violate Guardrails to refine and improve their configuration.
134
Why are Guardrails important in Gen AI applications?
They enhance safety, privacy, and compliance by restricting harmful or unwanted content.
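Applying a Guardrail amounts to adding a `guardrailConfig` to the Converse request; the identifier and version below are placeholders for a Guardrail created in your own account.

```python
# Sketch of attaching a Guardrail to a Converse request.
# guardrailIdentifier/guardrailVersion are placeholders, not real values.
request = {
    "modelId": "anthropic.claude-3-haiku-20240307-v1:0",
    "messages": [{"role": "user", "content": [{"text": "Give me a recipe"}]}],
    "guardrailConfig": {
        "guardrailIdentifier": "my-guardrail-id",
        "guardrailVersion": "1",
    },
}
# Sending it requires boto3 and AWS credentials:
# import boto3
# boto3.client("bedrock-runtime").converse(**request)
```

With the example Guardrail from card 129 attached, the recipe request above would come back as a blocked-topic response instead of a recipe.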
135
What is an Amazon Bedrock Agent?
It is an intelligent component that can perform multi-step tasks, integrate with systems, and act beyond just answering questions.
136
What are action groups in Bedrock Agents?
They define specific tasks or API endpoints the agent can interact with to perform operations.
137
How do agents interact with external systems?
They use action groups configured via APIs or AWS Lambda functions to execute tasks.
138
What is an example use case for a Bedrock Agent?
Accessing customer purchase history, recommending products, and placing new orders.
139
What schema defines the APIs used by agents?
OpenAPI schema.
140
Can agents access knowledge bases?
Yes, they can retrieve relevant information from defined knowledge bases for context-aware responses.
141
What is 'chain of thought' in Bedrock Agents?
It is the process where the agent plans a step-by-step execution strategy using a GenAI model.
142
How does the agent decide what steps to take?
It uses prompt, conversation history, available actions, and knowledge base context to plan tasks.
143
What AWS feature allows agents to run code without managing infrastructure?
AWS Lambda.
144
How can developers debug an agent's behavior in Bedrock?
By using the tracing feature, which shows the sequence of actions taken by the agent.
145
How does Amazon Bedrock integrate with CloudWatch?
Through model invocation logging and publishing metrics to CloudWatch.
146
What is model invocation logging in Bedrock?
It records all inputs and outputs of model calls, including text, images, and embeddings, to CloudWatch Logs or S3.
147
What service allows you to analyze Bedrock logs in real-time?
CloudWatch Logs Insights.
148
What is the benefit of model invocation logging?
Provides a full history and enables monitoring, debugging, and auditing of Bedrock usage.
149
What is the 'content filtered count' metric used for?
Tracks how often content is filtered by guardrails in Bedrock.
150
Can CloudWatch create alerts for Bedrock activity?
Yes, using CloudWatch Alarms based on published metrics.
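An alarm on Bedrock activity can be sketched with CloudWatch's `put_metric_alarm`; the metric name and threshold here are assumptions to adapt to the metrics your account actually publishes.

```python
# Illustrative alarm parameters -- MetricName and Threshold are assumptions;
# check the Bedrock metrics visible in your CloudWatch console.
alarm_params = {
    "AlarmName": "bedrock-high-invocations",
    "Namespace": "AWS/Bedrock",
    "MetricName": "Invocations",
    "Statistic": "Sum",
    "Period": 300,
    "EvaluationPeriods": 1,
    "Threshold": 1000,
    "ComparisonOperator": "GreaterThanThreshold",
}
# Creating the alarm requires boto3 and AWS credentials:
# import boto3
# boto3.client("cloudwatch").put_metric_alarm(**alarm_params)
```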
151
What are the three main pricing modes in Amazon Bedrock?
On-Demand, Batch Mode, and Provisioned Throughput.
152
How are you charged in On-Demand mode?
Based on input and output tokens for text models, input tokens for embeddings, and per image for image models.
153
What is the benefit of Batch Mode pricing in Amazon Bedrock?
Up to 50% discount compared to On-Demand, with responses delivered later.
154
What is Provisioned Throughput used for?
To reserve guaranteed throughput for base, fine-tuned, or custom models.
155
When is Provisioned Throughput mandatory?
When using fine-tuned, custom, or imported models.
156
What is the cost of using prompt engineering?
Very low or free, as it doesn’t require training or compute resources.
157
What does RAG cost depend on?
External infrastructure like vector databases—not model retraining.
158
What is the main driver of cost in Amazon Bedrock?
The number of input and output tokens processed.
159
Does changing temperature, Top K, or Top P affect pricing?
No, they do not impact the cost.