AWS cert2 Flashcards

(93 cards)

2
Q

Amazon SageMaker

A

A fully managed service that data scientists and developers use to prepare, build, train, and deploy ML models.

3
Q

Amazon Bedrock

A

A fully managed service that makes FMs from Amazon and leading AI companies available through an API. Amazon Bedrock has a broad set of capabilities to quickly build and scale generative AI applications with security, privacy, and responsible AI. You can also privately customize FMs with your own data and seamlessly integrate and deploy them into your apps using AWS tools and capabilities.
Bedrock's RAG implementation is called Knowledge Bases.

4
Q

SageMaker Data Wrangler

A

SageMaker Data Wrangler is a data preparation, transformation, and feature engineering tool. Use it to aggregate and prepare data for ML: data selection, cleaning, exploration, visualization, and processing. It has SQL support for data queries, and a Data Quality tool to analyze the quality of the data. You can also use Data Wrangler to balance your data when classes are imbalanced; it offers three balancing operators: random undersampling, random oversampling, and Synthetic Minority Oversampling Technique (SMOTE).
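As a rough plain-Python sketch of what the random oversampling operator does conceptually (this is not Data Wrangler's implementation; the function name and toy data are invented for illustration):

```python
import random
from collections import Counter

def random_oversample(rows, label_key="label", seed=0):
    """Duplicate random minority-class rows (sampling with replacement)
    until every class matches the majority-class count."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label_key], []).append(row)
    target = max(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(group)
        balanced.extend(rng.choices(group, k=target - len(group)))
    return balanced

# 2 "fraud" rows vs. 8 "ok" rows -> oversample "fraud" up to 8.
data = [{"label": "fraud"}] * 2 + [{"label": "ok"}] * 8
counts = Counter(r["label"] for r in random_oversample(data))
# counts now has 8 of each class
```

SMOTE differs in that it synthesizes new minority-class points by interpolating between neighbors rather than duplicating rows.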

5
Q

Amazon SageMaker Model Cards

A

SageMaker Model Cards are a feature of SageMaker that you can use to record information about ML models, such as training details, risk rating, evaluation metrics, model performance, considerations, and recommendations. Part of the SageMaker Model Registry, Model Cards document critical details about your machine learning (ML) models in a single place for streamlined governance and reporting.

6
Q

SageMaker Canvas

A

You can use SageMaker Canvas to build ML models without needing to write any code. SageMaker Canvas does not have any models that can perform content moderation of creative content types.

7
Q

Amazon SageMaker Ground Truth

A

SageMaker Ground Truth is a data labeling service that uses a human workforce to create accurate labels for data that you can use to train models. It can leverage Amazon Mechanical Turk, a crowdsourcing service that provides access to a large pool of affordable labor spread across the globe. SageMaker Ground Truth does not store information about model training and performance for audit purposes.

8
Q

Amazon SageMaker Model Monitor

A

SageMaker Model Monitor establishes an automated alert system that alerts when there are variations in the model’s quality, such as data drift and anomalies. You can use SageMaker Model Monitor to monitor deployed models for performance issues, data drift, and operational inconsistencies. You would primarily use SageMaker Model Monitor to ensure that the model’s performance remains stable over time.

9
Q

SageMaker Studio

A

SageMaker Studio offers a suite of integrated development environments (IDEs), including JupyterLab, RStudio, and Visual Studio Code - Open Source (Code-OSS).

10
Q

Guardrails for Amazon Bedrock

A

Amazon Bedrock Guardrails evaluates user inputs and FM responses against use-case-specific policies, and provides an additional layer of safeguards (e.g., block undesirable content, detect and prevent hallucinations, redact sensitive information and PII).

11
Q

Amazon Rekognition

A

Amazon Rekognition is a fully managed AI service for image and video analysis. You can use Amazon Rekognition to identify inappropriate content in images, including drawings, paintings, and animations, so it can help with performing content moderation of creative content types. It can also detect custom objects, such as brand logos, using automated machine learning (AutoML) to train your models with as few as 10 images.

12
Q

Vector Database

A

A vector database is a collection of data stored as mathematical representations. Vector databases store structured and unstructured data, such as text or images, together with their vector embeddings. Vector embeddings are a way to convert words, sentences, and other data into numbers.
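The core idea can be sketched as a brute-force similarity search, assuming cosine similarity as the distance measure (real vector databases use approximate indexes such as HNSW to scale; the toy three-dimensional vectors here are invented for illustration):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def search(index, query, k=2):
    """Brute-force: score every stored vector, return the k best names."""
    ranked = sorted(index, key=lambda name: cosine_similarity(index[name], query), reverse=True)
    return ranked[:k]

index = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.0],
    "car": [0.0, 0.1, 0.9],
}
print(search(index, [0.85, 0.15, 0.0], k=2))  # → ['cat', 'dog']
```

Note that "cat" and "dog" score close together while "car" does not, which is exactly the "similar meaning, close vectors" property that RAG and semantic search rely on.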

13
Q

Amazon DocumentDB

A

Amazon DocumentDB is a fully managed, native JSON document database. Amazon DocumentDB supports vector search. You can use vector search to store, index, and search millions of vectors with millisecond response times.

14
Q

Amazon OpenSearch Service

A

OpenSearch Service is a fully managed service that you can use to deploy, scale, and operate OpenSearch on AWS. You can use OpenSearch Service vector database capabilities for many purposes. For example, you can implement semantic search, retrieval augmented generation (RAG) with large language models (LLMs), recommendation engines, and multimedia searches. OpenSearch Service can also scale to store millions of embeddings and can support high query throughput.

15
Q

Amazon SageMaker Clarify

A

SageMaker Clarify helps identify potential bias in machine learning models and datasets without the need for extensive coding. SageMaker Clarify is a feature of SageMaker that helps you explain how a model makes predictions and whether datasets or models reflect bias. SageMaker Clarify also includes a library to evaluate FM performance. The foundation model evaluation (FMEval) library includes tools to compare FM quality and responsibility metrics, including bias and toxicity scores. FMEval can use built-in test datasets, or you can provide a test dataset that is specific to your use case.

16
Q

SageMaker Clarify (continued)

A

It can detect biases in training data and model predictions. You can use SageMaker Clarify to provide insights into model decisions. Therefore, SageMaker Clarify is a suitable solution for developing responsible and fair AI systems.

17
Q

SageMaker JumpStart

A

Amazon SageMaker JumpStart is a machine learning hub with open-source and proprietary foundation models, built-in algorithms, and prebuilt ML solutions that you can deploy with a few clicks.

18
Q

SageMaker HyperPod

A

SageMaker HyperPod reduces the time to train foundation models by up to 40% and scales efficiently across more than a thousand AI accelerators by distributing and parallelizing your training workload across many accelerators.

19
Q

SageMaker Model Registry

A

SageMaker Model Registry is a fully managed catalog for ML models. You can use SageMaker Model Registry to manage model versions, associate metadata with models, and manage model approval status.

20
Q

SageMaker Model Dashboard

A

Amazon SageMaker Model Dashboard is a centralized portal, accessible from the SageMaker console, where you can view, search, and explore all of the models in your account.

21
Q

Vector

A

A vector is an ordered list of numbers that represent features or attributes of some entity or concept.
In the context of generative AI, vectors might represent words, phrases, sentences, or other units.

22
Q

Embeddings

A

Embeddings are vector representations of content that capture semantic relationships. Content with similar meaning has vector representations that are close together.

23
Q

Amazon Comprehend

A

Amazon Comprehend uses natural language processing (NLP) to extract insights about the content of documents. It develops insights by recognizing the entities, key phrases, language, sentiments, and other common elements in a document.

24
Q

Amazon Textract

A

You can use Amazon Textract to extract text from documents, including handwritten text.

25
Amazon Kendra
Amazon Kendra is an intelligent search service that provides answers to questions based on the data that is provided. Amazon Kendra uses semantic and contextual understanding to provide specific answers.
26
Amazon Q Business
Amazon Q Business is a generative AI virtual assistant that can answer questions, summarize content, generate content, and complete tasks based on the data that is provided. Amazon Q Business does not provide access to FMs.
27
Retrieval Augmented Generation(RAG)
Retrieval-Augmented Generation (RAG) is the process of optimizing the output of a large language model so that it references an authoritative knowledge base outside of its training data (external sources) before generating a response.
28
Fine-Tuning
Fine-tuning refers to the process of taking a pre-trained language model and further training it on a specific task or domain-specific dataset. It requires a labeled dataset. Benefits of fine-tuning: increase specificity, improve accuracy, reduce bias, boost efficiency.
29
Continued Pre-Training
Continue pre-training the FM using unlabeled data, such as industry-specific and domain-specific unlabeled data. It is costly and requires expertise.
30
Instruction based fine-tuning
Instruction-based fine-tuning improves the performance of a pre-trained foundation model (FM) on domain-specific tasks. Instruction-based fine-tuning uses labeled examples that are formatted as prompt-response pairs and that are phrased as instructions.
31
Domain adaptation fine-tuning
This approach uses a pre-trained model to improve its performance only in a specific domain. This method is more about making the model knowledgeable in a specific domain rather than improving its ability to manage complex conversational tasks or adapt to individual user preferences.
32
Transfer learning
A method where a model developed for one task is reused as the starting point for a model on a second task. Suitable for solving natural language processing problems.
33
PEFT
PEFT (Parameter-Efficient Fine-Tuning) refers to techniques that aim to fine-tune large language models efficiently, without having to update all of the model's parameters. This helps reduce the amount of task-specific data and computational resources required.
34
LoRA
LoRA (Low-Rank Adaptation), a PEFT method, adds a low-rank adaptation module to the model, which can be fine-tuned while keeping the original model parameters frozen.
35
Prefix Tuning
Learns task-specific prefix vectors that are prepended to the input, without modifying the original model.
36
Chain of Thought
Chain of thought is a prompt engineering technique that breaks down a complex question into smaller parts. A recommended technique when you have arithmetic and logical tasks that require reasoning.
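As a minimal sketch, one way to apply the technique is to build the step-by-step instruction directly into the prompt (the wording below is illustrative, not an AWS-prescribed template):

```python
def chain_of_thought_prompt(question):
    """Wrap a question with an instruction that elicits step-by-step
    reasoning before the final answer."""
    return (
        "Answer the question below. Think step by step: "
        "break the problem into smaller parts, solve each part, "
        "then state the final answer.\n\n"
        f"Question: {question}\nReasoning:"
    )

prompt = chain_of_thought_prompt(
    "If a train travels 60 km in 45 minutes, what is its speed in km/h?"
)
```

The model then fills in the reasoning steps before the answer, which tends to improve accuracy on arithmetic and logical tasks.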
37
RLHF
Reinforcement learning from human feedback (RLHF) is a technique for fine-tuning large language models to align them with human values and preferences: human feedback is collected on model outputs, and reinforcement learning is then used to reward model behaviors that match those preferences. The goal of RLHF is to train models to behave in ways that are more consistent with human values, beyond just optimizing for task performance.
38
Common prompt injection attacks
Ignoring the prompt template (asked in the sample exam). This general attack consists of a request to ignore the model's given instructions. For example, if a prompt template specifies that an LLM should answer questions only about the weather, a user might ask the model to ignore that instruction and to provide information on a harmful topic. Other attacks include prompted persona switches, fake completion (guiding the LLM to disobedience), and rephrasing or obfuscating common attacks. For the full list, see https://docs.aws.amazon.com/prescriptive-guidance/latest/llm-prompt-engineering-best-practices/common-attacks.html
39
Bias
Unfair prejudice or preference that favors or disfavors a person or group.
40
Fairness
Impartial and just treatment without discrimination
41
Overfitting
When a model performs well on training data but fails to generalize to new data; Low bias and high variance: Low bias indicates that the model is not making erroneous assumptions about the training data. High variance indicates that the model is paying attention to noise in the training data and is overfitting.
42
Underfitting
When a model is too simple to capture the underlying patterns in the data. High bias and low variance: High bias indicates that the model is making erroneous assumptions about the training data. Low variance indicates that the model is not paying attention to noise in the training data, which will lead to underfitting.
43
Low bias and low variance (Ideal)
Low bias indicates that the model is not making erroneous assumptions about the training data. Prevents underfitting. Low variance indicates that the model is not paying attention to noise in the training data. Prevents overfitting. Low bias and low variance is an ideal outcome for model training as it does not result in model overfitting / underfitting.
44
Explainability
The ability to understand how a model arrives at a prediction and to explain its behavior in human terms. SageMaker Clarify uses methods such as SHAP and partial dependence plots (PDP). Answers WHY a model made a specific decision.
45
Interpretability
The ability to explain and understand the internal decision-making process of a machine learning model = "The What". Helps users understand how a model combines features to make predictions
46
Generative AI model
The key benefit of generative AI models is their ability to produce novel, human-like outputs based on the data they are trained on. This makes them highly versatile and applicable across a wide range of domains and use cases.
47
ROUGE
Quality of generated summaries or translations compared to reference texts
48
BLEU (Model Evaluation)
Similarity between generated text translations and reference translations
49
BERTScore (Model Evaluation)
Evaluates the semantic similarity between generated text and reference text, comparing the "meaning" of the texts rather than exact word overlap.
50
Top K
Top-K is a parameter used in language models to limit the selection of tokens to the K most probable options during text generation, controlling the balance between diversity and predictability in the output.
51
Top P
Top P is a setting that controls the diversity of generated text by limiting the set of words the model can choose from based on their probabilities. Top P is set on a scale of 0-1. With a low Top P (such as 0.25), the model considers only the words that make up the top 25% of the total probability distribution; this helps keep the output focused and coherent because the model is limited to the most probable words given the context. With a high Top P (such as 0.99), the model considers a broad range of possible next words, because it includes words that make up the top 99% of the total probability distribution; this can lead to more diverse and creative outputs because the model has a wider pool of words to choose from.
52
Temperature
Controls the randomness or diversity of the generated outputs. A higher temperature value increases the probability of sampling from less likely or lower-probability output tokens, resulting in a more diverse and unpredictable response. A lower temperature value favors the most probable outputs, leading to more deterministic and repetitive responses.
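The three sampling knobs above (Top K, Top P, temperature) can be sketched over a toy token distribution; this is a simplified illustration of the sampling machinery, not any specific model's implementation:

```python
import math

def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; a higher temperature flattens them."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_filter(probs, k):
    """Keep only the k most probable tokens, renormalized."""
    keep = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}

def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative probability >= p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cum = [], 0.0
    for i in order:
        keep.append(i)
        cum += probs[i]
        if cum >= p:
            break
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}

logits = [2.0, 1.0, 0.5, -1.0]           # one logit per candidate token
probs = softmax(logits)                  # ≈ [0.61, 0.22, 0.14, 0.03]
flat = softmax(logits, temperature=5.0)  # flatter: top token is less dominant
```

After filtering, the model samples the next token from the renormalized distribution, which is where the diversity/predictability trade-off comes from.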
53
F1
The F1 score balances precision and recall by combining them in a single metric; you can use it to evaluate classification models. F1 = 2 * P * R / (P + R), where P = precision and R = recall.
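A quick sketch of the formula, computed from true/false positive and negative counts:

```python
def f1_score(tp, fp, fn):
    """F1 = 2 * P * R / (P + R): the harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# 8 true positives, 2 false positives, 2 false negatives:
# precision = recall = 8/10 = 0.8, so F1 = 0.8
f1 = f1_score(tp=8, fp=2, fn=2)
```

Because it is a harmonic mean, F1 is dragged down by whichever of precision or recall is worse, which is why it is preferred over accuracy on imbalanced data.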
54
Accuracy
Correct predictions / all predictions: the percentage of correct predictions on a 0-1 scale. Accuracy measures the ratio of correct predictions (true positives (TP) plus true negatives (TN)) to the total number of predictions. Accuracy is not a good measure when the data has class imbalance.
55
Precision
Precision measures how well an algorithm predicts true positives out of all the positives that it identifies. This is a good quality metric to use when your goal is to minimize the number of false positives. True positives/(true positives + false positives)
56
Mean Squared Error (MSE)
Mean squared error, or the average of the squared differences between the predicted and actual values. MSE values are always positive. The better a model is at predicting the actual values, the smaller the MSE value is. MSE is used to evaluate the performance of regression models.
57
Mean Absolute Percentage Error (MAPE)
MAPE is the mean of the absolute differences between the actual values and the predicted values, divided by the actual values. You can use MAPE in numeric predictions to understand model prediction errors.
58
Mean Absolute Error (MAE)
MAE measures how different the predicted and actual values are when the values are averaged over all values. You can use MAE in numeric predictions to understand model prediction errors.
59
R squared or R2
R-squared measures how much of the variation in your data is explained by your model, ranging from 0 (no explanation) to 1 (perfect explanation). For example, if R2=0.8, it means 80% of the variation in your data is accounted for by the model, while 20% is still unexplained. It helps evaluate model fit, but a high R2 doesn’t always mean the model is good—it could overfit or miss other important factors.
60
Root Mean Squared Error (RMSE)
Root Mean Squared Error, or the standard deviation of the errors. Measures the square root of the squared difference between predicted and actual values, and is averaged over all values. It is used to understand model prediction error, and it's an important metric to indicate the presence of large model errors and outliers. Values range from zero (0) to infinity, with smaller numbers indicating a better model fit to the data. RMSE is dependent on scale, and should not be used to compare datasets of different types.
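The regression metrics above (MSE, MAPE, MAE, R2, RMSE) can be sketched in a few lines of plain Python; MAPE is expressed here as a fraction rather than a percentage:

```python
import math

def regression_metrics(actual, predicted):
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]
    mse = sum(e * e for e in errors) / n                       # mean squared error
    mae = sum(abs(e) for e in errors) / n                      # mean absolute error
    mape = sum(abs(e / a) for a, e in zip(actual, errors)) / n # mean abs. % error
    mean_a = sum(actual) / n
    ss_tot = sum((a - mean_a) ** 2 for a in actual)            # total variation
    r2 = 1 - (mse * n) / ss_tot                                # fraction explained
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae, "MAPE": mape, "R2": r2}

m = regression_metrics(actual=[100, 200, 300], predicted=[110, 190, 300])
# MAE = 20/3, MAPE = 0.05 (5%), R2 = 0.99
```

Note how RMSE is just the square root of MSE, which puts the error back in the same units as the target variable.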
61
Hallucination
AI hallucinations are incorrect or misleading results that AI Models generate. These errors can be caused by a variety of factors, including insufficient training data, incorrect assumptions made by the model, or biases in the data used to train the model.
62
GAN
A generative adversarial network (GAN) is a deep learning architecture. It trains two neural networks to compete against each other to generate more authentic new data from a given training dataset.
63
VAE
A variational autoencoder (VAE) provides a probabilistic manner for describing an observation in latent space.
64
Transformers
Transformers are a type of neural network architecture that transforms or changes an input sequence into an output sequence. They do this by learning context and tracking relationships between sequence components.
65
SageMaker inference
Real-time inference deploys your model to SageMaker hosting services and gives you a fully managed, autoscaling endpoint for real-time predictions. Serverless inference lets you deploy and scale without managing any underlying infrastructure. Asynchronous inference queues large incoming requests (up to 1 GB) and processes them asynchronously. Batch transform is for batch inference, also known as offline inference.
66
AWS AI Service Cards
A resource to help customers better understand AWS AI services, to enhance transparency and advance responsible AI.
67
Amazon SageMaker Debugger
Amazon SageMaker Debugger helps debug and optimize machine learning models by monitoring and profiling training jobs in real time.
68
Amazon Augmented AI (Amazon A2I)
Amazon A2I is a service to build human review systems for ML solutions. You can use Amazon A2I to create a workflow for human reviewers to audit individual predictions. Amazon A2I is not a reporting tool designed to support system-level compliance audits.
69
Amazon SageMaker Autopilot
Amazon SageMaker Autopilot is an automated ML (AutoML) tool that simplifies and automates the process of building and deploying ML models for application owners.
70
Epoch
"Epoch" refers to a single complete pass through the entire training dataset during the process of training a machine learning model.
71
Learning Rate
The Learning rate hyperparameter controls the step size at which a model's parameters are updated during training. It determines how quickly or slowly the model's parameters are updated during training.
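Both terms (epoch and learning rate) can be seen in a toy gradient-descent loop; this is an illustrative sketch, not SageMaker code:

```python
def train(xs, ys, learning_rate=0.05, epochs=100):
    """Fit y = w * x by gradient descent on mean squared error.
    One epoch = one complete pass over the training dataset."""
    w = 0.0
    for _ in range(epochs):
        # Gradient of MSE with respect to w, averaged over the data.
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= learning_rate * grad  # the learning rate sets the step size
    return w

w = train(xs=[1, 2, 3], ys=[2, 4, 6])  # true relationship: y = 2x
# w converges toward 2.0
```

A learning rate that is too small makes convergence slow; one that is too large can make the updates overshoot and diverge.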
72
Supervised Learning
Supervised learning is a type of machine learning where the algorithm is trained on a labeled dataset, meaning the input data is paired with the correct output. The goal is for the algorithm to learn the mapping between input and output so it can accurately predict outcomes for new, unseen data.
73
Unsupervised Learning
Unsupervised learning involves training algorithms on unlabeled data, without predefined outputs or correct answers. The goal is for the algorithm to discover hidden patterns, structures, or relationships within the data on its own, often used for clustering, dimensionality reduction, or anomaly detection.
74
Semi-Supervised Learning
Semi-supervised learning is a hybrid approach that combines elements of both supervised and unsupervised learning, using a small amount of labeled data along with a larger amount of unlabeled data. This method aims to leverage the benefits of both approaches, improving model performance when fully labeled datasets are scarce or expensive to obtain.
75
Stop Sequences
Stop sequences are specific tokens or phrases that instruct an AI model to cease generating text at a designated point, such as the end of a sentence or list.
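A minimal sketch of how a client or serving layer might apply stop sequences to generated text (illustrative only, not any particular API's behavior):

```python
def apply_stop_sequences(text, stop_sequences):
    """Cut generated text at the first occurrence of any stop sequence."""
    cut = len(text)
    for stop in stop_sequences:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]

out = apply_stop_sequences("Answer: 42\nUser:", stop_sequences=["\nUser:", "###"])
print(out)  # → Answer: 42
```

In practice you pass stop sequences as an inference parameter so the model stops generating at that point, rather than trimming afterwards.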
76
Feature Extraction
Feature extraction is the technique of creating new features by transforming or combining the original input features.
77
Feature Selection
Feature selection is the process of choosing a subset of the most relevant features from a dataset. It aims to reduce dimensionality.
78
Amazon Bedrock Knowledge Bases
With Amazon Bedrock Knowledge Bases, you can give FMs and agents contextual information from your company's private data sources for RAG, to deliver more relevant, accurate, and up-to-date responses.
79
Accuracy
An evaluation metric used for evaluating classification models. Accuracy is most effective with balanced datasets.
80
Token
A sequence of characters that a model can interpret or predict as a single unit of meaning. For example, with text models, a token could correspond not just to a word, but also to a part of a word with grammatical meaning (such as "-ed"), a punctuation mark (such as "?"), or a common phrase (such as "a lot").
81
Context Window
The context window is a model property that describes the number of tokens that the model can accept in the context.
82
Latent Space
Latent space refers to the encoded knowledge within an LLM, representing complex relationships and patterns learned from the massive datasets used during training.
83
Confusion matrix
A confusion matrix is a table that compares the predictions of a classification model to the actual values of a dataset. A confusion matrix is used to summarize the performance of a classification model when it's evaluated against test data.
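A small sketch that tabulates one, using hypothetical binary spam/ham labels:

```python
from collections import Counter

def confusion_matrix(actual, predicted, labels):
    """Rows = actual label, columns = predicted label."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]

actual    = ["spam", "spam", "ham", "ham", "ham"]
predicted = ["spam", "ham",  "ham", "ham", "spam"]
cm = confusion_matrix(actual, predicted, labels=["spam", "ham"])
# [[1, 1],    1 spam caught (TP), 1 spam missed (FN)
#  [1, 2]]    1 ham wrongly flagged (FP), 2 ham correct (TN)
```

The TP/FP/FN/TN cells of this table are exactly the counts that accuracy, precision, recall, and F1 are computed from.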
84
Poisoning
Poisoning refers to the intentional introduction of malicious or biased data into the training dataset of a model. This can lead to the model producing biased, offensive, or harmful outputs, either intentionally or unintentionally.
85
Hijacking and prompt injection
Hijacking and prompt injection refer to the technique of influencing the outputs of generative models by embedding specific instructions within the prompts themselves. The goal is to hijack the model's behaviour and make it produce outputs that align with the attacker's intentions, such as generating misinformation or running malicious code.
86
Prompt leaking
Prompt leaking refers to the unintentional disclosure or leakage of the prompts or inputs (regardless of whether these are protected data or not) used within a model. Prompt leaking does not necessarily expose protected data. But it can expose other data used by the model, which can reveal information of how the model works and this can be used against it.
87
Jailbreaking
Jailbreaking refers to the practice of modifying or circumventing the constraints and safety measures implemented in a generative model or AI assistant to gain unauthorized access or functionality. Jailbreaking attempts involve crafting carefully constructed prompts or input sequences that aim to bypass or exploit vulnerabilities in the AI system's filtering mechanism or constraints. The goal is to "break out" of the intended model limitations.
88
Prompt stereotyping
An evaluation from Amazon SageMaker Studio that measures the probability that your model encodes biases in its response. These biases include those for race, gender, sexual orientation, religion, age, nationality, disability, physical appearance, and socioeconomic status.
89
Shapley value
SageMaker Clarify provides feature attributions based on the concept of Shapley value, used to determine the contribution that each feature made to model predictions. These attributions can be provided for specific predictions and at a global level for the model as a whole.
90
Negative prompting
Negative prompting refers to guiding a generative AI model to avoid certain outputs or behaviors when generating content.
91
Prompt Template
Prompt templates are predefined formats that can be used to standardize inputs and outputs for AI models.
92
Amazon Q Developer
Amazon Q Developer is a generative artificial intelligence (AI) powered code assistant.
93
Amazon SageMaker Feature Store
Amazon SageMaker Feature Store is a fully managed, purpose-built repository to store, share, and manage features for machine learning (ML) models. Features are inputs to ML models used during training and inference.