Practice Questions - Amazon AWS Certified AI Practitioner AIF-C01 Flashcards

(154 cards)

1
Q

A company has built an image classification model to predict plant diseases from photos of plant leaves. The company wants to evaluate how many images the model classified correctly. Which evaluation metric should the company use to measure the model’s performance?

A. R-squared score
B. Accuracy
C. Root mean squared error (RMSE)
D. Learning rate

A

B. Accuracy

Accuracy is the correct answer because it directly measures the proportion of correctly classified images out of the total number of images. R-squared score and RMSE are used for regression problems, not classification problems. Learning rate is a hyperparameter used during model training, not an evaluation metric.
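
The proportion described above can be computed in a few lines. This is a minimal sketch; the `accuracy` helper and the leaf labels are hypothetical, invented only for illustration.

```python
def accuracy(predicted, actual):
    """Return the fraction of predictions that match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one label is required")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical labels for five leaf photos (illustrative only).
actual = ["rust", "healthy", "blight", "healthy", "rust"]
predicted = ["rust", "healthy", "rust", "healthy", "rust"]

print(accuracy(predicted, actual))  # 4 of 5 correct -> 0.8
```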

2
Q

A company uses Amazon SageMaker for its ML pipeline in a production environment. The company has large input data sizes up to 1 GB and processing times up to 1 hour. The company needs near real-time latency. Which SageMaker inference option meets these requirements?
A. Real-time inference
B. Serverless inference
C. Asynchronous inference
D. Batch transform

A

C
Asynchronous inference is the correct answer because it is designed for large payloads and long processing times. It queues incoming requests and supports payloads up to 1 GB and processing times up to 1 hour, which matches the company’s workload, while still returning results with near real-time latency. Real-time inference and serverless inference are unsuitable because both are built for small payloads and short, synchronous requests, so neither can accommodate a 1 GB input or a 1-hour processing time. Batch transform is unsuitable because it is meant for offline processing of entire datasets and does not meet the near real-time requirement.

3
Q

A company wants to use language models to create an application for inference on edge devices. The inference must have the lowest latency possible. Which solution will meet these requirements?
A. Deploy optimized small language models (SLMs) on edge devices.
B. Deploy optimized large language models (LLMs) on edge devices.
C. Incorporate a centralized small language model (SLM) API for asynchronous communication with edge devices.
D. Incorporate a centralized large language model (LLM) API for asynchronous communication with edge devices.

A

A

The correct answer is A because deploying optimized small language models (SLMs) directly onto edge devices minimizes latency. SLMs have a smaller size and computational footprint compared to LLMs, leading to faster inference times. Options B, C, and D all introduce latency: B due to the inherent computational demands of LLMs on resource-constrained edge devices, and C and D because of the communication overhead involved in using a centralized API. Asynchronous communication, while offering other benefits, inherently adds delay compared to on-device processing.

4
Q

A company wants to build an ML model using Amazon SageMaker and needs to share and manage variables for model development across multiple teams. Which SageMaker feature best meets these requirements?
A. Amazon SageMaker Feature Store
B. Amazon SageMaker Data Wrangler
C. Amazon SageMaker Clarify
D. Amazon SageMaker Model Cards

A

A. Amazon SageMaker Feature Store

The correct answer is A because Amazon SageMaker Feature Store provides a centralized repository for storing, managing, and sharing features (variables) used in machine learning models. This allows multiple teams to collaborate effectively and ensures consistency in feature usage across different models. Options B, C, and D are incorrect because they do not directly address the need for centralized sharing and management of variables across multiple teams for model development. Data Wrangler focuses on data preparation, Clarify on model bias detection, and Model Cards on model documentation.

5
Q

A company possesses petabytes of unlabeled customer data intended for use in an advertisement campaign. The company aims to classify its customers into tiers for targeted advertising and product promotion. Which methodology is most appropriate for this task?

A. Supervised learning
B. Unsupervised learning
C. Reinforcement learning
D. Reinforcement learning from human feedback (RLHF)

A

B. Unsupervised learning

Unsupervised learning is the correct answer because the company has unlabeled data and needs to identify patterns and groupings within that data to classify customers into tiers. Supervised learning requires labeled data, which is not available. Reinforcement learning and RLHF focus on learning through trial and error and feedback, which are not directly applicable to the problem of initial customer classification. Clustering techniques, a core component of unsupervised learning, are perfectly suited to this task.
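
To make the clustering idea concrete, here is a minimal sketch of k-means on one-dimensional data. The spend figures, the starting centroids, and the `kmeans_1d` helper are all assumptions made for illustration; a production system would use a library implementation and many more features.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Cluster 1-D values around the given starting centroids (plain k-means)."""
    centroids = list(centroids)
    assignments = []
    for _ in range(iterations):
        # Assign each value to the index of its nearest centroid.
        assignments = [
            min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            for v in values
        ]
        # Move each centroid to the mean of the values assigned to it.
        for i in range(len(centroids)):
            members = [v for v, a in zip(values, assignments) if a == i]
            if members:
                centroids[i] = sum(members) / len(members)
    return assignments, centroids


# Hypothetical monthly spend per customer; no tier labels are provided.
spend = [12, 15, 14, 210, 190, 205, 980, 1010]
tiers, centers = kmeans_1d(spend, centroids=[0, 200, 1000])

print(tiers)  # [0, 0, 0, 1, 1, 1, 2, 2]
```

The tiers emerge from the data alone, which is why no labeled examples are needed.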

6
Q

An AI practitioner wants to use a foundation model (FM) to design a search application. The search application must handle queries that have text and images. Which type of FM should the AI practitioner use to power the search application?
A. Multi-modal embedding model
B. Text embedding model
C. Multi-modal generation model
D. Image generation model

A

A. Multi-modal embedding model

The correct answer is A because multi-modal embedding models are designed to process and understand multiple data types, including text and images. This is precisely what is needed for a search application that accepts queries containing both text and images.

Option B is incorrect because text embedding models only handle text data and would not be able to process image queries. Option C is incorrect because multi-modal generation models are focused on creating new content (text and/or images), not on searching existing data. Option D is incorrect because image generation models are solely focused on generating images and cannot handle text-based queries.
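
A search application built on embeddings typically ranks stored items by their similarity to the query vector. The sketch below assumes a multi-modal model has already mapped text and images into one shared vector space; the three-dimensional vectors and item names are made up, and real embeddings have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings for a mix of image and text items.
catalog = {
    "photo_of_red_sneaker.jpg": [0.9, 0.1, 0.0],
    "text: blue rain jacket": [0.1, 0.9, 0.2],
    "photo_of_blue_jacket.jpg": [0.2, 0.8, 0.3],
}

# Stand-in for the embedding of a text or image query.
query_embedding = [0.15, 0.85, 0.25]

# Rank every item, text or image, by similarity to the query.
ranked = sorted(
    catalog,
    key=lambda name: cosine_similarity(query_embedding, catalog[name]),
    reverse=True,
)
print(ranked)
```

Because text and images share one space, a single ranking covers both item types.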

7
Q

A company wants to use AI to protect its application from threats. The AI solution needs to check if an IP address is from a suspicious source. Which solution meets these requirements?
A. Build a speech recognition system.
B. Create a natural language processing (NLP) named entity recognition system.
C. Develop an anomaly detection system.
D. Create a fraud forecasting system.

A

C

Anomaly detection is the correct answer because it focuses on identifying unusual patterns in data. In this context, an anomaly detection system can analyze IP address access patterns and flag deviations from normal behavior, indicating potentially suspicious activity. Options A and B are incorrect because they are not relevant to identifying suspicious IP addresses. Speech recognition deals with audio, and NLP named entity recognition deals with text. Option D, fraud forecasting, is focused on predicting future fraud rather than detecting it in real-time based on an immediate event like an IP address access.
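
One simple way to flag deviations from normal behavior is a z-score test on request counts. This is only a sketch under stated assumptions: the traffic numbers are invented, the addresses come from documentation-reserved ranges, and the threshold of 2.0 is an arbitrary choice rather than a recommended value.

```python
from statistics import mean, stdev


def flag_anomalies(requests_per_ip, threshold=2.0):
    """Return IPs whose request count is far above the average (z-score test)."""
    counts = list(requests_per_ip.values())
    mu, sigma = mean(counts), stdev(counts)
    if sigma == 0:
        return []  # all sources behave identically, nothing stands out
    return [
        ip for ip, n in requests_per_ip.items()
        if (n - mu) / sigma > threshold
    ]


# Hypothetical requests per source IP in one time window.
traffic = {
    "203.0.113.5": 40, "203.0.113.9": 38, "198.51.100.2": 45,
    "198.51.100.7": 42, "192.0.2.44": 41, "192.0.2.81": 39,
    "192.0.2.200": 900,
}

print(flag_anomalies(traffic))  # ['192.0.2.200']
```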

8
Q

A company uses machine learning (ML) models to forecast demand each quarter, informing operational optimization decisions. An AI practitioner is creating a report to explain these models to company stakeholders. Which of the following should the AI practitioner include in the report to ensure transparency and explainability?

A. Code for model training
B. Partial dependence plots (PDPs)
C. Sample data for training
D. Model convergence tables

A

B

The correct answer is B, Partial dependence plots (PDPs). PDPs visualize the relationship between model features and predictions, making it easy to understand how changes in input variables affect forecasts. This is crucial for stakeholder understanding without requiring in-depth knowledge of the model’s internal workings.

Option A, code for model training, is incorrect because it is too technical for most stakeholders. Option C, sample data for training, might raise privacy concerns and is unnecessary for explaining model predictions. Option D, model convergence tables, is relevant for model developers but less so for stakeholders concerned with understanding the model’s outputs and their impact.
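
The values behind a partial dependence plot can be computed by forcing one feature to each value on a grid and averaging the model’s predictions. The sketch below uses a hypothetical linear demand model and invented rows; `partial_dependence` is an illustrative helper, not a library function.

```python
def partial_dependence(model, rows, feature, grid):
    """Average prediction when `feature` is forced to each grid value."""
    curve = []
    for value in grid:
        # Overwrite the chosen feature in every row, keep the rest as observed.
        preds = [model({**row, feature: value}) for row in rows]
        curve.append((value, sum(preds) / len(preds)))
    return curve


def toy_demand_model(row):
    """Hypothetical forecast: demand falls with price, rises with a promotion."""
    return 100 - 2 * row["price"] + 5 * row["promo"]


rows = [{"price": 10, "promo": 0}, {"price": 12, "promo": 1}]
curve = partial_dependence(toy_demand_model, rows, "price", grid=[10, 20])

print(curve)  # [(10, 82.5), (20, 62.5)]
```

Plotting the pairs in `curve` gives stakeholders a direct view of how forecast demand responds to price.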

9
Q

A law firm wants to build an AI application using large language models (LLMs) to read legal documents and extract key points. Which solution best meets these requirements?

A. Build an automatic named entity recognition system.
B. Create a recommendation engine.
C. Develop a summarization chatbot.
D. Develop a multi-language translation system.

A

C

The correct answer is C because the core requirement is to extract key points from legal documents, which is the function of a summarization chatbot. A summarization chatbot uses LLMs to condense information, directly addressing the law firm’s needs. Options A, B, and D are incorrect. A named entity recognition system (A) identifies predefined entities (names, places, and so on), not the key points of an argument. A recommendation engine (B) suggests related items rather than summarizing information. A multi-language translation system (D) translates text between languages rather than extracting key points.

10
Q

A company wants to create a chatbot using a foundation model (FM) on Amazon Bedrock. This FM needs to access encrypted data stored in an Amazon S3 bucket encrypted with Amazon S3 managed keys (SSE-S3). The FM fails to access the S3 bucket data. Which solution will resolve this issue?

A. Ensure that the role that Amazon Bedrock assumes has permission to decrypt data with the correct encryption key.
B. Set the access permissions for the S3 buckets to allow public access to enable access over the internet.
C. Use prompt engineering techniques to tell the model to look for information in Amazon S3.
D. Ensure that the S3 data does not contain sensitive information.

A

A

11
Q

A company wants to use generative AI to increase developer productivity and software development. The company wants to use Amazon Q Developer. What can Amazon Q Developer do to help the company meet these requirements?
A. Create software snippets, reference tracking, and open source license tracking.
B. Run an application without provisioning or managing servers.
C. Enable voice commands for coding and providing natural language search.
D. Convert audio files to text documents by using ML models.

A

A

12
Q

A financial institution is using Amazon Bedrock to develop an AI application hosted within a VPC. Due to regulatory compliance, this VPC is not permitted to access the internet. Which AWS service or feature best addresses this requirement?

A. AWS PrivateLink
B. Amazon Macie
C. Amazon CloudFront
D. Internet gateway

A

A. AWS PrivateLink

AWS PrivateLink enables private connectivity between a VPC and AWS services, eliminating the need for internet access. This directly addresses the requirement of the financial institution’s isolated VPC needing to access Amazon Bedrock.

Option B, Amazon Macie, is a data security and privacy service; it does not provide private connectivity. Option C, Amazon CloudFront, is a content delivery network that relies on the internet. Option D, an internet gateway, is explicitly designed to connect a VPC to the internet, which is against the stated requirement.

13
Q

A company wants to develop an educational game where users answer questions such as the following: “A jar contains six red, four green, and three yellow marbles. What is the probability of choosing a green marble from the jar?” Which solution meets these requirements with the LEAST operational overhead?
A. Use supervised learning to create a regression model that will predict probability.
B. Use reinforcement learning to train a model to return the probability.
C. Use code that will calculate probability by using simple rules and computations.
D. Use unsupervised learning to create a model that will estimate probability density.

A

C
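
Option C amounts to a few lines of ordinary code, with no model to train or host. A minimal sketch for the marble question, where `probability_of` is a hypothetical helper name:

```python
from fractions import Fraction


def probability_of(color, jar):
    """Probability of drawing one marble of `color` from the jar."""
    total = sum(jar.values())
    return Fraction(jar[color], total)


jar = {"red": 6, "green": 4, "yellow": 3}
p = probability_of("green", jar)

print(p)  # 4/13
```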

14
Q

A company is using a pre-trained large language model (LLM) to build a chatbot for product recommendations. The company needs the LLM outputs to be short and written in a specific language. Which solution will align the LLM response quality with the company’s expectations?
A. Adjust the prompt.
B. Choose an LLM of a different size.
C. Increase the temperature.
D. Increase the Top K value.

A

A

15
Q

A company is using domain-specific models and wants to adapt pre-trained models to create models for new, related tasks, instead of creating new models from scratch. Which machine learning (ML) strategy best meets these requirements?

A. Increase the number of epochs.
B. Use transfer learning.
C. Decrease the number of epochs.
D. Use unsupervised learning.

A

B

The correct answer is B, Use transfer learning. Transfer learning leverages pre-trained models and adapts them for new, related tasks, directly addressing the company’s requirement to avoid building models from the ground up. Options A and C relate to the training process of a single model and don’t address the core issue of adapting existing models. Option D, unsupervised learning, involves training a model on unlabeled data, which is not relevant to adapting a pre-trained model for a new task.

16
Q

A company is building a solution to generate images for protective eyewear. The solution must have high accuracy and must minimize the risk of incorrect annotations. Which solution will meet these requirements?
A. Human-in-the-loop validation by using Amazon SageMaker Ground Truth Plus
B. Data augmentation by using an Amazon Bedrock knowledge base
C. Image recognition by using Amazon Rekognition
D. Data summarization by using Amazon QuickSight Q

A

A

17
Q

Which metric measures the runtime efficiency of operating AI models?
A. Customer satisfaction score (CSAT)
B. Training time for each epoch
C. Average response time
D. Number of training instances

A

C

The correct answer is C, Average response time. Average response time directly measures how long an AI model takes to process a request and return a result. This is crucial for evaluating runtime efficiency, especially in applications requiring quick responses.

Option A, Customer satisfaction score (CSAT), measures user satisfaction, not the model’s technical performance. Option B, Training time for each epoch, measures the time it takes to train the model, not its runtime efficiency during operation. Option D, Number of training instances, refers to the amount of data used for training, not the model’s operational speed.
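
Average response time can be measured by timing each request and taking the mean. The sketch below times a stand-in function; `toy_model` is hypothetical, and a real measurement would call the deployed endpoint over many more requests.

```python
import time


def average_response_time(predict, requests):
    """Return the mean seconds per call of `predict` over the given requests."""
    durations = []
    for request in requests:
        start = time.perf_counter()
        predict(request)
        durations.append(time.perf_counter() - start)
    return sum(durations) / len(durations)


def toy_model(text):
    """Stand-in for a model invocation (illustrative only)."""
    return text.upper()


avg = average_response_time(toy_model, ["a", "b", "c"])
print(f"{avg * 1000:.3f} ms per request")
```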

18
Q

A company is building a contact center application and wants to gain insights from customer conversations. The company wants to analyze and extract key information from the audio of the customer calls. Which solution meets these requirements?
A. Build a conversational chatbot by using Amazon Lex.
B. Transcribe call recordings by using Amazon Transcribe.
C. Extract information from call recordings by using Amazon SageMaker Model Monitor.
D. Create classification labels by using Amazon Comprehend.

A

B

19
Q

A company wants to classify human genes into 20 categories based on gene characteristics. The company needs an ML algorithm to document how the inner mechanism of the model affects the output. Which ML algorithm meets these requirements?
A. Decision trees
B. Linear regression
C. Logistic regression
D. Neural networks

A

A. Decision trees

Decision trees are the most suitable algorithm because they offer high transparency and interpretability. The decision-making process is easily visualized by following the branches of the tree, directly showing how input features (gene characteristics) influence the output (gene category). This fulfills the requirement to document the model’s inner workings. Linear regression is unsuitable because it predicts continuous values, not classes. Logistic regression is most commonly used for binary classification, and although multinomial variants exist, its coefficients do not lay out a step-by-step decision path across 20 categories the way a tree does. Neural networks, while capable of multi-class classification, are often considered “black boxes,” making it difficult to document how the inner mechanisms affect the output.
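
The branch-following idea can be shown with a tiny hand-written tree that records every test it applies. The features, thresholds, and category names here are hypothetical; a trained tree would learn them from data.

```python
# Hypothetical tree: internal nodes test one feature against a threshold,
# leaves carry a category label.
TREE = {
    "feature": "gc_content", "threshold": 0.5,
    "left": {"label": "category_A"},
    "right": {
        "feature": "length_kb", "threshold": 10,
        "left": {"label": "category_B"},
        "right": {"label": "category_C"},
    },
}


def classify_with_trace(node, sample):
    """Walk the tree and return the label plus every test that led to it."""
    trace = []
    while "label" not in node:
        feature, threshold = node["feature"], node["threshold"]
        go_left = sample[feature] <= threshold
        op = "<=" if go_left else ">"
        trace.append(f"{feature}={sample[feature]} {op} {threshold}")
        node = node["left"] if go_left else node["right"]
    return node["label"], trace


label, trace = classify_with_trace(TREE, {"gc_content": 0.62, "length_kb": 4})
print(label)  # category_B
print(trace)  # ['gc_content=0.62 > 0.5', 'length_kb=4 <= 10']
```

The trace is the documentation: each prediction comes with the exact rules that produced it.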

20
Q

A company uses a foundation model (FM) from Amazon Bedrock for an AI search tool. The company wants to fine-tune the model to be more accurate by using the company’s data. Which strategy will successfully fine-tune the model?
A. Provide labeled data with the prompt field and the completion field.
B. Prepare the training dataset by creating a .txt file that contains multiple lines in .csv format.
C. Purchase Provisioned Throughput for Amazon Bedrock.
D. Train the model on journals and textbooks.

A

A. Provide labeled data with the prompt field and the completion field.

The correct answer is A because fine-tuning a foundation model requires labeled data in which each example consists of a prompt (input) and a completion (desired output). This allows the model to learn specific patterns and behaviors relevant to the company’s data and use case. Option B is incorrect because a .txt file containing lines of .csv data does not supply the prompt and completion fields that fine-tuning needs; the examples must be structured, labeled records. Option C is incorrect because Provisioned Throughput provides dedicated inference capacity and plays no part in the fine-tuning process itself. Option D is incorrect because training the model on general material (journals and textbooks) won’t tailor the model to the company’s specific needs; fine-tuning uses the company’s own data for this purpose.
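
A prompt and completion dataset is commonly stored as JSON Lines, one record per line. The sketch below shows that shape; the example records are invented, `to_jsonl` is a hypothetical helper, and exact field and file requirements vary by model, so the Amazon Bedrock documentation should be checked before preparing a real dataset.

```python
import json


def to_jsonl(records):
    """Serialize prompt/completion pairs as JSON Lines, one record per line."""
    lines = []
    for record in records:
        if set(record) != {"prompt", "completion"}:
            raise ValueError("each record needs exactly 'prompt' and 'completion'")
        lines.append(json.dumps(record))
    return "\n".join(lines)


# Hypothetical labeled examples drawn from a company's own data.
examples = [
    {"prompt": "Where is the returns policy?", "completion": "See Help > Returns."},
    {"prompt": "How do I reset my password?", "completion": "Use Settings > Security."},
]

jsonl = to_jsonl(examples)
print(jsonl)
```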

21
Q

Which feature of Amazon OpenSearch Service gives companies the ability to build vector database applications?
A. Integration with Amazon S3 for object storage
B. Support for geospatial indexing and queries
C. Scalable index management and nearest neighbor search capability
D. Ability to perform real-time analysis on streaming data

A

C

22
Q

Which option is a use case for generative AI models?
A. Improving network security by using intrusion detection systems
B. Creating photorealistic images from text descriptions for digital marketing
C. Enhancing database performance by using optimized indexing
D. Analyzing financial data to forecast stock market trends

A

B

23
Q

A company wants to build a generative AI application using Amazon Bedrock and needs to choose a foundation model (FM). The company wants to know how much information can fit into one prompt. Which consideration will inform the company’s decision?
A. Temperature
B. Context window
C. Batch size
D. Model size

A

B

The correct answer is B, Context window. The context window refers to the maximum amount of text a foundation model can process in a single input (prompt). This directly addresses the company’s need to understand how much information can be included in a single prompt.

Option A, Temperature, is incorrect because it controls the randomness of the model’s output, not the input size. Option C, Batch size, refers to the number of inputs processed simultaneously, not the size of a single input. Option D, Model size, refers to the overall size of the model’s parameters, which indirectly relates to its capabilities but does not directly specify the maximum input length.
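
A rough check of whether a prompt fits a context window might look like the sketch below. It assumes one token per whitespace-separated word, which is a crude stand-in; real tokenizers split text differently, and each model publishes its own limit.

```python
def fits_in_context(prompt, context_window_tokens, reserved_for_output=0):
    """Estimate whether a prompt fits, leaving optional room for the response."""
    # Crude assumption: one token per whitespace-separated word.
    estimated_tokens = len(prompt.split())
    return estimated_tokens + reserved_for_output <= context_window_tokens


prompt = "Summarize the attached support ticket in two sentences."
print(fits_in_context(prompt, context_window_tokens=8))      # True
print(fits_in_context(prompt, context_window_tokens=8,
                      reserved_for_output=4))                # False
```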

24
Q

A company wants to create a chatbot using a foundation model (FM) to help customers solve technical problems without human intervention. The chatbot’s responses must adhere to the company’s tone. Which solution best meets these requirements?

A. Set a low limit on the number of tokens the FM can produce.
B. Use batch inferencing to process detailed responses.
C. Experiment and refine the prompt until the FM produces the desired responses.
D. Define a higher number for the temperature parameter.

A

C

The correct answer is C because prompt engineering is the most effective way to control the tone and style of a foundation model’s output. By carefully crafting and iteratively refining the prompts given to the FM, the company can guide the chatbot to generate responses that align with their desired tone.

Option A is incorrect because limiting the number of tokens primarily affects the length of the response, not necessarily its tone. Option B is incorrect because batch inferencing is about efficiency in processing multiple requests, not about controlling the tone of individual responses. Option D is incorrect because increasing the temperature parameter generally leads to more creative, but potentially less coherent and less on-brand, responses.
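
The effect of the temperature parameter can be illustrated with a softmax over token scores: dividing the scores by a higher temperature flattens the distribution, so less likely tokens are sampled more often. This is a sketch of the general idea with made-up scores, not the exact sampling procedure of any particular model.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities, scaled by a sampling temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.1]  # hypothetical scores for three candidate tokens
low = softmax_with_temperature(logits, 0.5)
high = softmax_with_temperature(logits, 2.0)

# The top token dominates at low temperature and loses ground at high temperature.
print(round(low[0], 2), round(high[0], 2))
```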

25
Q

A company wants to use a large language model (LLM) on Amazon Bedrock for sentiment analysis. The company wants to classify the sentiment of text passages as positive or negative. Which prompt engineering strategy best meets these requirements?
A. Provide examples of text passages with corresponding positive or negative labels in the prompt followed by the new text passage to be classified.
B. Provide a detailed explanation of sentiment analysis and how LLMs work in the prompt.
C. Provide the new text passage to be classified without any additional context or examples.
D. Provide the new text passage with a few examples of unrelated tasks, such as text summarization or question answering.

A

A
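
Option A describes few-shot prompting. A minimal sketch of assembling such a prompt follows; the instruction wording, the example passages, and the `build_few_shot_prompt` helper are all hypothetical.

```python
def build_few_shot_prompt(examples, new_text):
    """Assemble labeled examples followed by the passage to classify."""
    lines = ["Classify the sentiment of each text as Positive or Negative.", ""]
    for text, label in examples:
        lines += [f"Text: {text}", f"Sentiment: {label}", ""]
    # End with the unlabeled passage so the model completes the label.
    lines += [f"Text: {new_text}", "Sentiment:"]
    return "\n".join(lines)


examples = [
    ("The delivery arrived early and the product works great.", "Positive"),
    ("The app crashes every time I open it.", "Negative"),
]
prompt = build_few_shot_prompt(examples, "Support resolved my issue in minutes.")
print(prompt)
```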
26
Q

A security company uses Amazon Bedrock to run foundation models (FMs). To ensure only authorized users invoke these models and to identify unauthorized access attempts for improved IAM policy and role creation, which AWS service should the company use?
A. AWS Audit Manager
B. AWS CloudTrail
C. Amazon Fraud Detector
D. AWS Trusted Advisor

A

B. AWS CloudTrail

AWS CloudTrail is the correct answer because it records API activity across AWS services, including calls to Amazon Bedrock, along with the identity that made each call and any error returned. This lets the company trace unauthorized access attempts and use the findings to refine its IAM policies and roles.

Option A, AWS Audit Manager, helps manage compliance evidence but does not log individual API calls or access attempts. Option C, Amazon Fraud Detector, is designed to detect fraudulent activity, not unauthorized calls to AWS services. Option D, AWS Trusted Advisor, provides recommendations for security and cost optimization but not the detailed logging needed to pinpoint unauthorized access attempts.
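
Once CloudTrail records are available, denied calls can be filtered out for review. The sketch below is written under assumptions: the field names (`eventSource`, `eventName`, `errorCode`, `userIdentity`) follow the CloudTrail record layout as generally documented, the set of error codes is an illustrative guess, and the sample records and ARNs are invented.

```python
def unauthorized_bedrock_calls(records):
    """Return (caller ARN, API name) for Bedrock calls that were denied."""
    # Assumed set of error codes that indicate a denied request.
    denied_codes = {"AccessDenied", "AccessDeniedException"}
    return [
        (r["userIdentity"]["arn"], r["eventName"])
        for r in records
        if r.get("eventSource") == "bedrock.amazonaws.com"
        and r.get("errorCode") in denied_codes
    ]


# Hypothetical, heavily trimmed records (real records carry many more fields).
records = [
    {"eventSource": "bedrock.amazonaws.com", "eventName": "InvokeModel",
     "userIdentity": {"arn": "arn:aws:iam::111122223333:user/analyst"}},
    {"eventSource": "bedrock.amazonaws.com", "eventName": "InvokeModel",
     "errorCode": "AccessDenied",
     "userIdentity": {"arn": "arn:aws:iam::111122223333:user/intern"}},
    {"eventSource": "s3.amazonaws.com", "eventName": "GetObject",
     "errorCode": "AccessDenied",
     "userIdentity": {"arn": "arn:aws:iam::111122223333:user/intern"}},
]

print(unauthorized_bedrock_calls(records))
```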
27
Q

An AI company periodically evaluates its systems and processes with the help of independent software vendors (ISVs). The company needs to receive email message notifications when an ISV's compliance reports become available. Which AWS service can the company use to meet this requirement?
A. AWS Audit Manager
B. AWS Artifact
C. AWS Trusted Advisor
D. AWS Data Exchange

A

B. AWS Artifact

AWS Artifact provides on-demand access to compliance reports, including third-party reports from ISVs, and can send email notifications when new compliance documents are available. This directly addresses the company's need for notifications when ISV compliance reports are released.

Option A, AWS Audit Manager, collects evidence for auditing the company's own AWS usage rather than distributing vendors' reports. Option C, AWS Trusted Advisor, offers optimization recommendations, not ISV compliance report notifications. Option D, AWS Data Exchange, is for data distribution, not compliance report management.
28
Q

A company wants to use a large language model (LLM) to develop a conversational agent. The company needs to prevent the LLM from being manipulated with common prompt engineering techniques to perform undesirable actions or expose sensitive information. Which action will reduce these risks?
A. Create a prompt template that teaches the LLM to detect attack patterns.
B. Increase the temperature parameter on invocation requests to the LLM.
C. Avoid using LLMs that are not listed in Amazon SageMaker.
D. Decrease the number of input tokens on invocations of the LLM.

A

A

The correct answer is A because a prompt template designed to identify and mitigate common attack patterns (such as prompt injections) makes the LLM more resistant to manipulation. This proactive approach teaches the LLM to recognize and refuse malicious attempts.

Option B is incorrect because increasing the temperature parameter increases randomness, which does nothing to protect against manipulation. Option C is incorrect because restricting LLMs to those listed in Amazon SageMaker doesn't inherently protect against prompt engineering attacks. Option D is incorrect because decreasing the number of input tokens might limit the context the LLM can use but doesn't directly address malicious prompt engineering.
29
Q

A company is using the Generative AI Security Scoping Matrix to assess security responsibilities for its solutions. The company has identified four different solution scopes based on the matrix. Which solution scope gives the company the MOST ownership of security responsibilities?
A. Using a third-party enterprise application that has embedded generative AI features.
B. Building an application by using an existing third-party generative AI foundation model (FM).
C. Refining an existing third-party generative AI foundation model (FM) by fine-tuning the model by using data specific to the business.
D. Building and training a generative AI model from scratch by using specific data that a customer owns.

A

D

Option D, building and training a generative AI model from scratch using specific data, gives the company the most control and therefore the most responsibility for security. The company owns the data, the training process, the model architecture, and the deployment, so every security aspect falls under its oversight. Options A, B, and C all involve third-party components, which reduces the company's control and thus its security responsibility. The less control a company has, the less ownership of security responsibilities it has.
30
Q

An AI practitioner has a database of animal photos. The AI practitioner wants to automatically identify and categorize the animals in the photos without manual human effort. Which strategy meets these requirements?
A. Object detection
B. Anomaly detection
C. Named entity recognition
D. Inpainting

A

A. Object detection

Object detection is the correct answer because it's a computer vision technique specifically designed to identify and classify objects within an image. The problem describes needing to identify and categorize animals in photos, which aligns with the functionality of object detection.

Option B, anomaly detection, is incorrect because it focuses on identifying unusual or unexpected data points, not on classifying known objects like animals. Option C, named entity recognition, is incorrect because it's a natural language processing technique used to identify named entities (like people, places, organizations) in text, not images. Option D, inpainting, is incorrect because it's an image editing technique used to fill in missing or damaged parts of an image, not to identify and categorize objects.
31
Q

A company wants to create an application using Amazon Bedrock. They have a limited budget and prefer flexibility without long-term commitment. Which Amazon Bedrock pricing model best meets these requirements?
A. On-Demand
B. Model customization
C. Provisioned Throughput
D. Spot Instance

A

A. On-Demand

The On-Demand pricing model is the best fit because the company pays only for what it uses, with no long-term commitment or upfront cost. This matches the company's limited budget and preference for flexibility.

Option B, Model customization, is incorrect because it concerns modifying models, not a flexible way to pay for inference. Option C, Provisioned Throughput, reserves dedicated model capacity and typically involves a commitment term, which contradicts the requirement for flexibility. Option D, Spot Instance, is an Amazon EC2 purchasing option, not an Amazon Bedrock pricing model.
32
Q

Which AWS service or feature can help an AI development team quickly deploy and consume a foundation model (FM) within the team's VPC?
A. Amazon Personalize
B. Amazon SageMaker JumpStart
C. PartyRock, an Amazon Bedrock Playground
D. Amazon SageMaker endpoints

A

B. Amazon SageMaker JumpStart

Amazon SageMaker JumpStart provides pre-trained models and solutions, simplifying the process of deploying and using foundation models in the team's own environment.

Options A, C, and D are incorrect. Amazon Personalize is focused on personalization and recommendations, not general foundation model deployment. PartyRock is a hosted Amazon Bedrock playground for experimenting with generative AI apps; it does not deploy models into a team's VPC. Amazon SageMaker endpoints are where models are served, but JumpStart is what helps with model selection and the initial deployment that gets the model to an endpoint within the VPC.
33
Q

How can companies use large language models (LLMs) securely on Amazon Bedrock?
A. Design clear and specific prompts. Configure AWS Identity and Access Management (IAM) roles and policies by using least privilege access.
B. Enable AWS Audit Manager for automatic model evaluation jobs.
C. Enable Amazon Bedrock automatic model evaluation jobs.
D. Use Amazon CloudWatch Logs to make models explainable and to monitor for bias.

A

A
34
Q

A company has terabytes of data in a database that the company can use for business analysis. The company wants to build an AI-based application that can build a SQL query from input text that employees provide. The employees have minimal experience with technology. Which solution meets these requirements?
A. Generative pre-trained transformers (GPT)
B. Residual neural network
C. Support vector machine
D. WaveNet

A

A

The correct answer is A, Generative pre-trained transformers (GPT). GPT models are designed for generating text, making them suitable for translating natural language (employee input) into a structured query language like SQL. Residual neural networks (B) are mainly used for image processing and aren't designed for text generation. Support vector machines (C) are primarily used for classification and regression tasks, not text generation. WaveNet (D) is designed for generating raw audio waveforms, making it irrelevant to this scenario.
35
Q

An AI practitioner is building a model to generate images of humans in various professions. The AI practitioner discovered that the input data is biased and that specific attributes affect the image generation and create bias in the model. Which technique will solve this problem?
A. Data augmentation for imbalanced classes
B. Model monitoring for class distribution
C. Retrieval Augmented Generation (RAG)
D. Watermark detection for images

A

A. Data augmentation for imbalanced classes

Data augmentation is the correct answer because it directly addresses the problem of biased input data. If certain groups are underrepresented in the dataset (leading to bias), data augmentation techniques can create additional or synthetic examples to balance the classes and make the model more representative.

Option B is incorrect because while model monitoring can identify bias, it doesn't solve it. Option C is irrelevant; RAG grounds generated responses in retrieved information and does not rebalance training data. Option D is also irrelevant; watermark detection is unrelated to addressing bias in image generation.
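
As a rough illustration of balancing classes through augmentation, the sketch below adds horizontally flipped copies of minority-class images until every class matches the largest one. The tiny nested-list "images", the class labels, and both helper functions are hypothetical; real pipelines use richer transformations and avoid repeating the same flip.

```python
from collections import Counter


def hflip(image):
    """Mirror a 2-D pixel grid left to right."""
    return [list(reversed(row)) for row in image]


def augment_minority(dataset):
    """Add flipped copies of underrepresented classes until counts match."""
    counts = Counter(label for _, label in dataset)
    target = max(counts.values())
    augmented = list(dataset)
    for label, count in counts.items():
        originals = [img for img, lab in dataset if lab == label]
        i = 0
        while count < target:
            # Cycles through originals; repeats are possible in this simple sketch.
            augmented.append((hflip(originals[i % len(originals)]), label))
            count += 1
            i += 1
    return augmented


dataset = [
    ([[1, 2], [3, 4]], "class_a"),
    ([[5, 6], [7, 8]], "class_a"),
    ([[9, 0], [1, 2]], "class_a"),
    ([[2, 4], [6, 8]], "class_b"),
]
balanced = augment_minority(dataset)
print(Counter(label for _, label in balanced))
```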
36
A medical company is customizing a foundation model (FM) for diagnostic purposes. The company needs the model to be transparent and explainable to meet regulatory requirements. Which solution will meet these requirements? A. Configure the security and compliance by using Amazon Inspector. B. Generate simple metrics, reports, and examples by using Amazon SageMaker Clarify. C. Encrypt and secure training data by using Amazon Macie. D. Gather more data. Use Amazon Rekognition to add custom labels to the data.
B
37
A company wants to deploy a conversational chatbot to answer customer questions. The chatbot is based on a fine-tuned Amazon SageMaker JumpStart model. The application must comply with multiple regulatory frameworks. Which capabilities can the company show compliance for? (Choose two.) A. Auto scaling inference endpoints B. Threat detection C. Data protection D. Cost optimization E. Loosely coupled microservices
B, C The correct answers are B (Threat detection) and C (Data protection). Threat detection is crucial for regulatory compliance as it protects against cyberattacks, data breaches, and other vulnerabilities that could compromise the chatbot's integrity and violate data security regulations. Data protection is essential for compliance as it ensures the privacy and security of customer data collected and processed by the chatbot, adhering to regulations like GDPR, CCPA, etc. Option A (Auto scaling inference endpoints) is an operational capability, not directly related to regulatory compliance. Option D (Cost optimization) is a business concern, not a compliance feature. Option E (Loosely coupled microservices) is an architectural pattern that can indirectly improve compliance by enhancing system resilience and security, but it's not a direct compliance capability.
38
Which functionality does Amazon SageMaker Clarify provide? A. Integrates a Retrieval Augmented Generation (RAG) workflow B. Monitors the quality of ML models in production C. Documents critical details about ML models D. Identifies potential bias during data preparation
D Amazon SageMaker Clarify's primary function is to detect potential bias, both in the dataset during data preparation and in model predictions after training. Options A, B, and C describe functionality found elsewhere: A (RAG workflow integration) is not a Clarify feature; B (model quality monitoring in production) is handled by SageMaker Model Monitor; and C (model documentation) is the role of SageMaker Model Cards.
39
A social media company wants to use a large language model (LLM) for content moderation. The company wants to evaluate the LLM outputs for bias and potential discrimination against specific groups or individuals. Which data source should the company use to evaluate the LLM outputs with the LEAST administrative effort? A. User-generated content B. Moderation logs C. Content moderation guidelines D. Benchmark datasets
D
40
A loan company is building a generative AI-based solution to offer new applicants discounts based on specific business criteria. The company wants to build and use an AI model responsibly to minimize bias that could negatively affect some customers. Which actions should the company take to meet these requirements? (Choose two.) A. Detect imbalances or disparities in the data. B. Ensure that the model runs frequently. C. Evaluate the model's behavior so that the company can provide transparency to stakeholders. D. Use the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) technique to ensure that the model is 100% accurate. E. Ensure that the model's inference time is within the accepted limits.
A, C
41
A company uses an Amazon Bedrock base model to summarize documents internally. They trained a custom model to improve summarization quality. Which action is required to use this custom model through Amazon Bedrock? A. Purchase Provisioned Throughput for the custom model. B. Deploy the custom model in an Amazon SageMaker endpoint for real-time inference. C. Register the model with the Amazon SageMaker Model Registry. D. Grant access to the custom model in Amazon Bedrock.
A
42
A company needs to build its own large language model (LLM) based only on its private data. The company is concerned about the environmental effect of the training process. Which Amazon EC2 instance type has the LEAST environmental effect when training LLMs? A. Amazon EC2 C series B. Amazon EC2 G series C. Amazon EC2 P series D. Amazon EC2 Trn series
D. Amazon EC2 Trn series The Trn series instances are specifically designed for training deep learning models and are optimized for energy efficiency. They utilize specialized AWS-designed Trainium chips, resulting in a better performance-to-energy ratio compared to other instance types. This makes them the most environmentally friendly option for LLM training among the choices provided. The other options (C, G, and P series) are not specifically designed for this purpose and thus will likely have a higher energy consumption per unit of computation.
43
A company is building an application that needs to generate synthetic data that is based on existing data. Which type of model can the company use to meet this requirement? A. Generative adversarial network (GAN) B. XGBoost C. Residual neural network D. WaveNet
A Generative adversarial networks (GANs) are specifically designed to generate new data instances that resemble the training data. They achieve this through a competition between two neural networks: a generator that creates synthetic data and a discriminator that tries to distinguish between real and synthetic data. This adversarial process leads to the generator producing increasingly realistic synthetic data. XGBoost is a gradient boosting algorithm used for prediction tasks, not data generation. Residual neural networks are used for various tasks, including image classification and object detection, but not primarily for synthetic data generation. WaveNet is a deep generative model for raw audio synthesis, not for general-purpose synthetic data generation across different data types.
44
An ecommerce company wants to build a solution to determine customer sentiments based on written customer reviews of products. Which AWS services meet these requirements? (Choose two.) A. Amazon Lex B. Amazon Comprehend C. Amazon Polly D. Amazon Bedrock E. Amazon Rekognition
B and D Amazon Comprehend is a natural language processing (NLP) service that excels at analyzing text and extracting insights, including sentiment. It directly addresses the need to determine customer sentiment from written reviews. Amazon Bedrock provides access to foundation models capable of sentiment analysis via generative AI. This offers another powerful approach to analyzing customer reviews and understanding sentiment. Amazon Lex (A) is a service for building conversational interfaces, not directly relevant to sentiment analysis from text. Amazon Polly (C) is a text-to-speech service, unrelated to sentiment analysis. Amazon Rekognition (E) is for image and video analysis, not text.
45
A company is building an ML model. The company collected new data and analyzed the data by creating a correlation matrix, calculating statistics, and visualizing the data. Which stage of the ML pipeline is the company currently in? A. Data pre-processing B. Feature engineering C. Exploratory data analysis D. Hyperparameter tuning
C. Exploratory data analysis Exploratory data analysis (EDA) involves analyzing and visualizing data to understand its characteristics, identify patterns, and find anomalies. Creating correlation matrices, calculating statistics, and visualizing data are all common EDA tasks. The other options are incorrect because: A) Data pre-processing focuses on cleaning and preparing data for modeling; B) Feature engineering involves creating new features from existing ones; and D) Hyperparameter tuning optimizes model parameters after the model is built. The company's actions clearly indicate they are exploring and understanding the data, which is the defining characteristic of EDA.
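The EDA tasks named in the question (summary statistics and a correlation matrix) can be sketched with the standard library alone. The column names and values below are made-up illustration data, and `pearson` is a hypothetical helper; in practice a library such as pandas would usually do this.

```python
import math
import statistics

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical dataset with three numeric columns.
data = {
    "temperature": [20, 22, 25, 27, 30],
    "humidity": [80, 75, 70, 62, 55],
    "yield": [3.1, 3.4, 3.9, 4.2, 4.8],
}

# Summary statistics per column: (mean, sample standard deviation).
summary = {name: (statistics.mean(v), statistics.stdev(v)) for name, v in data.items()}

# Correlation matrix as a nested dict: matrix[a][b] is the correlation of a with b.
columns = list(data)
matrix = {a: {b: pearson(data[a], data[b]) for b in columns} for a in columns}
```

Here `matrix["temperature"]["humidity"]` comes out strongly negative and `matrix["temperature"]["yield"]` strongly positive, the kind of pattern EDA is meant to surface before any modeling starts.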
46
A company has documents that are missing some words because of a database error. The company wants to build an ML model that can suggest potential words to fill in the missing text. Which type of model meets this requirement? A. Topic modeling B. Clustering models C. Prescriptive ML models D. BERT-based models
D. BERT-based models The correct answer is D because BERT (Bidirectional Encoder Representations from Transformers) is a language model designed to understand context in text. It is trained to predict masked words from the words on both sides of the gap, which makes it highly effective at filling in missing words within sentences. Options A, B, and C are incorrect: topic modeling (A) identifies themes across a collection of documents and does not fill in missing words within individual documents; clustering models (B) group similar data points together, which is not relevant to predicting missing words from context; and prescriptive ML models (C) suggest actions to optimize an outcome and are not suited to filling in missing text.
47
An AI practitioner has built a deep learning model to classify the types of materials in images. Which metric will help the AI practitioner evaluate the performance of the model? A. Confusion matrix B. Correlation matrix C. R2 score D. Mean squared error (MSE)
A The correct answer is A, Confusion matrix. A confusion matrix is specifically designed to evaluate the performance of classification models. It provides a detailed breakdown of the model's predictions, including true positives, true negatives, false positives, and false negatives. This allows for the calculation of other important metrics like accuracy, precision, recall, and F1-score, which are all crucial for assessing the performance of a classification model. Options B, C, and D are incorrect because none of them evaluates a classifier. A correlation matrix measures the linear relationship between variables and is a data exploration tool, not a model metric. R2 score represents the proportion of variance in the dependent variable explained by the independent variable(s), and MSE measures the average squared difference between predicted and actual values; both are regression metrics.
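The breakdown described above can be worked through on a tiny example. The material labels and predictions are invented for illustration, and `confusion_counts` is a hypothetical helper that treats one class as "positive" (a one-vs-rest view of a multiclass problem).

```python
from collections import Counter

def confusion_counts(actual, predicted, positive):
    """Count TP, FP, FN, TN with `positive` treated as the positive class."""
    counts = Counter()
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            counts["tp"] += 1
        elif p == positive:
            counts["fp"] += 1
        elif a == positive:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts

# Hypothetical ground truth and model predictions for six images.
actual    = ["metal", "wood", "metal", "glass", "wood", "metal"]
predicted = ["metal", "metal", "metal", "glass", "wood", "wood"]

c = confusion_counts(actual, predicted, positive="metal")
accuracy = sum(a == p for a, p in zip(actual, predicted)) / len(actual)
precision = c["tp"] / (c["tp"] + c["fp"])  # of images predicted "metal", how many were metal
recall = c["tp"] / (c["tp"] + c["fn"])     # of actual metal images, how many were found
```

For the "metal" class this gives 2 true positives, 1 false positive, 1 false negative, and 2 true negatives, so precision and recall are both 2/3 while overall accuracy is 4/6.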
48
A company has built a chatbot that responds to natural language questions with images. To prevent the chatbot from returning inappropriate or unwanted images, which solution is most effective? A. Implement moderation APIs. B. Retrain the model with a general public dataset. C. Perform model validation. D. Automate user feedback integration.
A The correct answer is A because moderation APIs directly filter image content before it's displayed, blocking inappropriate material. Option B is incorrect because retraining with a general public dataset doesn't guarantee the removal of inappropriate content; it may even introduce more. Option C (model validation) is a good practice but doesn't actively prevent inappropriate images from being shown. Option D (automating user feedback) is reactive, not proactive, and doesn't prevent inappropriate images from being shown in the first place.
49
A company wants to use a large language model (LLM) on Amazon Bedrock for sentiment analysis. The company needs the LLM to produce more consistent responses to the same input prompt. Which adjustment to an inference parameter should the company make to meet these requirements? A. Decrease the temperature value. B. Increase the temperature value. C. Decrease the length of output tokens. D. Increase the maximum generation length.
A The correct answer is A because decreasing the temperature value reduces the randomness in the LLM's output, leading to more consistent responses for the same input. Increasing the temperature (B) would increase randomness and variability. Decreasing the length of output tokens (C) and increasing the maximum generation length (D) affect the length of the response, not its consistency.
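The effect of temperature can be illustrated with the usual softmax-with-temperature formulation, in which logits are divided by the temperature before normalization. This is a generic sketch of the idea, not Amazon Bedrock's internal implementation, and the logits are made-up values.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after scaling by 1/temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate tokens

low = softmax_with_temperature(logits, 0.2)   # sharply peaked distribution
high = softmax_with_temperature(logits, 2.0)  # much flatter distribution
```

At temperature 0.2 the top token takes more than 99% of the probability mass, so sampling returns it almost every time; at temperature 2.0 it takes less than half, so repeated runs on the same prompt vary more.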
50
A company built a deep learning model for object detection and deployed the model to production. Which AI process occurs when the model analyzes a new image to identify objects? A. Training B. Inference C. Model deployment D. Bias correction
B. Inference
51
A company is training a foundation model (FM) and wants to increase the model's accuracy to a specific acceptance level. Which solution will meet these requirements? A. Decrease the batch size. B. Increase the epochs. C. Decrease the epochs. D. Increase the temperature parameter.
B. Increase the epochs. Increasing the number of epochs allows the model to train on the data for a longer period, potentially improving accuracy by learning more complex patterns. Decreasing the batch size (A) might affect training stability but doesn't guarantee increased accuracy. Decreasing the epochs (C) reduces training time but likely decreases accuracy. Increasing the temperature parameter (D) affects the model's output randomness during inference, not training accuracy.
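A toy gradient-descent run shows why more epochs can raise accuracy on an underfit model. The one-weight linear model, data, and learning rate are illustrative assumptions, far simpler than foundation model training; note also that past a certain point extra epochs can cause overfitting rather than improvement.

```python
def train(epochs, lr=0.01):
    """Fit y = w * x by gradient descent on mean squared error; return (w, loss)."""
    xs = [1.0, 2.0, 3.0, 4.0]
    ys = [2.0, 4.0, 6.0, 8.0]  # generated by the true weight w = 2
    w = 0.0
    for _ in range(epochs):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad
    loss = sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
    return w, loss

w_few, loss_few = train(epochs=5)     # stops well short of the true weight
w_many, loss_many = train(epochs=50)  # converges close to w = 2
```

With 5 epochs the weight is still far from 2 and the loss is large; with 50 epochs the weight is within 0.01 of 2 and the loss is close to zero.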
52
A company is building a large language model (LLM) question answering chatbot to decrease the number of actions call center employees need to take to respond to customer questions. Which business objective should the company use to evaluate the effect of the LLM chatbot? A. Website engagement rate B. Average call duration C. Corporate social responsibility D. Regulatory compliance
B
53
A company is developing a new model to predict the prices of specific items. The model performed well on the training dataset. When the company deployed the model to production, the model's performance decreased significantly. What should the company do to mitigate this problem? A. Reduce the volume of data that is used in training. B. Add hyperparameters to the model. C. Increase the volume of data that is used in training. D. Increase the model training time.
C. Increase the volume of data that is used in training. The significant decrease in model performance after deployment suggests the model is overfitting the training data. Overfitting occurs when a model learns the training data too well, including its noise and outliers, and fails to generalize to new, unseen data. Increasing the volume of training data is a common solution to this problem. More data helps the model learn the underlying patterns more effectively and reduces the impact of noise, leading to better generalization and improved performance on production data. Option A is incorrect because reducing the training data would likely exacerbate the overfitting problem. Option B is incorrect because adding hyperparameters without addressing the underlying issue of overfitting is unlikely to solve the problem. While hyperparameter tuning is important for model optimization, it's not the primary solution in this case of a significant performance drop after deployment. Option D is incorrect because increasing training time without addressing the overfitting issue might lead to even more overfitting, not improved generalization.
54
A company wants to use large language models (LLMs) with Amazon Bedrock to develop a chat interface for its product manuals, which are stored as PDF files. Which solution offers the MOST cost-effective approach? A. Use prompt engineering to add one PDF file as context to the user prompt when the prompt is submitted to Amazon Bedrock. B. Use prompt engineering to add all the PDF files as context to the user prompt when the prompt is submitted to Amazon Bedrock. C. Use all the PDF documents to fine-tune a model with Amazon Bedrock. Use the fine-tuned model to process user prompts. D. Upload PDF documents to an Amazon Bedrock knowledge base. Use the knowledge base to provide context when users submit prompts to Amazon Bedrock.
D Option D is the most cost-effective because using an Amazon Bedrock knowledge base is designed for efficient context retrieval from large document sets. Options A and B are impractical for large numbers of PDF files due to context window limitations of LLMs; trying to include all PDFs (B) would likely be impossible or prohibitively expensive. Option C, fine-tuning, is generally more expensive than using a knowledge base because it requires significant computational resources and time. The knowledge base approach allows for efficient querying and retrieval of relevant information from the PDF documents without the need for extensive prompt engineering or costly model fine-tuning.
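The cost argument rests on sending only the relevant passages to the model instead of every document. The sketch below shows that retrieval step in miniature. It ranks chunks by keyword overlap as a stand-in for the vector search a knowledge base would perform; the `retrieve` helper and the manual excerpts are hypothetical.

```python
def retrieve(question, chunks, top_k=1):
    """Return the top_k chunks sharing the most words with the question."""
    q_words = set(question.lower().split())
    ranked = sorted(
        chunks,
        key=lambda chunk: len(q_words & set(chunk.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]

# Hypothetical excerpts extracted from product manuals.
chunks = [
    "To reset the router hold the reset button for ten seconds",
    "The warranty covers parts for two years",
    "Clean the filter monthly with warm water",
]

question = "how do i reset the router"
context = retrieve(question, chunks)

# Only the retrieved chunk is added to the prompt, keeping input tokens low.
prompt = f"Context: {context[0]}\n\nQuestion: {question}"
```

Because the prompt carries one short chunk rather than all the manuals, each request costs far fewer input tokens than option B and avoids the training cost of option C.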
55
A digital devices company wants to predict customer demand for memory hardware. The company does not have coding experience or knowledge of ML algorithms and needs to develop a data-driven predictive model. The company needs to perform analysis on internal data and external data. Which solution will meet these requirements? A. Store the data in Amazon S3. Create ML models and demand forecast predictions by using Amazon SageMaker built-in algorithms that use the data from Amazon S3. B. Import the data into Amazon SageMaker Data Wrangler. Create ML models and demand forecast predictions by using SageMaker built-in algorithms. C. Import the data into Amazon SageMaker Data Wrangler. Build ML models and demand forecast predictions by using an Amazon Personalize Trending-Now recipe. D. Import the data into Amazon SageMaker Canvas. Build ML models and demand forecast predictions by selecting the values in the data from SageMaker Canvas.
D The correct answer is D because Amazon SageMaker Canvas is a no-code/low-code machine learning service. This directly addresses the company's lack of coding experience and need for a data-driven predictive model. Options A, B, and C all require some level of coding or familiarity with ML algorithms, making them unsuitable for this company.
56
A research company implemented a chatbot using a foundation model (FM) from Amazon Bedrock. This chatbot searches for answers to questions from a large database of research papers. After multiple prompt engineering attempts, the company notices that the FM is performing poorly because of the complex scientific terms in the research papers. How can the company improve the performance of the chatbot? A. Use few-shot prompting to define how the FM can answer the questions. B. Use domain adaptation fine-tuning to adapt the FM to complex scientific terms. C. Change the FM inference parameters. D. Clean the research paper data to remove complex scientific terms.
B
57
A company wants to develop a large language model (LLM) application using Amazon Bedrock and customer data stored in Amazon S3. Their security policy mandates that each team can only access data for their own customers. Which solution best meets these requirements? A. Create an Amazon Bedrock custom service role for each team, granting access only to that team's customer data. B. Create a custom service role with Amazon S3 access. Require teams to specify the customer name on each Amazon Bedrock request. C. Redact personal data in Amazon S3 and update the S3 bucket policy to allow team access to customer data. D. Create one Amazon Bedrock role with full Amazon S3 access. Create IAM roles for each team, granting access only to their respective customer folders.
A The correct answer is A because it directly addresses the security requirement by isolating access at the role level. Each team gets its own role with restricted access to only its designated customer data within S3. This provides the strongest isolation and minimizes the risk of unintended data access. Option B relies on application-level controls (specifying customer names in requests), which is less secure and more prone to errors. Option C, while addressing data privacy through redaction, doesn't fully solve the access control problem; teams still need separate permissions to control which redacted data they can access. Option D creates a central point of access (the Bedrock role with full S3 access), creating a single point of failure and increasing the risk of a security breach impacting multiple teams.
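The per-team isolation in option A comes down to the permissions policy attached to each team's role. The sketch below builds such a policy as a Python dict. The bucket name, the prefix layout (one top-level folder per team), and the `team_s3_policy` helper are assumptions for illustration; a real setup would also need a trust policy that lets Amazon Bedrock assume the role.

```python
import json

def team_s3_policy(bucket, team_prefix):
    """Build an IAM policy document limiting S3 reads to one team's prefix."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ListOwnPrefix",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": [f"{team_prefix}/*"]}},
            },
            {
                "Sid": "ReadOwnObjects",
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/{team_prefix}/*",
            },
        ],
    }

# Hypothetical bucket and team folder names.
policy = team_s3_policy("example-customer-data", "team-a")
policy_json = json.dumps(policy, indent=2)
```

Each team's role gets its own copy of this policy with a different prefix, so a request made under one team's role cannot read another team's objects.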
58
A medical company deployed a disease detection model on Amazon Bedrock. To comply with privacy policies, the company wants to prevent the model from including personal patient information in its responses. The company also wants to receive notification when policy violations occur. Which solution meets these requirements? A. Use Amazon Macie to scan the model's output for sensitive data and set up alerts for potential violations. B. Configure AWS CloudTrail to monitor the model's responses and create alerts for any detected personal information. C. Use Guardrails for Amazon Bedrock to filter content. Set up Amazon CloudWatch alarms for notification of policy violations. D. Implement Amazon SageMaker Model Monitor to detect data drift and receive alerts when model quality degrades.
C The correct answer is C because Guardrails for Amazon Bedrock directly addresses the need to filter content and prevent the inclusion of personal information in the model's responses. Amazon CloudWatch alarms provide the necessary notification system for policy violations. Option A is incorrect because Amazon Macie discovers sensitive data stored in Amazon S3; it does not sit in the Amazon Bedrock response path, so it cannot keep personal information out of a response before the response is returned. Option B is incorrect because AWS CloudTrail is primarily for logging and monitoring API calls and activity within AWS, not for filtering content generated by a model. Option D is incorrect because Amazon SageMaker Model Monitor focuses on model quality and data drift, not on preventing the generation of responses containing sensitive information.
59
An education provider is building a question and answer application that uses a generative AI model to explain complex concepts. The education provider wants to automatically change the style of the model response depending on who is asking the question. The education provider will give the model the age range of the user who has asked the question. Which solution meets these requirements with the LEAST implementation effort? A. Fine-tune the model by using additional training data that is representative of the various age ranges that the application will support. B. Add a role description to the prompt context that instructs the model of the age range that the response should target. C. Use chain-of-thought reasoning to deduce the correct style and complexity for a response suitable for that user. D. Summarize the response text depending on the age of the user so that younger users receive shorter responses.
B
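Option B can be sketched as a small prompt template. The function name, the wording of the role description, and the age-range format are illustrative assumptions; the resulting string would be sent to the model as the prompt.

```python
def build_prompt(question, age_range):
    """Prefix the question with a role description targeting the given age range."""
    role = (
        "You are a patient tutor. "
        f"Explain concepts for a learner aged {age_range}, "
        "using vocabulary and examples suited to that age range."
    )
    return f"{role}\n\nQuestion: {question}"

# Same question, two different audiences.
prompt_child = build_prompt("Why is the sky blue?", "8-10")
prompt_adult = build_prompt("Why is the sky blue?", "18+")
```

Only the prompt changes between users, so no retraining or extra processing step is needed, which is why this is the lowest-effort option.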
60
An accounting firm wants to implement a large language model (LLM) to automate document processing. The firm must proceed responsibly to avoid potential harms. What should the firm do when developing and deploying the LLM? (Choose two.) A. Include fairness metrics for model evaluation. B. Adjust the temperature parameter of the model. C. Modify the training data to mitigate bias. D. Avoid overfitting on the training data. E. Apply prompt engineering techniques.
A, C
61
A company wants to build an interactive application for children that generates new stories based on classic stories. The company wants to use Amazon Bedrock and needs to ensure that the results and topics are appropriate for children. Which AWS service or feature will meet these requirements? A. Amazon Rekognition B. Amazon Bedrock playgrounds C. Guardrails for Amazon Bedrock D. Agents for Amazon Bedrock
C. Guardrails for Amazon Bedrock Guardrails for Amazon Bedrock are safety mechanisms that can be applied to ensure content generated by foundation models adheres to specific safety and appropriateness guidelines. This makes it the ideal choice to ensure the generated stories are suitable for children and avoid inappropriate content. Amazon Rekognition is an image and video analysis service, not relevant to text generation. Amazon Bedrock playgrounds are environments for experimenting with models, not content filtering tools. Agents for Amazon Bedrock orchestrate multistep tasks by calling APIs and data sources; they don't filter content for appropriateness.
62
A company is implementing the Amazon Titan foundation model (FM) by using Amazon Bedrock. The company needs to supplement the model by using relevant data from the company's private data sources. Which solution will meet this requirement? A. Use a different FM. B. Choose a lower temperature value. C. Create an Amazon Bedrock knowledge base. D. Enable model invocation logging.
C
63
A company has developed an ML model for image classification. The company wants to deploy the model to production so that a web application can use the model. The company needs to implement a solution to host the model and serve predictions without managing any of the underlying infrastructure. Which solution will meet these requirements? A. Use Amazon SageMaker Serverless Inference to deploy the model. B. Use Amazon CloudFront to deploy the model. C. Use Amazon API Gateway to host the model and serve predictions. D. Use AWS Batch to host the model and serve predictions.
A Amazon SageMaker Serverless Inference is a fully managed service designed for deploying and serving machine learning models without requiring the user to manage underlying infrastructure. This directly addresses the company's requirement of deploying the model and serving predictions without managing infrastructure. Option B is incorrect because Amazon CloudFront is a content delivery network (CDN) and is not designed for hosting and serving ML models. Option C is incorrect because while Amazon API Gateway can be used to create an API to access a model, it doesn't inherently host or manage the model itself; it requires integration with a separate hosting service. Option D is incorrect because AWS Batch is a batch processing service, not suited for real-time prediction serving required by a web application.
64
A company is building an ML model to analyze archived data. The company must perform inference on large datasets that are multiple GBs in size. The company does not need to access the model predictions immediately. Which Amazon SageMaker inference option will meet these requirements? A. Batch transform B. Real-time inference C. Serverless inference D. Asynchronous inference
A. Batch transform Batch transform is the most suitable option because it is designed for processing large datasets (multiple GBs) without the need for immediate predictions. Real-time inference requires immediate responses, which is not a requirement here. Serverless inference, while scalable, is still geared towards individual requests rather than bulk processing. Asynchronous inference, while handling large requests, still involves individual endpoint calls, making batch transform a more efficient choice for this large-scale, archived data processing scenario.
65
A company has installed a security camera and uses a machine learning (ML) model to analyze footage for potential thefts. The model disproportionately flags people of a specific ethnic group. Which type of bias is affecting the model output? A. Measurement bias B. Sampling bias C. Observer bias D. Confirmation bias
B. Sampling bias The correct answer is B because sampling bias occurs when the data used to train the model does not accurately represent the real-world population. If the training dataset contains more examples of people from a specific ethnic group associated with suspicious activities, the model learns skewed patterns and disproportionately applies those associations, leading to unfair discrimination. A is incorrect because measurement bias refers to systematic errors in the measurement process itself, not the composition of the data used for training. C is incorrect because observer bias relates to the subjective interpretation of data by a human observer, which is not the case here as the analysis is performed by an ML model. D is incorrect because confirmation bias describes the tendency to search for or interpret information in a way that confirms pre-existing beliefs, which is not directly related to the disproportionate flagging of a specific ethnic group by the model.
66
An AI practitioner is using an Amazon Bedrock base model to summarize session chats from the customer service department. The AI practitioner wants to store invocation logs to monitor model input and output data. Which strategy should the AI practitioner use? A. Configure AWS CloudTrail as the logs destination for the model. B. Enable invocation logging in Amazon Bedrock. C. Configure AWS Audit Manager as the logs destination for the model. D. Configure model invocation logging in Amazon EventBridge.
B The correct answer is B because Amazon Bedrock has a built-in feature to enable invocation logging, which delivers model input and output data to Amazon CloudWatch Logs or Amazon S3. This directly addresses the practitioner's need to store and monitor input and output data from the model's invocations. Option A is incorrect because AWS CloudTrail is a service for logging API calls and management events, not specifically for model invocation details within Amazon Bedrock. Option C is incorrect because AWS Audit Manager focuses on compliance and auditing, not detailed model invocation logging. Option D is incorrect because Amazon EventBridge routes events between services; it is not where model invocation logging is configured, and Bedrock offers this functionality natively.
67
Which strategy evaluates the accuracy of a foundation model (FM) that is used in image classification tasks? A. Calculate the total cost of resources used by the model. B. Measure the model's accuracy against a predefined benchmark dataset. C. Count the number of layers in the neural network. D. Assess the color accuracy of images processed by the model.
B
68
A company wants to display the total sales for its top-selling products across various retail locations in the past 12 months. Which AWS solution should the company use to automate the generation of graphs? A. Amazon Q in Amazon EC2 B. Amazon Q Developer C. Amazon Q in Amazon QuickSight D. Amazon Q in AWS Chatbot
C. Amazon Q in Amazon QuickSight Amazon QuickSight is a fully managed business intelligence service that allows users to easily create and publish interactive dashboards and visualizations, including graphs. Amazon Q, integrated within QuickSight, enables natural language querying to automate the generation of these visualizations based on the user's questions about their data. This makes it the ideal solution for automating the creation of graphs showing total sales data. Option A is incorrect because Amazon EC2 is a compute service; it doesn't offer built-in business intelligence or graph generation capabilities. Option B is incorrect because Amazon Q Developer is a generative AI assistant for software development tasks such as writing and debugging code, not for generating business visualizations. Option D is incorrect because AWS Chatbot is a conversational interface, not a business intelligence tool capable of generating complex graphs from sales data.
69
A company is using few-shot prompting on a base model hosted on Amazon Bedrock. The model currently uses 10 examples in the prompt and is invoked once daily, performing well. The company wants to lower the monthly cost. Which solution will meet these requirements? A. Customize the model by using fine-tuning. B. Decrease the number of tokens in the prompt. C. Increase the number of tokens in the prompt. D. Use Provisioned Throughput.
B
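On-demand model invocation is billed per token, so trimming the few-shot examples lowers the bill directly. The arithmetic can be sketched as follows; the token counts and the price per 1,000 input tokens are hypothetical placeholders, not actual Amazon Bedrock prices.

```python
def monthly_input_cost(tokens_per_example, num_examples, question_tokens,
                       price_per_1k_tokens, invocations_per_month=30):
    """Estimate monthly input-token cost for a few-shot prompt invoked once daily."""
    prompt_tokens = tokens_per_example * num_examples + question_tokens
    return prompt_tokens / 1000 * price_per_1k_tokens * invocations_per_month

# Hypothetical figures: 200 tokens per example, a 100-token question,
# and a made-up price of 0.003 per 1,000 input tokens.
cost_ten_examples = monthly_input_cost(200, 10, 100, 0.003)
cost_three_examples = monthly_input_cost(200, 3, 100, 0.003)
```

With these placeholder numbers, cutting from 10 examples to 3 reduces the prompt from 2,100 to 700 tokens and the monthly input cost to a third. At one invocation per day, Provisioned Throughput or fine-tuning would add fixed costs that this usage level is unlikely to justify.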
70
A company wants to use a pre-trained generative AI model to generate content for its marketing campaigns. The company needs to ensure that the generated content aligns with the company's brand voice and messaging requirements. Which solution meets these requirements? A. Optimize the model's architecture and hyperparameters to improve the model's overall performance. B. Increase the model's complexity by adding more layers to the model's architecture. C. Create effective prompts that provide clear instructions and context to guide the model's generation. D. Select a large, diverse dataset to pre-train a new generative model.
C The correct answer is C because using effective prompts allows direct control over the generated content's style and message. Options A and B involve altering the pre-trained model itself, which is unnecessary and potentially problematic. Option D suggests training a new model, which is inefficient and defeats the purpose of using a pre-trained model. Only option C directly addresses the need to align the output with the company's brand voice and messaging using readily available tools.
71
A company needs to choose an Amazon Bedrock model for internal use. The company's primary requirement is to select a model that generates responses in a style preferred by its employees. What is the best approach to meet this requirement? A. Evaluate the models by using built-in prompt datasets. B. Evaluate the models by using a human workforce and custom prompt datasets. C. Use public model leaderboards to identify the model. D. Use the model InvocationLatency runtime metrics in Amazon CloudWatch when trying models.
B The correct answer is B because it directly addresses the company's need to assess the models based on employee preferences. Using a human workforce allows for subjective evaluation of response style, and custom prompt datasets ensure that the models are evaluated on prompts relevant to the company's use case. Option A is incorrect because built-in datasets may not reflect the company's specific needs and style preferences. Option C is incorrect because public leaderboards focus on general performance metrics, not stylistic preferences. Option D is incorrect because it focuses on performance metrics (latency) rather than the qualitative aspect of response style.
72
A company manually reviews all submitted resumes in PDF format. As the company grows, the company expects the volume of resumes to exceed the company's review capacity. The company needs an automated system to convert the PDF resumes into plain text format for additional processing. Which AWS service meets this requirement? A. Amazon Textract B. Amazon Personalize C. Amazon Lex D. Amazon Transcribe
A. Amazon Textract Amazon Textract is the correct answer because it is an AWS service specifically designed to extract text and data from documents, including PDFs. This directly addresses the company's need to automate the conversion of PDF resumes into plain text. Option B, Amazon Personalize, is incorrect because it's a recommendation engine, not a document processing service. Option C, Amazon Lex, is incorrect because it's a service for building conversational interfaces, not for document conversion. Option D, Amazon Transcribe, is incorrect because it converts speech to text, not documents to text.
73
A company is building a chatbot using Amazon Bedrock's large language model (LLM) for intent detection. They want to improve accuracy using few-shot learning. Which additional data is needed? A. Pairs of chatbot responses and correct user intents B. Pairs of user messages and correct chatbot responses C. Pairs of user messages and correct user intents D. Pairs of user intents and correct chatbot responses
C
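As a rough illustration of why pairs of user messages and correct intents are the data that few-shot prompting needs, the sketch below assembles such pairs into a prompt. The function name, the instruction wording, and the example intents are hypothetical; the card does not prescribe a prompt format.

```python
def build_few_shot_prompt(examples, new_message):
    """Build a few-shot intent-detection prompt from (user message, intent) pairs."""
    parts = ["Classify the intent of each user message."]
    for message, intent in examples:
        # Each example shows the model an input together with its correct label.
        parts.append(f"Message: {message}\nIntent: {intent}")
    # The final block leaves the label blank for the model to complete.
    parts.append(f"Message: {new_message}\nIntent:")
    return "\n\n".join(parts)


# Hypothetical labeled examples.
examples = [
    ("Where is my package?", "track_order"),
    ("I want my money back.", "request_refund"),
]
prompt = build_few_shot_prompt(examples, "Has my order shipped yet?")
print(prompt)
```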
74
A large retailer receives thousands of customer support inquiries daily and needs to process them quickly. They are considering implementing Amazon Bedrock Agents. Which key benefit of Amazon Bedrock Agents would be MOST helpful in this scenario? A. Generation of custom foundation models (FMs) to predict customer needs B. Automation of repetitive tasks and orchestration of complex workflows C. Automatically calling multiple foundation models (FMs) and consolidating the results D. Selecting the foundation model (FM) based on predefined criteria and metrics
B The correct answer is B because Amazon Bedrock Agents excel at automating repetitive tasks and orchestrating complex workflows. Customer support often involves many similar inquiries, making automation ideal for efficiency. Options A, C, and D do not describe the core benefit needed to rapidly process a high volume of customer support requests: A focuses on prediction, C on consolidating model outputs, and D on model selection, all secondary to efficient task automation.
75
What are tokens in the context of generative AI models? A. Tokens are the basic units of input and output that a generative AI model operates on, representing words, subwords, or other linguistic units. B. Tokens are the mathematical representations of words or concepts used in generative AI models. C. Tokens are the pre-trained weights of a generative AI model that are fine-tuned for specific tasks. D. Tokens are the specific prompts or instructions given to a generative AI model to generate output.
A
76
A company is using Amazon Bedrock to build generative AI applications. Which factor will primarily drive the inference costs associated with using a large language model (LLM) to generate inferences? A. Number of tokens consumed B. Temperature value C. Amount of data used to train the LLM D. Total training time
A The correct answer is A because the number of tokens consumed directly impacts the computational resources required for inference. More tokens mean more processing, resulting in higher costs. Option B is an inference parameter, but temperature only affects the randomness of the model's output, not the cost. Options C and D relate to training: the amount of training data and the training time impact the model's development cost, not its runtime inference cost.
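The relationship between token count and inference cost can be sketched with simple arithmetic. The per-token prices below are hypothetical placeholders; actual Amazon Bedrock prices depend on the model and Region.

```python
# Hypothetical prices in USD per 1,000 tokens (not real Amazon Bedrock prices).
PRICE_PER_1K_INPUT_TOKENS = 0.003
PRICE_PER_1K_OUTPUT_TOKENS = 0.015


def inference_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one invocation from its token counts."""
    input_cost = (input_tokens / 1000) * PRICE_PER_1K_INPUT_TOKENS
    output_cost = (output_tokens / 1000) * PRICE_PER_1K_OUTPUT_TOKENS
    return input_cost + output_cost


# Same output length, but a prompt that is ten times longer.
short_prompt_cost = inference_cost(input_tokens=500, output_tokens=200)
long_prompt_cost = inference_cost(input_tokens=5000, output_tokens=200)
print(short_prompt_cost, long_prompt_cost)
```

Under these assumed prices, trimming the prompt lowers the cost of every call, which is also why reducing prompt tokens is a cost lever for few-shot prompting.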
77
A company has a foundation model (FM) that was customized by using Amazon Bedrock to answer customer queries about products. The company wants to validate the model's responses to new types of queries. The company needs to upload a new dataset that Amazon Bedrock can use for validation. Which AWS service meets these requirements? A. Amazon S3 B. Amazon Elastic Block Store (Amazon EBS) C. Amazon Elastic File System (Amazon EFS) D. AWS Snowcone
A. Amazon S3 Amazon S3 is the correct answer because it's a scalable, durable, and secure object storage service designed for storing large datasets, and Amazon Bedrock reads customization and validation datasets from S3 buckets. Option B (Amazon EBS) is incorrect because it is a block storage service that provides persistent volumes for EC2 instances, not a location that Bedrock reads datasets from. Option C (Amazon EFS) is incorrect because it's a file system designed for shared file access from compute resources, not a location that Bedrock reads validation datasets from. Option D (AWS Snowcone) is incorrect because it's a physical device for transferring large datasets, not a service for online storage and retrieval.
78
A student at a university is copying content from generative AI to write essays. Which challenge of responsible generative AI does this scenario represent? A. Toxicity B. Hallucinations C. Plagiarism D. Privacy
C. Plagiarism The correct answer is C because the student is directly copying AI-generated content and submitting it as their own work without proper attribution. This is the definition of plagiarism. Option A, Toxicity, refers to AI generating harmful or offensive content. Option B, Hallucinations, refers to AI generating factually incorrect information. Option D, Privacy, refers to the potential misuse of personal data by AI systems. None of these options directly address the issue of academic dishonesty presented in the scenario.
79
Which term describes the numerical representations of real-world objects and concepts that AI and natural language processing (NLP) models use to improve understanding of textual information? A. Embeddings B. Tokens C. Models D. Binaries
A. Embeddings Embeddings are the correct answer because they are specifically designed as numerical representations that capture the semantic meaning of words, phrases, or documents. This allows AI and NLP models to understand the relationships between these concepts and improve their comprehension of text. Option B, Tokens, is incorrect because tokens are simply individual units of text (words, punctuation, etc.) and don't inherently represent semantic meaning. Option C, Models, is too broad; models use embeddings, but are not the embeddings themselves. Option D, Binaries, is incorrect as it refers to a data type (0s and 1s) and not a semantic representation.
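To make the idea of embeddings capturing semantic relationships concrete, here is a minimal sketch that compares toy vectors with cosine similarity. The three-dimensional vectors are invented for illustration; real embedding models produce vectors with hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings, not output from a real model.
embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "kitten": [0.85, 0.75, 0.2],
    "car": [0.1, 0.2, 0.9],
}

sim_related = cosine_similarity(embeddings["cat"], embeddings["kitten"])
sim_unrelated = cosine_similarity(embeddings["cat"], embeddings["car"])
print(sim_related, sim_unrelated)
```

With these toy values, "cat" is closer to "kitten" than to "car", which is the kind of relationship a model can exploit when text is represented as vectors.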
80
An AI practitioner is using a large language model (LLM) to create content for marketing campaigns. The generated content sounds plausible and factual but is incorrect. Which problem is the LLM having? A. Data leakage B. Hallucination C. Overfitting D. Underfitting
B. Hallucination Hallucination is the correct answer because it describes the LLM generating content that appears realistic but is factually inaccurate. Data leakage refers to the unintended exposure of training data, overfitting means the model performs well on training data but poorly on unseen data, and underfitting means the model is too simple to capture the underlying patterns in the data. None of these accurately describe an LLM generating plausible but incorrect information.
81
A company is building a customer service chatbot. The company wants the chatbot to improve its responses by learning from past interactions and online resources. Which AI learning strategy provides this self-improvement capability? A. Supervised learning with a manually curated dataset of good responses and bad responses B. Reinforcement learning with rewards for positive customer feedback C. Unsupervised learning to find clusters of similar customer inquiries D. Supervised learning with a continuously updated FAQ database
B Reinforcement learning is the correct answer because it allows the chatbot to learn from its interactions with customers and adjust its responses based on positive feedback. This directly addresses the requirement for self-improvement based on past interactions. Options A and D rely on pre-defined datasets and therefore lack the continuous self-improvement capability. Option C, unsupervised learning, focuses on identifying patterns in data rather than improving responses based on feedback.
82
An AI practitioner trained a custom model on Amazon Bedrock by using a training dataset that contains confidential data. The AI practitioner wants to ensure that the custom model does not generate inference responses based on confidential data. How should the AI practitioner prevent responses based on confidential data? A. Delete the custom model. Remove the confidential data from the training dataset. Retrain the custom model. B. Mask the confidential data in the inference responses by using dynamic data masking. C. Encrypt the confidential data in the inference responses by using Amazon SageMaker. D. Encrypt the confidential data in the custom model by using AWS Key Management Service (AWS KMS).
A The correct answer is A because the model may have memorized the confidential data during training. Deleting the model removes the risk of the model directly outputting the confidential information. Removing the confidential data from the training dataset and retraining prevents the new model from learning and subsequently using the sensitive information. Options B, C, and D attempt to mitigate the issue after the fact, but they do not address the root problem of the model having already learned the confidential data. They also do not guarantee that the confidential data won't still be indirectly reflected in the model's output in other ways.
83
Which option is a benefit of ongoing pre-training when fine-tuning a foundation model (FM)? A. Helps decrease the model's complexity B. Improves model performance over time C. Decreases the training time requirement D. Optimizes model inference time
B. Improves model performance over time Ongoing pre-training allows the foundation model to learn from and adapt to new data continuously. This results in improved performance and accuracy over time as the model becomes better at handling various tasks and contexts. Option A is incorrect because ongoing pre-training typically increases the model's knowledge and thus its complexity, not decreases it. Option C is incorrect because pre-training adds to the overall training time; it doesn't decrease it. Option D is incorrect because ongoing pre-training primarily affects the model's accuracy and adaptability, not its inference speed.
84
A company has built a solution using generative AI and large language models (LLMs) to translate training manuals from English into other languages. The company wants to evaluate the accuracy of the translated manuals. Which model evaluation strategy best meets these requirements? A. Bilingual Evaluation Understudy (BLEU) B. Root mean squared error (RMSE) C. Recall-Oriented Understudy for Gisting Evaluation (ROUGE) D. F1 score
A. Bilingual Evaluation Understudy (BLEU) BLEU is the correct answer because it is specifically designed to evaluate the quality of machine-translated text by comparing it to human-created reference translations. This directly addresses the company's need to assess the accuracy of its AI-powered translation solution. RMSE is used for regression models, not text translation. ROUGE is used for text summarization, not translation. The F1 score is used for classification problems. These metrics are therefore inappropriate for evaluating the accuracy of translations.
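The comparison against a reference translation can be sketched with clipped (modified) unigram precision, one ingredient of BLEU. This is a simplified illustration, not the full metric: real BLEU combines several n-gram orders and applies a brevity penalty. The function name and sentences are made up for the example.

```python
from collections import Counter


def modified_unigram_precision(candidate: str, reference: str) -> float:
    """Fraction of candidate words found in the reference, with clipped counts."""
    candidate_tokens = candidate.lower().split()
    if not candidate_tokens:
        return 0.0
    candidate_counts = Counter(candidate_tokens)
    reference_counts = Counter(reference.lower().split())
    # Clip each word's count so repeating a word cannot inflate the score.
    clipped = sum(
        min(count, reference_counts[token])
        for token, count in candidate_counts.items()
    )
    return clipped / len(candidate_tokens)


reference = "the cat is on the mat"
score = modified_unigram_precision("the the the cat", reference)
print(score)  # 0.75: "the" is credited twice (its count in the reference), "cat" once
```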
85
A company is using Amazon SageMaker Studio notebooks to build and train ML models. The company stores the data in an Amazon S3 bucket. The company needs to manage the flow of data from Amazon S3 to SageMaker Studio notebooks. Which solution will meet this requirement? A. Use Amazon Inspector to monitor SageMaker Studio. B. Use Amazon Macie to monitor SageMaker Studio. C. Configure SageMaker to use a VPC with an S3 endpoint. D. Configure SageMaker to use S3 Glacier Deep Archive.
C. Configure SageMaker to use a VPC with an S3 endpoint. This is the correct answer because using a VPC with an S3 endpoint creates a private connection between SageMaker Studio and the S3 bucket. This ensures secure and efficient data transfer within the AWS network, avoiding the public internet and improving both security and performance. Option A is incorrect because Amazon Inspector assesses workloads such as EC2 instances for security vulnerabilities and does not manage data flow between S3 and SageMaker. Option B is incorrect because Amazon Macie is a data security service that discovers sensitive data in Amazon S3; it does not manage data transfer between S3 and SageMaker. Option D is incorrect because S3 Glacier Deep Archive is for archiving data, not for actively accessing and using data for model training in SageMaker. Data retrieval from Glacier Deep Archive is significantly slower than accessing data from a standard S3 bucket, making it unsuitable for this use case.
86
A social media company wants to use a large language model (LLM) to summarize messages. The company has chosen a few LLMs available on Amazon SageMaker JumpStart and wants to compare the generated output toxicity of these models. Which strategy gives the company the ability to evaluate the LLMs with the LEAST operational overhead? A. Crowd-sourced evaluation B. Automatic model evaluation C. Model evaluation with human workers D. Reinforcement learning from human feedback (RLHF)
B Automatic model evaluation is the correct answer because it requires minimal human intervention, making it the most operationally efficient method. Crowd-sourced evaluation and model evaluation with human workers both require significant human resources and time, leading to higher operational overhead. Reinforcement learning from human feedback (RLHF) is a training method, not an evaluation method, and is even more resource-intensive than human-based evaluation.
87
A company is testing the security of a foundation model (FM). During testing, the company wants to get around the safety features and make harmful content. Which security technique is this an example of? A. Fuzzing training data to find vulnerabilities B. Denial of service (DoS) C. Penetration testing with authorization D. Jailbreak
D
88
A company is introducing a mobile app that helps users learn foreign languages. The app makes text more coherent by calling a large language model (LLM). The company collected a diverse dataset of text and supplemented the dataset with examples of more readable versions. The company wants the LLM output to resemble the provided examples. Which metric should the company use to assess whether the LLM meets these requirements? A. Value of the loss function B. Semantic robustness C. Recall-Oriented Understudy for Gisting Evaluation (ROUGE) score D. Latency of the text generation
C. Recall-Oriented Understudy for Gisting Evaluation (ROUGE) score The correct answer is C because ROUGE is specifically designed to evaluate the quality of generated text by comparing it to reference texts. Since the company wants the LLM output to be similar to the provided examples of readable text, ROUGE's ability to measure n-gram overlap, precision, recall, and F1 score makes it the most appropriate metric. Option A (Value of the loss function) is incorrect because the loss function measures the error during model training, not the quality of the final output compared to human-written examples. Option B (Semantic robustness) is too broad; while important, it doesn't directly measure the similarity to the provided examples. Option D (Latency of text generation) measures the speed of the LLM, not the quality of its output.
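A minimal sketch of the n-gram overlap idea, using ROUGE-1 (unigram overlap) between a model output and a reference example. It splits on whitespace only; production implementations usually add stemming and other ROUGE variants. The function name and sentences are hypothetical.

```python
from collections import Counter


def rouge_1(candidate: str, reference: str) -> dict:
    """Unigram-overlap precision, recall, and F1 between candidate and reference."""
    candidate_counts = Counter(candidate.lower().split())
    reference_counts = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count of each shared word.
    overlap = sum((candidate_counts & reference_counts).values())
    candidate_total = sum(candidate_counts.values())
    reference_total = sum(reference_counts.values())
    precision = overlap / candidate_total if candidate_total else 0.0
    recall = overlap / reference_total if reference_total else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


scores = rouge_1(
    candidate="the cat lay on the mat",
    reference="the cat sat on the mat",
)
print(scores)
```

Here five of the six reference words appear in the candidate, so recall and precision are both 5/6.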
89
A company notices that its foundation model (FM) generates images that are unrelated to the prompts. The company wants to modify the prompt techniques to decrease unrelated images. Which solution meets these requirements? A. Use zero-shot prompts. B. Use negative prompts. C. Use positive prompts. D. Use ambiguous prompts.
B The correct answer is B because negative prompts explicitly tell the model what NOT to include in the generated image. This directly addresses the problem of unrelated image generation by filtering out undesired outputs. Option A (zero-shot prompts) doesn't directly address the issue of unrelated images; it simply refers to prompting without including any examples in the prompt. Option C (positive prompts), while helpful for specifying desired elements, doesn't actively prevent unrelated content. Option D (ambiguous prompts) would likely exacerbate the problem, leading to even more irrelevant image generation.
90
A company wants to use a large language model (LLM) to generate concise, feature-specific descriptions for the company’s products. Which prompt engineering technique meets these requirements? A. Create one prompt that covers all products. Edit the responses to make the responses more specific, concise, and tailored to each product. B. Create prompts for each product category that highlight the key features. Include the desired output format and length for each prompt response. C. Include a diverse range of product features in each prompt to generate creative and unique descriptions. D. Provide detailed, product-specific prompts to ensure precise and customized descriptions.
B
91
A company is developing an ML model to predict customer churn. The model performs well on the training dataset but does not accurately predict churn for new data. Which solution will resolve this issue? A. Decrease the regularization parameter to increase model complexity. B. Increase the regularization parameter to decrease model complexity. C. Add more features to the input data. D. Train the model for more epochs.
B. Increase the regularization parameter to decrease model complexity. The model is overfitting the training data, meaning it's learned the training data too well and is not generalizing well to unseen data. Increasing the regularization parameter reduces the model's complexity, preventing it from fitting the noise in the training data and improving its ability to generalize to new, unseen data. Option A is incorrect because decreasing the regularization parameter would increase model complexity, exacerbating the overfitting problem. Option C is incorrect because adding more features might not solve the overfitting; it could even worsen it if the added features are irrelevant or noisy. Option D is incorrect because training for more epochs would likely further overfit the model to the training data.
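The effect of a larger regularization parameter can be illustrated with a one-feature ridge (L2) regression, chosen here only as an example since the card does not name a specific regularizer. For a model y ≈ w·x with no intercept, the closed-form weight is w = Σxy / (Σx² + λ). The data points are made up.

```python
def ridge_weight(xs, ys, lam):
    """Closed-form weight for 1-D ridge regression without an intercept."""
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    # The penalty lam enlarges the denominator, shrinking the weight toward zero.
    return sum_xy / (sum_xx + lam)


# Hypothetical training data.
xs = [1.0, 2.0, 3.0]
ys = [2.1, 3.9, 6.2]

w_unregularized = ridge_weight(xs, ys, lam=0.0)
w_regularized = ridge_weight(xs, ys, lam=10.0)
print(w_unregularized, w_regularized)
```

The larger penalty shrinks the weight, giving a simpler model that is less able to fit noise in the training data.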
92
A company is implementing intelligent agents to provide conversational search experiences for its customers. The company needs a database service that will support storage and queries of embeddings from a generative AI model as vectors in the database. Which AWS service will meet these requirements? A. Amazon Athena B. Amazon Aurora PostgreSQL C. Amazon Redshift D. Amazon EMR
B. Amazon Aurora PostgreSQL Amazon Aurora PostgreSQL is the correct answer because it supports the pgvector extension, which is specifically designed for storing and querying vector embeddings. Athena is a query service, not a database. Redshift is a data warehouse optimized for analytical queries, not real-time vector searches. EMR is a managed Hadoop framework and not suitable for this use case.
93
A financial institution is building an AI solution to make loan approval decisions by using a foundation model (FM). For security and audit purposes, the company needs the AI solution's decisions to be explainable. Which factor relates to the explainability of the AI solution's decisions? A. Model complexity B. Training time C. Number of hyperparameters D. Deployment time
A
94
A pharmaceutical company wants to analyze user reviews of new medications and provide a concise overview for each medication. Which solution meets these requirements? A. Create a time-series forecasting model to analyze the medication reviews by using Amazon Personalize. B. Create medication review summaries by using Amazon Bedrock large language models (LLMs). C. Create a classification model that categorizes medications into different groups by using Amazon SageMaker. D. Create medication review summaries by using Amazon Rekognition.
B
95
A company is using Retrieval Augmented Generation (RAG) with Amazon Bedrock and Stable Diffusion to generate product images based on text descriptions. The results are often random and lack specific details. The company wants to increase the specificity of the generated images. Which solution meets these requirements? A. Increase the number of generation steps. B. Use the MASK_IMAGE_BLACK mask source option. C. Increase the classifier-free guidance (CFG) scale. D. Increase the prompt strength.
C The correct answer is C because increasing the classifier-free guidance (CFG) scale directly controls how closely the generated image adheres to the text prompt. A higher CFG scale forces the model to prioritize the prompt details, leading to more specific and less random images. Option A is incorrect because increasing generation steps might improve image quality in general, but it doesn't directly address the lack of specificity tied to the prompt. Option B is incorrect as it relates to masking, which is not directly relevant to increasing specificity based on the prompt. Option D, while seemingly relevant, is less effective than adjusting the CFG scale in Stable Diffusion for controlling adherence to prompt specifics. Increasing prompt strength might help, but CFG scale offers more precise control over how closely the image matches the details in the prompt.
96
An AI practitioner wants to predict the classification of flowers based on petal length, petal width, sepal length, and sepal width. Which algorithm meets these requirements? A. K-nearest neighbors (k-NN) B. K-means C. Autoregressive Integrated Moving Average (ARIMA) D. Linear regression
A K-nearest neighbors (k-NN) is the correct answer because it is a classification algorithm suitable for predicting the class of a data point (in this case, the type of flower) based on the features (petal length, petal width, sepal length, and sepal width). K-means is a clustering algorithm, not a classification algorithm. ARIMA is used for time series forecasting and is not applicable here. Linear regression is used for predicting continuous values, not classifying discrete categories.
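A minimal k-NN classifier over the four features named in the question might look like the sketch below. The measurements and labels are hypothetical sample values in the order (petal length, petal width, sepal length, sepal width), and `math.dist` requires Python 3.8 or later.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Predict a label by majority vote among the k nearest training points."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical labeled measurements.
train = [
    ((1.4, 0.2, 5.1, 3.5), "setosa"),
    ((1.3, 0.2, 4.9, 3.0), "setosa"),
    ((1.5, 0.3, 5.0, 3.4), "setosa"),
    ((4.7, 1.4, 7.0, 3.2), "versicolor"),
    ((4.5, 1.5, 6.4, 3.2), "versicolor"),
    ((4.9, 1.5, 6.9, 3.1), "versicolor"),
]

print(knn_predict(train, (1.4, 0.3, 5.0, 3.3)))
print(knn_predict(train, (4.6, 1.4, 6.8, 3.1)))
```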
97
A company wants to build a lead prioritization application for its employees to contact potential customers. The application must give employees the ability to view and adjust the weights assigned to different variables in the model based on domain knowledge and expertise. Which ML model type meets these requirements? A. Logistic regression model B. Deep learning model built on principal components C. K-nearest neighbors (k-NN) model D. Neural network
A. Logistic regression model Logistic regression is the best choice because it allows for easy interpretation and adjustment of the weights assigned to variables. Employees can view the influence of each factor in the model and manually adjust them according to domain knowledge, making lead prioritization more transparent and customizable. Deep learning models (including those built on principal components) and neural networks are generally considered "black boxes," making it difficult to understand and adjust individual variable weights. K-nearest neighbors does not use weights in the same way and is not easily interpretable in terms of feature importance.
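To show what viewing and adjusting weights could look like, the sketch below scores a lead with a logistic function over named features. The feature names, weights, and values are invented for illustration; a real model would learn its initial weights from data.

```python
import math


def lead_score(features, weights, bias=0.0):
    """Logistic regression score: sigmoid of the weighted sum of features."""
    z = bias + sum(weights[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical weights that employees can read directly.
weights = {"visited_pricing_page": 1.2, "company_size_large": 0.8, "opened_emails": 0.3}
lead = {"visited_pricing_page": 1, "company_size_large": 0, "opened_emails": 4}

score_before = lead_score(lead, weights)

# A domain expert decides that email opens are a weaker signal and lowers that weight.
adjusted_weights = {**weights, "opened_emails": 0.1}
score_after = lead_score(lead, adjusted_weights)
print(score_before, score_after)
```

Because each feature has exactly one weight, the effect of an adjustment on the score is easy to trace.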
98
Which phase of the ML lifecycle determines compliance and regulatory requirements? A. Feature engineering B. Model training C. Data collection D. Business goal identification
D The correct answer is D, Business goal identification. While data collection (C) certainly *must* adhere to compliance regulations, the initial determination of which regulations apply happens during the definition of business goals. The business goals define the scope of the project, including the types of data that will be used and therefore which laws and regulations must be adhered to. Feature engineering (A) and model training (B) occur after the compliance requirements are already defined.
99
A company wants to develop an AI application to help its employees check open customer claims, identify details for a specific claim, and access documents for a claim. Which solution meets these requirements? A. Use Agents for Amazon Bedrock with Amazon Fraud Detector to build the application. B. Use Agents for Amazon Bedrock with Amazon Bedrock knowledge bases to build the application. C. Use Amazon Personalize with Amazon Bedrock knowledge bases to build the application. D. Use Amazon SageMaker to build the application by training a new ML model.
B
100
Which technique can a company use to lower bias and toxicity in generative AI applications during the post-processing ML lifecycle? A. Human-in-the-loop B. Data augmentation C. Feature engineering D. Adversarial training
A The correct answer is A, Human-in-the-loop. This technique involves human review and intervention after the AI model has generated content, allowing for the identification and mitigation of bias and toxicity in the output. Options B, C, and D are all techniques used during the *training* phase of the ML lifecycle, not the post-processing phase. Data augmentation modifies the training data, feature engineering focuses on selecting and transforming input features, and adversarial training improves model robustness against adversarial attacks. None of these directly address bias and toxicity in generated content *after* it has been produced.
101
Which prompting technique can protect against prompt injection attacks? A. Adversarial prompting B. Zero-shot prompting C. Least-to-most prompting D. Chain-of-thought prompting
A Adversarial prompting is the correct answer because it is specifically designed to protect against prompt injection attacks. Adversarial prompting uses carefully crafted prompts to make it harder for the model to be manipulated by malicious inputs. Zero-shot prompting, least-to-most prompting, and chain-of-thought prompting are not designed to directly mitigate prompt injection attacks. While chain-of-thought prompting may indirectly reduce the likelihood of some attacks by encouraging more careful reasoning, it is not a primary defense against prompt injection.
102
A company has fine-tuned a large language model (LLM) to answer questions for a help desk. The company wants to determine if the fine-tuning has enhanced the model's accuracy. Which metric should the company use for the evaluation? A. Precision B. Time to first token C. F1 score D. Word error rate
C. F1 score The F1 score is the most appropriate metric because it considers both precision (the proportion of correctly identified positive results) and recall (the proportion of actual positive results correctly identified). A high F1 score indicates a good balance between correctly identifying relevant information and avoiding false positives and negatives, which is crucial for a helpful help desk chatbot. Option A (Precision) only considers the proportion of correctly identified positive results, ignoring the possibility of missing relevant information (low recall). Option B (Time to first token) measures response speed, not accuracy. Option D (Word error rate) is used for evaluating speech recognition systems, not the accuracy of text-based responses.
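How the F1 score balances precision and recall can be checked with a short calculation. The counts below are hypothetical evaluation results, not figures from the question.

```python
def precision_recall_f1(tp: int, fp: int, fn: int):
    """Compute precision, recall, and F1 from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical counts: 8 true positives, 2 false positives, 4 false negatives.
precision, recall, f1 = precision_recall_f1(tp=8, fp=2, fn=4)
print(precision, recall, f1)
```

In this example precision is 0.8 but recall is only about 0.67, and F1 (about 0.73) reflects the weaker recall that precision alone would hide.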
103
A company has developed a generative text summarization model using Amazon Bedrock. They will use Amazon Bedrock's automatic model evaluation capabilities. Which metric should the company use to evaluate the accuracy of the model? A. Area Under the ROC Curve (AUC) score B. F1 score C. BERTScore D. Real world knowledge (RWK) score
C
104
Which prompting attack directly exposes the configured behavior of a large language model (LLM)? A. Prompted persona switches B. Exploiting friendliness and trust C. Ignoring the prompt template D. Extracting the prompt template
D The correct answer is D, Extracting the prompt template. This attack directly reveals the internal instructions (the prompt) used to configure the LLM's behavior. This prompt contains rules, constraints, and settings that influence how the model responds. Discovering this information allows attackers to manipulate the model's responses or exploit vulnerabilities in its configuration. Options A, B, and C are other types of prompting attacks, but they do not directly expose the LLM's internal configuration in the same way as prompt extraction.
105
A company wants to use Amazon Bedrock. The company needs to review which security aspects the company is responsible for when using Amazon Bedrock. Which security aspect will the company be responsible for? A. Patching and updating the versions of Amazon Bedrock B. Protecting the infrastructure that hosts Amazon Bedrock C. Securing the company's data in transit and at rest D. Provisioning Amazon Bedrock within the company network
C
106
A company needs to use Amazon SageMaker for model training and inference. The company must comply with regulatory requirements to run SageMaker jobs in an isolated environment without internet access. Which solution will meet these requirements? A. Run SageMaker training and inference by using SageMaker Experiments. B. Run SageMaker training and inference by using network isolation. C. Encrypt the data at rest by using encryption for SageMaker geospatial capabilities. D. Associate appropriate AWS Identity and Access Management (IAM) roles with the SageMaker jobs.
B The correct answer is B because network isolation in Amazon SageMaker prevents training and inference containers from making outbound network calls, so the jobs run without internet access. This ensures compliance with regulatory requirements for isolated environments. Option A is incorrect because SageMaker Experiments is for tracking and comparing different training runs, not for network isolation. Option C is incorrect because data encryption at rest is a security measure but does not address the requirement for network isolation and internet access restriction. Option D is incorrect because IAM roles manage access control, not network isolation.
107
An ML research team develops custom ML models. The model artifacts are shared with other teams for integration into products and services. The ML team retains the model training code and data. The ML team wants to build a mechanism that the ML team can use to audit models. Which solution should the ML team use when publishing the custom ML models? A. Create documents with the relevant information. Store the documents in Amazon S3. B. Use AWS AI Service Cards for transparency and understanding models. C. Create Amazon SageMaker Model Cards with intended uses and training and inference details. D. Create model training scripts. Commit the model training scripts to a Git repository.
C Amazon SageMaker Model Cards are specifically designed for documenting machine learning models. They provide a standardized format for capturing essential information about a model's intended use, training process, evaluation metrics, and other relevant details. This structured approach facilitates auditing and ensures transparency throughout the model lifecycle. Option A is insufficient because while storing documents in S3 provides storage, it lacks the structure and standardization needed for effective model auditing. Option B is incorrect because AWS AI Service Cards are designed for pre-trained models offered by AWS, not for custom models developed by the ML team. Option D focuses on the training scripts, which are important but do not encompass the comprehensive information required for a thorough model audit; a model audit requires information about the model's performance and intended uses beyond just the training code.
108
A software company builds tools for customers. The company wants to use AI to increase software development productivity. Which solution will meet these requirements? A. Use a binary classification model to generate code reviews. B. Install code recommendation software in the company's developer tools. C. Install a code forecasting tool to predict potential code issues. D. Use a natural language processing (NLP) tool to generate code.
D While B is a reasonable approach, D is the best answer because it directly addresses increased productivity through AI-powered code generation. Option A uses AI but focuses on code review, not generation. Option C predicts issues but doesn't directly increase productivity. Option B, while helpful, may not leverage advanced AI and could involve less sophisticated methods. Natural Language Processing (NLP) tools offer the most direct route to enhanced productivity by automating code creation.
109
A retail store wants to predict the demand for a specific product for the next few weeks using the Amazon SageMaker DeepAR forecasting algorithm. Which type of data will meet this requirement? A. Text data B. Image data C. Time series data D. Binary data
C. Time series data The correct answer is C because the Amazon SageMaker DeepAR algorithm is specifically designed for time series forecasting. Time series data, characterized by observations collected over time at regular intervals (e.g., daily sales figures), is essential for DeepAR to identify historical patterns and predict future demand. Options A, B, and D are incorrect because text data, image data, and binary data are not suitable for forecasting future values based on temporal trends, which is the core function of DeepAR.
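To make "time series data" concrete, here is a minimal sketch of one DeepAR training record. DeepAR reads JSON Lines input in which each record has a `start` timestamp and a `target` array of observed values; the sales figures and the helper name `to_deepar_record` are made up for illustration.

```python
import json


def to_deepar_record(start, values):
    """Serialize one time series as a JSON Lines record with start and target."""
    return json.dumps({"start": start, "target": list(values)})


# Made-up daily unit sales for one product, one value per day.
daily_units_sold = [12, 15, 9, 20, 18, 22, 17]
line = to_deepar_record("2024-01-01 00:00:00", daily_units_sold)
print(line)
```

Each product would get its own record, so a training file is simply one such line per time series.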
110
A large retail bank wants to develop an ML system to help the risk management team decide on loan allocations for different demographics. What must the bank do to develop an unbiased ML model? A. Reduce the size of the training dataset. B. Ensure that the ML model predictions are consistent with historical results. C. Create a different ML model for each demographic group. D. Measure class imbalance on the training dataset. Adapt the training process accordingly.
D
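One way to measure imbalance before training is sketched below. It follows the class imbalance (CI) formula described for SageMaker Clarify, (n_a - n_d) / (n_a + n_d), where n_a and n_d are the counts of the two groups being compared. The group names and counts are invented for this example.

```python
from collections import Counter


def class_imbalance(facet_values, advantaged):
    """Return (n_a - n_d) / (n_a + n_d) for one group versus everyone else.

    0 means the two groups are equally represented; values near +1 or -1
    mean one group dominates the dataset.
    """
    counts = Counter(facet_values)
    n_a = counts[advantaged]
    n_d = sum(counts.values()) - n_a
    return (n_a - n_d) / (n_a + n_d)


# Invented dataset: 80 applicants from one group, 20 from another.
applicants = ["group_a"] * 80 + ["group_b"] * 20
ci = class_imbalance(applicants, "group_a")
print(ci)
```

A value far from 0, as in this example, suggests adapting the training process, for example by resampling or reweighting.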
111
A company wants to implement a large language model (LLM) based chatbot to provide customer service agents with real-time contextual responses to customers' inquiries. The company will use the company's policies as the knowledge base. Which solution will meet these requirements MOST cost-effectively? A. Retrain the LLM on the company policy data. B. Fine-tune the LLM on the company policy data. C. Implement Retrieval Augmented Generation (RAG) for in-context responses. D. Use pre-training and data augmentation on the company policy data.
C Retraining (A) and fine-tuning (B) an LLM on a large dataset like company policies are computationally expensive and time-consuming. Pre-training and data augmentation (D) are relevant to initial LLM development, not adapting an existing one for a specific knowledge base. Retrieval Augmented Generation (RAG) (C) is the most cost-effective solution because it leverages a pre-trained LLM and only requires retrieving relevant information from the policy database at query time, avoiding the resource-intensive process of retraining or fine-tuning the entire model.
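The retrieval-then-prompt flow can be sketched in a few lines. This toy version ranks passages by word overlap as a stand-in for the vector search a real RAG system would typically use; the policy texts, function names, and prompt wording are all illustrative assumptions.

```python
def retrieve(question, documents, top_k=1):
    """Rank documents by how many words they share with the question."""
    q_words = set(question.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc: len(q_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question, passages):
    """Place the retrieved passages in the prompt so the model can use them."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the policy excerpts below.\n"
        f"{context}\n\nQuestion: {question}"
    )


# Made-up company policies acting as the knowledge base.
policies = [
    "Refunds are issued within 14 days of a return request.",
    "Standard shipping takes 3 to 5 business days.",
]
question = "How many days do refunds take?"
top = retrieve(question, policies)
prompt = build_prompt(question, top)
print(prompt)
```

The model's weights never change; updating the knowledge base only means editing the `policies` list, which is why this approach avoids retraining costs.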
112
A company wants to create a new solution by using AWS Glue, but has minimal programming experience with AWS Glue. Which AWS service can help the company use AWS Glue? A. Amazon Q Developer B. AWS Config C. Amazon Personalize D. Amazon Comprehend
A Amazon Q Developer is the correct answer because it is a generative AI assistant for software development on AWS. It accepts natural language requests and can help with AWS Glue tasks such as generating and explaining data integration and transformation scripts. This simplifies the use of AWS Glue for users with minimal programming experience. AWS Config (B) is incorrect; it's a service for managing and assessing the configurations of AWS resources, not for directly assisting with AWS Glue development. Amazon Personalize (C) is incorrect as it's a service for building personalized recommendations, unrelated to simplifying AWS Glue usage. Amazon Comprehend (D) is also incorrect; it's a service for natural language processing, not for assisting with AWS Glue development or simplifying its usage.
113
A company is developing a mobile ML app that uses a phone's camera to diagnose and treat insect bites. The company wants to train an image classification model by using a diverse dataset of insect bite photos from different genders, ethnicities, and geographic locations around the world. Which principle of responsible AI does this company demonstrate in this scenario? A. Fairness B. Explainability C. Governance D. Transparency
A. Fairness The correct answer is A because using a diverse dataset mitigates bias and ensures the model performs equally well across different demographic groups. This directly addresses the principle of fairness in AI. Option B, Explainability, refers to the ability to understand how a model arrives at its predictions. Option C, Governance, relates to the policies and procedures for managing AI development and deployment. Option D, Transparency, focuses on the openness and clarity of the AI system's operations. None of these options are directly demonstrated by the company's use of a diverse dataset.
114
A company is developing an ML model to make loan approvals. The company must implement a solution to detect bias in the model and explain the model's predictions. Which solution will meet these requirements? A. Amazon SageMaker Clarify B. Amazon SageMaker Data Wrangler C. Amazon SageMaker Model Cards D. AWS AI Service Cards
A. Amazon SageMaker Clarify Amazon SageMaker Clarify is the correct answer because it provides both bias detection and model explainability features. The other options are incorrect: SageMaker Data Wrangler is for data preparation, not bias detection or explanation; SageMaker Model Cards are for documenting models, not directly detecting bias or explaining predictions; and AWS AI Service Cards are not a specific solution for bias detection or model explainability within the SageMaker ecosystem.
115
A company is using custom models in Amazon Bedrock for a generative AI application. The company wants to use a company-managed encryption key to encrypt the model artifacts that the model customization jobs create. Which AWS service meets these requirements? A. AWS Key Management Service (AWS KMS) B. Amazon Inspector C. Amazon Macie D. AWS Secrets Manager
A. AWS Key Management Service (AWS KMS) AWS KMS is the correct answer because it allows the management and use of customer-managed encryption keys (CMKs) to encrypt data, including model artifacts created during Amazon Bedrock model customization jobs. Amazon Inspector is a vulnerability management service, Amazon Macie is a data security and privacy service, and AWS Secrets Manager is for managing secrets, not encryption keys for data at rest. These services do not directly address the requirement of using a company-managed encryption key for model artifact encryption.
116
A company wants to use large language models (LLMs) to produce code from natural language code comments. Which LLM feature meets these requirements? A. Text summarization B. Text generation C. Text completion D. Text classification
B. Text generation The correct answer is B because generating code from natural language comments requires the LLM to create new text (the code) based on the input (the comments). Text generation covers producing new text of this kind, including code. Option A (Text summarization) is incorrect because it focuses on shortening existing text, not creating new code. Option C (Text completion) is partially correct because completion can be used as part of generation, but text generation is the broader feature that directly describes the task. Option D (Text classification) is incorrect because it involves categorizing text, not generating it.
117
Which strategy will determine if a foundation model (FM) effectively meets business objectives? A. Evaluate the model's performance on benchmark datasets. B. Analyze the model's architecture and hyperparameters. C. Assess the model's alignment with specific use cases. D. Measure the computational resources required for model deployment.
C The correct answer is C because assessing the model's alignment with specific use cases directly addresses whether the model fulfills the business's needs and goals. Options A, B, and D are insufficient on their own. Benchmark datasets (A) may not reflect real-world business scenarios. Analyzing architecture and hyperparameters (B) provides technical insights but doesn't guarantee alignment with business objectives. Measuring computational resources (D) is important for deployment but doesn't evaluate the model's effectiveness in meeting business goals.
118
A company needs to train a machine learning (ML) model to classify images of different types of animals. The company has a large dataset of labeled images and will not label more data. Which type of learning should the company use to train the model? A. Supervised learning B. Unsupervised learning C. Reinforcement learning D. Active learning
A. Supervised learning Supervised learning is the correct answer because it uses labeled data, which the company already possesses. The model learns to map the input images (animals) to their corresponding labels (animal types). Unsupervised learning doesn't use labeled data, making it unsuitable. Reinforcement learning involves training an agent through trial and error, which isn't applicable here. Active learning focuses on selectively labeling data, but the company is not going to label more data.
119
A food service company wants to develop an ML model to help decrease daily food waste and increase sales revenue. The company needs to continuously improve the model's accuracy. Which solution meets these requirements? A. Use Amazon SageMaker and iterate with newer data. B. Use Amazon Personalize and iterate with historical data. C. Use Amazon CloudWatch to analyze customer orders. D. Use Amazon Rekognition to optimize the model.
A
120
A company has developed an ML model to predict real estate sale prices. The company wants to deploy the model to make predictions without managing servers or infrastructure. Which solution meets these requirements? A. Deploy the model on an Amazon EC2 instance. B. Deploy the model on an Amazon Elastic Kubernetes Service (Amazon EKS) cluster. C. Deploy the model by using Amazon CloudFront with an Amazon S3 integration. D. Deploy the model by using an Amazon SageMaker endpoint.
D Amazon SageMaker is a fully managed service for deploying machine learning models. Using an Amazon SageMaker endpoint allows for predictions without the need to manage servers or infrastructure because SageMaker handles resource provisioning, scaling, and maintenance. Options A, B, and C all require some level of server or infrastructure management. EC2 instances require direct server management. EKS requires managing a Kubernetes cluster. CloudFront and S3 are storage and delivery services; they don't inherently provide the compute necessary for model inference.
121
A manufacturing company uses AI to inspect products and find any damages or defects. Which type of AI application is the company using? A. Recommendation system B. Natural language processing (NLP) C. Computer vision D. Image processing
C. Computer vision Computer vision is the correct answer because it focuses on enabling computers to "see" and interpret images, making it ideal for inspecting products for defects. A recommendation system (A) is used for suggesting items, NLP (B) deals with text and language, and while image processing (D) is a component of computer vision, it is not the overarching AI application type being described.
122
A company wants to create an ML model to predict customer satisfaction. The company needs fully automated model tuning. Which AWS service meets these requirements? A. Amazon Personalize B. Amazon SageMaker C. Amazon Athena D. Amazon Comprehend
B. Amazon SageMaker Amazon SageMaker offers automated model tuning through features like SageMaker Autopilot and SageMaker automatic model tuning (hyperparameter optimization). Amazon Personalize is for personalized recommendations, Amazon Athena is for querying data, and Amazon Comprehend is for natural language processing; none of these directly address automated model tuning for a predictive model.
123
A bank has fine-tuned a large language model (LLM) to expedite the loan approval process. During an external audit of the model, the company discovered that the model was approving loans at a faster pace for a specific demographic than for other demographics. How should the bank fix this issue MOST cost-effectively? A. Include more diverse training data. Fine-tune the model again by using the new data. B. Use Retrieval Augmented Generation (RAG) with the fine-tuned model. C. Use AWS Trusted Advisor checks to eliminate bias. D. Pre-train a new LLM with more diverse training data.
A
124
A company needs to log all requests made to its Amazon Bedrock API. The company must retain the logs securely for 5 years at the lowest possible cost. Which combination of AWS service and storage class meets these requirements? (Choose two.) A. AWS CloudTrail B. Amazon CloudWatch C. AWS Audit Manager D. Amazon S3 Intelligent-Tiering E. Amazon S3 Standard
A and D AWS CloudTrail (A) is the correct choice because it is designed to log API calls made to various AWS services, including Amazon Bedrock. It provides a detailed audit trail of these actions. Amazon S3 Intelligent-Tiering (D) is the correct choice because it offers cost-effective long-term storage. It automatically moves data between access tiers based on usage patterns, optimizing costs for infrequently accessed logs that need to be retained for five years. Amazon CloudWatch (B) is incorrect because it is primarily for monitoring metrics and logs from applications and resources, not specifically for API call logging at the level of detail required. AWS Audit Manager (C) is incorrect because it is a compliance management service, not a logging service. It does not directly store logs. Amazon S3 Standard (E) is incorrect because, while it provides secure storage, it is significantly more expensive than Intelligent-Tiering for long-term storage of infrequently accessed data.
125
An ecommerce company wants to improve search engine recommendations by customizing the results for each user of the company’s ecommerce platform. Which AWS service meets these requirements? A. Amazon Personalize B. Amazon Kendra C. Amazon Rekognition D. Amazon Transcribe
A. Amazon Personalize Amazon Personalize is a fully managed machine learning service designed to create personalized recommendations. It uses user behavior and item metadata to re-rank search results, directly addressing the ecommerce company's need for customized search engine recommendations. B. Amazon Kendra is a search service, but it doesn't inherently personalize results based on individual user behavior. C. Amazon Rekognition is an image and video analysis service, irrelevant to this scenario. D. Amazon Transcribe is a speech-to-text service, also unrelated to personalized search recommendations.
126
A hospital is developing an AI system to assist doctors in diagnosing diseases using patient records and medical images. To comply with regulations, the sensitive patient data must not leave the country where the data originates. Which data governance strategy will best ensure compliance and protect patient privacy? A. Data residency B. Data quality C. Data discoverability D. Data enrichment
A Data residency is the correct answer because it ensures that data remains within specified geographical boundaries, complying with data sovereignty and privacy regulations (for example, GDPR restrictions on cross-border transfers). The question specifically states that the data must remain in the country of origin, and data residency directly addresses this requirement. Option B, data quality, focuses on the accuracy and completeness of data, which is important but doesn't directly address the geographical location requirement. Option C, data discoverability, refers to the ease of finding and accessing data, and Option D, data enrichment, involves enhancing data with additional information; neither of these options addresses the data location requirement.
127
A company needs to monitor the performance of its ML systems by using a highly scalable AWS service. Which AWS service meets these requirements? A. Amazon CloudWatch B. AWS CloudTrail C. AWS Trusted Advisor D. AWS Config
A. Amazon CloudWatch Amazon CloudWatch is the correct answer because it is a highly scalable monitoring service specifically designed to track and provide performance metrics for AWS resources, including machine learning systems. It offers real-time monitoring and alerting capabilities, making it suitable for performance monitoring needs. AWS CloudTrail (B) is for auditing and security logging, not performance monitoring. AWS Trusted Advisor (C) provides recommendations for best practices, not direct performance monitoring. AWS Config (D) monitors and manages the configuration of AWS resources, but not their performance.
128
A company wants to keep its foundation model (FM) relevant by using the most recent data. The company wants to implement a model training strategy that includes regular updates to the FM. Which solution meets these requirements? A. Batch learning B. Continuous pre-training C. Static training D. Latent training
B. Continuous pre-training Continuous pre-training is the correct answer because it involves regularly updating the foundation model with new data, ensuring the model stays relevant and accurate over time. Batch learning trains the model on a fixed dataset and doesn't incorporate new data regularly. Static training implies no updates after initial training. Latent training is not a standard model training approach in this context.
129
An AI practitioner is developing a prompt for an Amazon Titan model hosted on Amazon Bedrock to solve numerical reasoning challenges. The practitioner adds the phrase: “Ask the model to show its work by explaining its reasoning step by step.” Which prompt engineering technique is the AI practitioner using? A. Chain-of-thought prompting B. Prompt injection C. Few-shot prompting D. Prompt templating
A. Chain-of-thought prompting
130
Which AWS service makes foundation models (FMs) available to help users build and scale generative AI applications? A. Amazon Q Developer B. Amazon Bedrock C. Amazon Kendra D. Amazon Comprehend
B. Amazon Bedrock Amazon Bedrock is the correct answer because it is a fully managed service explicitly designed to provide access to foundation models (FMs) from various AI companies. This allows users to build and scale generative AI applications easily. Option A, Amazon Q Developer, is a generative AI assistant for building and operating applications on AWS; it does not itself offer a selection of foundation models for building applications the way Bedrock does. Options C and D, Amazon Kendra and Amazon Comprehend, are services focused on specific AI tasks (search and natural language processing respectively) and don't offer the broad access to foundation models that Bedrock provides.
131
A company is building a mobile app for users who have a visual impairment. The app must be able to hear what users say and provide voice responses. Which solution will meet these requirements? A. Use a deep learning neural network to perform speech recognition. B. Build ML models to search for patterns in numeric data. C. Use generative AI summarization to generate human-like text. D. Build custom models for image classification and recognition.
A
132
A company wants to enhance response quality for a large language model (LLM) for complex problem-solving tasks that require detailed reasoning and a step-by-step explanation process. Which prompt engineering technique best meets these requirements? A. Few-shot prompting B. Zero-shot prompting C. Directional stimulus prompting D. Chain-of-thought prompting
D. Chain-of-thought prompting Chain-of-thought prompting is the correct answer because it explicitly encourages the LLM to break down complex problems into smaller, manageable steps and explain its reasoning process at each stage. This directly addresses the requirement for detailed reasoning and step-by-step explanations. Few-shot prompting (A) provides a few examples to the LLM, but doesn't inherently force a step-by-step explanation. Zero-shot prompting (B) provides no examples, relying solely on the prompt's instructions, making detailed reasoning less likely. Directional stimulus prompting (C) adds hints or keywords that steer the model toward a desired output, but it does not by itself elicit step-by-step reasoning.
133
A company wants to develop ML applications to improve business operations and efficiency. Select the correct ML paradigm from the following list for each use case. Each ML paradigm should be selected one or more times. [Image](https://img.examtopics.com/aws-certified-ai-practitioner-aif-c01/image5.png)
[Image](https://img.examtopics.com/aws-certified-ai-practitioner-aif-c01/image6.png)
134
Which option is a characteristic of AI governance frameworks for building trust and deploying human-centered AI technologies? A. Expanding initiatives across business units to create long-term business value B. Ensuring alignment with business standards, revenue goals, and stakeholder expectations C. Overcoming challenges to drive business transformation and growth D. Developing policies and guidelines for data, transparency, responsible AI, and compliance
D
135
A company wants to find groups for its customers based on the customers’ demographics and buying patterns. Which algorithm should the company use to meet this requirement? A. K-nearest neighbors (k-NN) B. K-means C. Decision tree D. Support vector machine
B. K-means K-means is an unsupervised clustering algorithm that groups data points based on similarity. This directly addresses the company's need to group customers based on their demographics and buying patterns. The other options are incorrect because they are supervised algorithms that need labeled data: K-nearest neighbors, decision trees, and support vector machines are used for classification or regression, not clustering.
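A minimal K-means sketch in plain Python shows the two alternating steps: assign each point to its nearest centroid, then move each centroid to the mean of its points. The customer data and starting centroids are invented, and the features are left unscaled for brevity (a real pipeline would normally scale them first).

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, centroids, iterations=10):
    """Run K-means from the given starting centroids."""
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                mean = tuple(sum(dim) / len(members) for dim in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: centroids stopped moving
        centroids = new_centroids
    return labels, centroids


# Invented customers as (age, monthly spend).
customers = [(25, 200), (27, 220), (24, 180), (55, 900), (58, 950), (60, 880)]
labels, centroids = kmeans(customers, [(25, 200), (55, 900)])
print(labels)
```

No labels are supplied at any point, which is what separates clustering from the supervised options in this question.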
136
A company’s large language model (LLM) is experiencing hallucinations. How can the company decrease hallucinations? A. Set up Agents for Amazon Bedrock to supervise the model training. B. Use data pre-processing and remove any data that causes hallucinations. C. Decrease the temperature inference parameter for the model. D. Use a foundation model (FM) that is trained to not hallucinate.
C
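The effect of the temperature parameter can be seen in a small sketch. Token scores (logits) are divided by the temperature before being turned into probabilities, so a lower temperature concentrates probability on the highest-scoring token and makes output more deterministic. The logits below are made up for illustration.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    largest = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - largest) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up scores for three candidate next tokens.
logits = [2.0, 1.0, 0.5]
high = softmax_with_temperature(logits, 1.0)
low = softmax_with_temperature(logits, 0.2)
print(max(high), max(low))
```

With the lower temperature, the top token's probability rises and the unlikely tokens are sampled far less often.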
137
A company is using a large language model (LLM) on Amazon Bedrock to build a chatbot that processes customer support requests. To resolve a request, the customer and the chatbot must interact a few times. Which solution gives the LLM the ability to use content from previous customer messages? A. Turn on model invocation logging to collect messages. B. Add messages to the model prompt. C. Use Amazon Personalize to save conversation history. D. Use Provisioned Throughput for the LLMs.
B. Add messages to the model prompt. The correct answer is B because LLMs process information provided in the prompt. Including previous messages in the prompt gives the LLM the context needed to understand the conversation history and respond appropriately. Option A is incorrect because logging messages doesn't directly provide the context to the LLM during processing. Option C is incorrect because Amazon Personalize is a recommendation engine, not a tool for managing LLM context within a conversation. Option D is incorrect because Provisioned Throughput reserves dedicated model capacity; it does not give the model access to past conversation data.
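A sketch of carrying conversation history is shown below. The message shape (a `role` plus a list of `content` blocks with `text`) follows the Amazon Bedrock Converse API; the helper name `add_turn` and the dialogue are hypothetical, and no API call is made here.

```python
def add_turn(history, role, text):
    """Return a new history list with one more message appended."""
    return history + [{"role": role, "content": [{"text": text}]}]


# Hypothetical support conversation built up turn by turn.
history = []
history = add_turn(history, "user", "My order has not arrived.")
history = add_turn(history, "assistant", "Sorry to hear that. What is your order number?")
history = add_turn(history, "user", "It is 12345.")

# On each request, the whole list would be sent as the messages input,
# so the model sees the earlier turns as part of the prompt.
print(len(history))
```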
138
A company's employees provide product descriptions and recommendations to customers who call the customer service center. These recommendations are based on the customers' locations. The company wants to use foundation models (FMs) to automate this process. Which AWS service meets these requirements? A. Amazon Macie B. Amazon Transcribe C. Amazon Bedrock D. Amazon Textract
C. Amazon Bedrock Amazon Bedrock is the correct answer because it's a fully managed service providing access to various foundation models (FMs). These FMs can be used to build and scale generative AI applications, such as automating product recommendations and descriptions based on customer location. A. Amazon Macie is incorrect; it's a data security and privacy service, not relevant to generating product recommendations. B. Amazon Transcribe is incorrect; it's a speech-to-text service, not designed for generating product recommendations. D. Amazon Textract is incorrect; it's an optical character recognition (OCR) service, not suitable for this task.
139
A company wants to upload customer service email messages to Amazon S3 to develop a business analysis application. The messages sometimes contain sensitive data. The company wants to receive an alert every time sensitive information is found. Which solution fully automates the sensitive information detection process with the LEAST development effort? A. Configure Amazon Macie to detect sensitive information in the documents that are uploaded to Amazon S3. B. Use Amazon SageMaker endpoints to deploy a large language model (LLM) to redact sensitive data. C. Develop multiple regex patterns to detect sensitive data. Expose the regex patterns on an Amazon SageMaker notebook. D. Ask the customers to avoid sharing sensitive information in their email messages.
A Amazon Macie is the correct answer because it is a fully managed service specifically designed for automated sensitive data detection in Amazon S3. It requires minimal development effort as it's a pre-built solution. Option B is incorrect because deploying and managing a custom LLM through SageMaker requires significant development effort. While LLMs are powerful, they are overkill and less efficient for this specific problem. Option C is incorrect because developing and maintaining multiple regex patterns is time-consuming and requires considerable development effort. It's also prone to errors and may not catch all instances of sensitive data. Option D is incorrect because it doesn't solve the problem; it relies on user behavior which is unreliable and doesn't provide automated detection.
140
An ecommerce company is using a generative AI chatbot to respond to customer inquiries. The company wants to measure the financial effect of the chatbot on the company’s operations. Which metric should the company use? A. Number of customer inquiries handled B. Cost of training AI models C. Cost for each customer conversation D. Average handled time (AHT)
C
141
A company wants to build an ML application. Select and order the correct steps from the following list to develop a well-architected ML workload. Each step should be selected one time. [Image](https://img.examtopics.com/aws-certified-ai-practitioner-aif-c01/image1.png)
The correct order is:
1. Define business goal and frame ML problem
2. Develop a model
3. Deploy a model
4. Monitor model

This is the correct order because it follows the standard ML development lifecycle. First, you must define the business problem you're trying to solve with ML and frame it as an ML problem. Then, you develop and train a model. Next, you deploy the model to make predictions in a production environment. Finally, you monitor the model's performance and make adjustments as needed. Any other order would be inefficient or incomplete.
142
A company has developed a large language model (LLM) and wants to make the LLM available to multiple internal teams. The company needs to select the appropriate inference mode for each team. Select the correct inference mode from the following list for each use case. Each inference mode should be selected one or more times. [Image](https://img.examtopics.com/aws-certified-ai-practitioner-aif-c01/image3.png)
The correct answer is shown in [Image](https://img.examtopics.com/aws-certified-ai-practitioner-aif-c01/image4.png). Scenario 1 (Chatbot) requires real-time inference because it needs immediate predictions to interpret user intent. Scenario 2 (Data Processing Job) uses batch transform because it processes large volumes of data during specific periods and doesn't require immediate results.
143
A company is training its employees on how to structure prompts for foundation models. Match each prompt template to the correct prompt engineering technique. Each technique should be used only once. [Image](https://img.examtopics.com/aws-certified-ai-practitioner-aif-c01/image7.png)
The correct answer matches the image provided as the "Suggested Answer" ([Image](https://img.examtopics.com/aws-certified-ai-practitioner-aif-c01/image8.png)). This is because:
* **Few-shot prompting** involves providing a few examples to the model before asking the main question. This matches template 1 in the question image.
* **Zero-shot prompting** involves asking a question directly without providing any examples. This matches template 3.
* **Chain-of-thought prompting** involves breaking down a complex problem into smaller, logical steps to guide the model's reasoning process. This matches template 2.
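The three techniques can be contrasted with small prompt builders. These are illustrative sketches only: the function names, the sentiment task, and the exact wording are assumptions, not the templates from the question image.

```python
def zero_shot(task):
    """Ask directly, with no examples."""
    return f"{task}\nAnswer:"


def few_shot(review, examples):
    """Show a few labeled examples before the new input."""
    shots = "\n".join(
        f"Review: {text}\nSentiment: {label}" for text, label in examples
    )
    return f"{shots}\nReview: {review}\nSentiment:"


def chain_of_thought(task):
    """Ask the model to reason through the problem before answering."""
    return f"{task}\nLet's think step by step, then state the final answer."


# Made-up example inputs.
examples = [("Great battery life.", "positive"), ("Screen cracked in a week.", "negative")]
zero = zero_shot("Classify the sentiment of this review: 'Fast delivery.'")
few = few_shot("Fast delivery.", examples)
cot = chain_of_thought("A store sells 3 packs of 12 items and 5 single items. How many items in total?")
print(few)
```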
144
Which option is a benefit of using Amazon SageMaker Model Cards to document AI models? A. Providing a visually appealing summary of a model’s capabilities. B. Standardizing information about a model’s purpose, performance, and limitations. C. Reducing the overall computational requirements of a model. D. Physically storing models for archival purposes.
B. Standardizing information about a model’s purpose, performance, and limitations. Amazon SageMaker Model Cards are designed to provide standardized and detailed documentation about AI models. They help clearly communicate a model's purpose, performance, and limitations, promoting transparency and ethical AI use. Option A is incorrect because while Model Cards might be visually appealing, their primary benefit is standardization, not visual appeal. Option C is incorrect because Model Cards do not affect a model's computational requirements. Option D is incorrect because Model Cards are for documentation, not physical storage of the model itself.
145
What does an F1 score measure in the context of foundation model (FM) performance? A. Model precision and recall B. Model speed in generating responses C. Financial cost of operating the model D. Energy efficiency of the model’s computations
A The F1 score is the harmonic mean of a model's precision and recall. Options B, C, and D are incorrect because the F1 score does not measure model speed, financial cost, or energy efficiency. It reflects prediction quality only: it weighs true positives against false positives (which lower precision) and false negatives (which lower recall).
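As a worked example, the F1 score can be computed from raw counts of true positives, false positives, and false negatives. This is a minimal sketch; the function name and the sample counts are made up for illustration.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return the F1 score: the harmonic mean of precision and recall."""
    predicted_positive = true_positives + false_positives
    actual_positive = true_positives + false_negatives
    # Avoid division by zero when there are no positive predictions or labels.
    if predicted_positive == 0 or actual_positive == 0:
        return 0.0
    precision = true_positives / predicted_positive  # share of predictions that were right
    recall = true_positives / actual_positive        # share of real positives that were found
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical counts: 8 true positives, 2 false positives, 4 false negatives.
    # precision = 8/10, recall = 8/12, F1 = 16/22
    print(round(f1_score(8, 2, 4), 4))
```

Because the harmonic mean is pulled toward the lower of the two values, a model cannot score a high F1 by maximizing precision while neglecting recall, or the reverse.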
146
A company deployed an AI/ML solution to help customer service agents respond to frequently asked questions. These questions can change over time. The company wants to give customer service agents the ability to ask questions and receive automatically generated answers to common customer questions. Which strategy will meet these requirements MOST cost-effectively? A. Fine-tune the model regularly. B. Train the model by using context data. C. Pre-train and benchmark the model by using context data. D. Use Retrieval Augmented Generation (RAG) with prompt engineering techniques.
D The best answer is D because Retrieval Augmented Generation (RAG) with prompt engineering is the most cost-effective solution for handling frequently changing customer questions. RAG combines a language generation model with an information retrieval system, so answers are grounded in up-to-date content retrieved at query time. When the questions change, the company updates the retrieval source instead of the model. Options A, B, and C would require ongoing and potentially expensive fine-tuning or retraining each time the questions evolve.
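The retrieve-then-prompt flow behind RAG can be sketched in a few lines. This is a toy illustration under stated assumptions: retrieval here is plain word overlap, whereas a production system would typically use an embedding-based search, and the FAQ texts, function names, and prompt wording are all invented for the example.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, documents: list[str], top_k: int = 1) -> list[str]:
    """Rank documents by how many words they share with the question (toy retriever)."""
    question_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question: str, documents: list[str], top_k: int = 1) -> str:
    """Place the retrieved text in the prompt so the model answers from it."""
    context = "\n".join(f"- {doc}" for doc in retrieve(question, documents, top_k))
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\nAnswer:"
    )


# Hypothetical FAQ entries. Updating this list changes the answers
# without any retraining or fine-tuning of the model.
FAQ_DOCS = [
    "Refunds are processed within 5 business days of approval.",
    "Standard shipping takes 3 to 7 business days.",
]

if __name__ == "__main__":
    print(build_prompt("How long do refunds take?", FAQ_DOCS))
```

The resulting prompt would then be sent to the foundation model. The cost advantage comes from the fact that only `FAQ_DOCS` needs to change as customer questions change.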
147
A company built an AI-powered resume screening system trained on a large dataset of resumes that was not representative of all demographics. Which core dimension of responsible AI does this scenario primarily present a challenge to? A. Fairness B. Explainability C. Privacy and security D. Transparency
A. Fairness The scenario highlights a fairness issue. A biased training dataset leads to an AI system that may discriminate against certain demographic groups, thus violating the principle of fairness in AI. Explainability (B) refers to understanding how the AI arrives at its decisions; while relevant, it's secondary to the core issue of unfair outcomes. Privacy and security (C) are not directly implicated in the description. Transparency (D), while related to responsible AI, is less central than the ethical concern of potential bias and unfair treatment caused by the skewed dataset.
148
A global financial company has developed an ML application to analyze stock market data and provide stock market trends. The company wants to continuously monitor the application development phases and ensure that company policies and industry regulations are followed. Which AWS services will help the company assess compliance requirements? (Choose two.) A. AWS Audit Manager B. AWS Config C. Amazon Inspector D. Amazon CloudWatch E. AWS CloudTrail
A, B AWS Audit Manager helps in continuous audit management by automating the collection of evidence related to compliance with company policies and industry regulations. AWS Config allows continuous monitoring and assessment of resource configurations to ensure they comply with defined policies and regulations. Amazon Inspector is used for vulnerability assessment, Amazon CloudWatch for monitoring application performance, and AWS CloudTrail for logging API calls – these are not directly related to assessing compliance requirements in the same way as Audit Manager and Config.
149
A company wants to improve the accuracy of the responses from a generative AI application that uses a foundation model (FM) on Amazon Bedrock. Which solution meets these requirements MOST cost-effectively? A. Fine-tune the FM. B. Retrain the FM. C. Train a new FM. D. Use prompt engineering.
D Fine-tuning, retraining, and training a new FM are all significantly more expensive and time-consuming than prompt engineering. Prompt engineering involves crafting more effective prompts to guide the AI's response, optimizing results with minimal investment. Options A, B, and C require substantial computational resources and time, making them less cost-effective than prompt engineering for simply improving response accuracy.
150
A company wants to identify harmful language in the comments section of social media posts using a machine learning model without using labeled data for training. Which AWS service is most appropriate for this task? A. Amazon Rekognition moderation B. Amazon Comprehend toxicity detection C. Amazon SageMaker built-in algorithms D. Amazon Polly
B Amazon Comprehend toxicity detection is the most appropriate service because it uses natural language processing (NLP) to analyze text and identify harmful language such as toxic comments. It is a pre-trained capability, so the company does not need to supply labeled data for training. Option A is incorrect because Amazon Rekognition is designed for image and video analysis, not text. Option C is incorrect because, while SageMaker can be used to build custom models, the company would have to provide its own training data, which contradicts the problem statement. Option D is incorrect because Amazon Polly is a text-to-speech service and not relevant to identifying harmful language in text.
151
A media company wants to analyze viewer behavior and demographics to recommend personalized content. They want to deploy a customized ML model in their production environment and observe if the model quality drifts over time. Which AWS service or feature best meets these requirements? A. Amazon Rekognition B. Amazon SageMaker Clarify C. Amazon Comprehend D. Amazon SageMaker Model Monitor
D. Amazon SageMaker Model Monitor Amazon SageMaker Model Monitor is designed for continuous monitoring of the performance of machine learning models deployed in production. It helps identify potential quality drifts or shifts in data patterns over time, ensuring the model continues to perform as expected. This is crucial for tracking changes in viewer behavior and demographics while recommending personalized content. Options A, B, and C are incorrect because they do not directly address the need for monitoring model quality drift in a production environment. Amazon Rekognition is for image and video analysis, Amazon SageMaker Clarify focuses on model bias detection, and Amazon Comprehend is for natural language processing.
152
A company is deploying AI/ML models using AWS services. They want to offer transparency into the models’ decision-making processes and provide explanations for the model outputs. Which AWS service or feature best meets these requirements? A. Amazon SageMaker Model Cards B. Amazon Rekognition C. Amazon Comprehend D. Amazon Lex
A. Amazon SageMaker Model Cards Among the listed options, Amazon SageMaker Model Cards are the best fit. They promote transparency by documenting, in a standardized form, a model's purpose, performance, and limitations, which gives stakeholders the context they need to interpret the model's decisions and outputs. Options B, C, and D are incorrect because they are specific AI/ML services focused on particular tasks (image analysis, text analysis, and conversational AI, respectively) and do not offer the general transparency and documentation features provided by SageMaker Model Cards.
153
A manufacturing company wants to create product descriptions in multiple languages. Which AWS service will automate this task? A. Amazon Translate B. Amazon Transcribe C. Amazon Kendra D. Amazon Polly
A. Amazon Translate Amazon Translate is a machine learning-based service that provides high-quality automatic translation across multiple languages. It is ideal for quickly and efficiently generating product descriptions in different languages, allowing the company to easily reach a global audience. The other options are incorrect: Amazon Transcribe is for speech-to-text, Amazon Kendra is a search service, and Amazon Polly is for text-to-speech.
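A minimal sketch of how this could be automated: loop over the target languages and call the Amazon Translate `translate_text` operation for each one. The helper name, the sample text, and the language list are assumptions for illustration; `client` is expected to be a Translate client such as the one returned by `boto3.client("translate")`, which requires AWS credentials and is therefore passed in rather than created inside the block.

```python
def translate_description(client, text: str, source_lang: str, target_langs: list[str]) -> dict[str, str]:
    """Translate one product description into each target language.

    `client` is assumed to expose translate_text(Text=..., SourceLanguageCode=...,
    TargetLanguageCode=...) and return a dict containing "TranslatedText".
    """
    translations = {}
    for lang in target_langs:
        response = client.translate_text(
            Text=text,
            SourceLanguageCode=source_lang,
            TargetLanguageCode=lang,
        )
        translations[lang] = response["TranslatedText"]
    return translations


# Example usage (not executed here because it needs AWS credentials):
#   import boto3
#   client = boto3.client("translate")
#   translate_description(client, "Stainless steel bolt, 10 mm.", "en", ["es", "de", "ja"])
```

Passing the client in as a parameter keeps the helper easy to try out with a stand-in object before pointing it at the real service.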
154
A company wants more customized responses to its generative AI models’ prompts. Select the correct customization methodology from the following list for each use case. Each use case should be selected one time. [Image](https://img.examtopics.com/aws-certified-ai-practitioner-aif-c01/image11.png)
The correct answer is represented by the image: [Image](https://img.examtopics.com/aws-certified-ai-practitioner-aif-c01/image12.png) The correct mapping is based on the typical applications of each customization method:
* **"The models must be taught a new domain-specific task"**: Model fine-tuning is the appropriate choice. Fine-tuning adapts a pre-trained model to a new, specific task using a dataset relevant to that task.
* **"A limited amount of labeled data is available and more data is needed"**: Data augmentation is the best solution. This technique artificially increases the size of a limited dataset by creating modified versions of existing data points.
* **"Only unlabeled data is available"**: Continued pre-training is the correct method. This involves training the model further on a large amount of unlabeled data to improve its general understanding and performance before fine-tuning on a specific task.