Amazon SageMaker - Deep Dive Flashcards

Question

How does SageMaker Clarify compare different models?

Answer 1

By using human evaluations on specific tasks with metrics like brand voice and relevance.

Answer 2

Evaluations of friendliness, humor, and other subjective attributes of model outputs.

Answer 3

Yes, you can use an AWS-managed team or bring your own employees.

Answer 4

Understanding how and why a model makes predictions using interpretation tools.

Answer 5

To debug predictions, improve trust, and increase understanding of model behavior.

Answer 6

Explaining why a loan application was rejected by identifying the most influential features.
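
To make the loan example concrete, here is a minimal sketch for a linear scoring model, where each feature's contribution is its weight times the gap between the applicant's value and a baseline value. The weights, baseline, applicant, and function names are all hypothetical. SageMaker Clarify computes SHAP-based attributions, so this only illustrates the idea of ranking features by influence.

```python
def feature_contributions(weights, baseline, applicant):
    """Per-feature contribution to (applicant score - baseline score) in a linear model."""
    return {
        name: weights[name] * (applicant[name] - baseline[name])
        for name in weights
    }


def rank_by_influence(contributions):
    """Sort features by the absolute size of their contribution, largest first."""
    return sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Hypothetical model weights and values, for illustration only.
weights = {"income_k": 0.02, "debt_ratio": -3.0, "late_payments": -0.5}
baseline = {"income_k": 60.0, "debt_ratio": 0.30, "late_payments": 1.0}
applicant = {"income_k": 55.0, "debt_ratio": 0.60, "late_payments": 4.0}

contributions = feature_contributions(weights, baseline, applicant)
ranked = rank_by_influence(contributions)

for name, value in ranked:
    print(f"{name}: {value:+.2f}")
```

In this toy case the number of late payments pulls the score down the most, followed by the debt ratio, which is the kind of statement an explanation for a rejected application would make.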

Answer 7

It detects and measures biases in datasets and models using statistical metrics.

Answer 8

Class imbalance or demographic representation issues like gender or age bias.
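
Class imbalance can be expressed as a single number. The sketch below computes the normalized difference between the sizes of two groups, (n_a - n_d) / (n_a + n_d), which ranges from -1 to 1 and equals 0 when the groups are balanced. The record layout and the "gender" field are assumptions made for illustration.

```python
from collections import Counter


def class_imbalance(n_advantaged, n_disadvantaged):
    """Normalized size difference between two groups, in the range [-1, 1]."""
    total = n_advantaged + n_disadvantaged
    if total == 0:
        raise ValueError("both groups are empty")
    return (n_advantaged - n_disadvantaged) / total


# Hypothetical dataset: 800 records for one group, 200 for the other.
records = [{"gender": "male"}] * 800 + [{"gender": "female"}] * 200
counts = Counter(r["gender"] for r in records)

ci = class_imbalance(counts["male"], counts["female"])
print(f"class imbalance: {ci:.2f}")
```

A value close to 1 or -1 means one group dominates the dataset, which is a signal to rebalance or reweight before training.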

Answer 9

Data labeling, human review, model customization, and alignment using RLHF.

Answer 10

Labeling images by identifying objects like dogs, cats, or ships with human annotators.

Answer 11

Employees, third-party reviewers, or Amazon Mechanical Turk workers.

Answer 12

An enhanced version of Ground Truth that uses an expert workforce for data labeling tasks.

Answer 13

To gather essential model information like intended use, risk rating, and training details in one place.

Answer 14

A centralized view to monitor, explore, and track all SageMaker models with insights on quality, risk, and bias.

Answer 15

To define roles and permissions for different personas like data scientists or MLOps engineers.

Answer 16

To track the performance of deployed models and detect data or prediction quality deviations.

Answer 17

Continuously or on a schedule (e.g., daily, weekly).

Answer 18

Investigate and fix the issue by updating data or retraining the model.
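
The kind of deviation described in these monitoring cards can be illustrated with a simple baseline comparison. This is a sketch under stated assumptions, not Model Monitor's actual logic: it flags drift when the mean of live data moves more than a chosen number of baseline standard deviations away from the baseline mean. The function names, the sample data, and the threshold of 2.0 are hypothetical.

```python
import statistics


def mean_shift(baseline, live):
    """Shift of the live mean from the baseline mean, in baseline standard deviations."""
    base_std = statistics.stdev(baseline)
    if base_std == 0:
        raise ValueError("baseline has zero variance")
    return abs(statistics.mean(live) - statistics.mean(baseline)) / base_std


def has_drifted(baseline, live, threshold=2.0):
    """True when the live data has moved further than the threshold allows."""
    return mean_shift(baseline, live) > threshold


# Hypothetical feature values captured at training time and in production.
baseline_values = [10, 12, 11, 13, 9, 10, 12, 11]
live_stable = [11, 10, 12, 11]
live_shifted = [18, 19, 17, 20]

print(has_drifted(baseline_values, live_stable))
print(has_drifted(baseline_values, live_shifted))
```

A check like this would run on each scheduled monitoring window, and a positive result is the trigger to investigate the data or retrain the model.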

Answer 19

A centralized model catalog for versioning, managing metadata, and controlling model approval workflows.

Answer 20

By setting an approval status and involving human reviewers before deployment.
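
As a rough sketch of the approval step, the helper below builds the request that a reviewer's decision could translate into. The helper name and the ARN are hypothetical placeholders. The three status values and the parameter names follow the SageMaker `UpdateModelPackage` API, and the actual call is left in a comment because it needs AWS credentials.

```python
VALID_STATUSES = {"Approved", "Rejected", "PendingManualApproval"}


def build_approval_request(model_package_arn, status, note=""):
    """Build keyword arguments for an approval-status update on a model version."""
    if status not in VALID_STATUSES:
        raise ValueError(f"unknown approval status: {status!r}")
    request = {
        "ModelPackageArn": model_package_arn,
        "ModelApprovalStatus": status,
    }
    if note:
        request["ApprovalDescription"] = note
    return request


# Placeholder ARN, for illustration only.
arn = "arn:aws:sagemaker:us-east-1:111122223333:model-package/example-group/1"
request = build_approval_request(arn, "Approved", note="Reviewed by ML lead")

# With credentials configured, the request could be sent with boto3:
#   import boto3
#   boto3.client("sagemaker").update_model_package(**request)
print(request)
```

Keeping new versions in `PendingManualApproval` until a human changes the status is what lets a deployment workflow wait for review.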

Answer 21

A CI/CD workflow tool for automating machine learning model building, training, and deployment.

Answer 22

They automate workflows, reduce manual errors, ensure repeatability, and accelerate iteration.

Answer 23

Processing, Training, Tuning, AutoML, Model, ClarifyCheck, and QualityCheck.

Answer 24

Performs data preparation like feature engineering.

Answer 25

Trains the machine learning model on prepared data.

Answer 26

Optimizes model performance through hyperparameter tuning.

Answer 27

Automatically trains models with minimal configuration.

Answer 28

Creates and registers a SageMaker model (optionally to the Model Registry).

Answer 29

Performs bias and explainability analysis using SageMaker Clarify.

Answer 30

Checks the data or model quality against a defined baseline.
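
The step types above are typically chained into a dependency graph. The sketch below does not use the SageMaker SDK. It models a hypothetical pipeline as a plain dictionary and uses Python's standard `graphlib` module to derive a valid execution order, which is the core idea behind running steps as a workflow.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each step maps to the set of steps it depends on.
dependencies = {
    "Processing": set(),
    "Training": {"Processing"},
    "QualityCheck": {"Training"},
    "ClarifyCheck": {"Training"},
    "Model": {"QualityCheck", "ClarifyCheck"},
}


def execution_order(deps):
    """Return the steps in an order that respects every dependency."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(" -> ".join(order))
```

Here the model is only created and registered after both checks have run, so a failed quality or bias check can stop the pipeline before registration.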

Answer 31

It allows quick insights and actions on models that violate performance or fairness thresholds.

Answer 32

It enables tracking changes across different versions of models, ensuring reproducibility and control.

Answer 33

A machine learning hub in SageMaker to find and launch pre-trained models (NLP, CV, etc.) and ML solutions quickly.

Answer 34

Pre-trained models from providers like Hugging Face, Meta, Stability AI, Databricks, and more.

Answer 35

The Machine Learning Hub (pre-trained models) and Machine Learning Solutions (pre-built use case templates).

Answer 36

You can launch them, customize with your own data, fine-tune, and deploy them on SageMaker.

Answer 37

A no-code, visual interface for building ML models using your data.

Answer 38

Non-developers or business users who want to build ML models without coding.

Answer 39

SageMaker Autopilot, which uses AutoML to build models.

Answer 40

Amazon Comprehend, Rekognition, and Textract.

Answer 41

Sentiment analysis (Comprehend), object detection (Rekognition), document analysis (Textract).

Answer 42

An open-source tool to manage the ML lifecycle: tracking experiments, models, and workflows.

Answer 43

Yes, SageMaker allows you to launch an MLflow Tracking Server directly in SageMaker Studio.

Answer 44

To manage experiments and track training runs as part of ML model development.

Answer 45

Seamless use of open-source ML lifecycle tools within the SageMaker ecosystem.

Answer 46

A drag-and-drop, no-code visual interface to simplify the ML pipeline.

Answer 47

JumpStart helps you launch and customize pre-trained models; Canvas lets you build models visually without code.

Answer 48

An end-to-end machine learning service to build, train, and deploy ML models at scale.

Answer 49

To tune hyperparameters of ML models automatically.
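
Random search is one simple strategy for automatic tuning. The sketch below is a generic illustration in plain Python, not the SageMaker tuning API: it samples hyperparameters from given ranges and keeps the best-scoring set. The toy objective, the parameter names, and the ranges are hypothetical stand-ins for a real training job and its validation metric.

```python
import random


def random_search(objective, space, trials, seed=0):
    """Sample `trials` random configurations and return the best one with its score."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(trials):
        params = {name: rng.uniform(lo, hi) for name, (lo, hi) in space.items()}
        score = objective(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_objective(params):
    """Stand-in for a validation metric; highest at learning_rate=0.1, dropout=0.3."""
    return -((params["learning_rate"] - 0.1) ** 2) - (params["dropout"] - 0.3) ** 2


space = {"learning_rate": (0.001, 0.5), "dropout": (0.0, 0.6)}
best_params, best_score = random_search(toy_objective, space, trials=50)
print(best_params, best_score)
```

In a managed tuning job each trial would be a separate training run, and the service would track the metric and pick the best configuration in the same way.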

Answer 50

Real-time, serverless, batch, and asynchronous inference.

Answer 51

A unified IDE for building, training, debugging, and deploying ML models end-to-end.

Answer 52

A tool to import, explore, process, and prepare data for ML.

Answer 53

A centralized repository to store and retrieve ML features across teams and pipelines.

Answer 54

Provides model comparison, bias detection, and model explainability.

Answer 55

A tool for data labeling and reinforcement learning from human feedback (RLHF).

Answer 56

A documentation tool to describe a model's intended use, risk, and training details.

Answer 57

A centralized place to track all deployed models and their performance metrics.

Answer 58

Monitors deployed models for data drift and quality issues, and sends alerts when deviations are detected.

Answer 59

A catalog for managing, versioning, and approving ML models before deployment.

Answer 60

A CI/CD service for automating ML workflows including data prep, training, and deployment.

Answer 61

A service to manage access and permissions for different user roles in SageMaker.

Answer 62

A hub for pre-trained models and pre-built ML solutions for rapid prototyping.

Answer 63

A no-code interface to build and deploy ML models visually without programming.

Answer 64

An integration that lets you use open-source MLflow to track experiments and manage ML lifecycles.

Answer 65

A security feature that prevents outbound network access from training/inference containers to protect data and prevent leaks.

Answer 66

Containers cannot make outbound network calls to the internet, Amazon S3, or your VPC; only data that SageMaker loads into the container is available for training.

Answer 67

Forecasting time series data.

Answer 68

Recurrent Neural Network (RNN).

(92 cards)