SageMaker Flashcards by Luis Dantas

You can use SageMaker Model Registry to

create a catalog of models for production
manage the versions of a model
associate metadata to the model
manage approvals and automate model deployment for CI/CD

SageMaker Experiments is

a feature of SageMaker Studio that you can use to automatically create ML experiments by using different combinations of data, algorithms, and parameters.

Amazon SageMaker Model Monitor

Monitors the quality of Amazon SageMaker AI machine learning models in production. With Model Monitor, you can set up:

Continuous monitoring with a real-time endpoint.
Continuous monitoring with a batch transform job that runs regularly.
On-schedule monitoring for asynchronous batch transform jobs.

Amazon SageMaker Model Monitor Data Capture

Is a feature of SageMaker endpoints.
- Records data that you can then use for training, debugging, and monitoring.
- Lets you use the newly captured data to retrain the model.
- Runs asynchronously without impacting production traffic.
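
As a rough sketch, Data Capture is usually switched on through the `DataCaptureConfig` block of an endpoint configuration. The helper below only builds that dictionary and calls no AWS service. The function name, bucket URI, and sampling percentage are hypothetical, and the field names reflect my understanding of the `CreateEndpointConfig` API, so verify them against the current reference before relying on them.

```python
def build_data_capture_config(s3_uri, sampling_percentage=100):
    """Build a DataCaptureConfig-style dictionary (illustrative helper, not an AWS API)."""
    if not 0 <= sampling_percentage <= 100:
        raise ValueError("sampling_percentage must be between 0 and 100")
    return {
        "EnableCapture": True,
        # Share of requests to record; 100 captures every request.
        "InitialSamplingPercentage": sampling_percentage,
        # Hypothetical bucket; captured records are written under this prefix.
        "DestinationS3Uri": s3_uri,
        # Capture both the request payload and the model response.
        "CaptureOptions": [{"CaptureMode": "Input"}, {"CaptureMode": "Output"}],
    }


config = build_data_capture_config("s3://example-bucket/data-capture", 50)
```

A dictionary like this would be passed alongside the production variants when the endpoint configuration is created.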

SageMaker Clarify

Provides tools to help explain how machine learning (ML) models make predictions.
- Checks for bias and supports explainability in datasets and models.
- Can check for bias by analyzing predictions after you deploy the model.

TensorBoard is a capability of SageMaker that you can use to

visualize and analyze intermediate tensors during model training.
gain full visibility into the model training process, including debugging and model optimization.
debug issues, including lower than expected precision for a specific class.
analyze the intermediate activations and gradients during training.
gain insights into why some inputs, such as particular images, are misclassified.

SageMaker Pipelines

is a workflow orchestration service within SageMaker.
supports the use of batch transforms to run inference of entire datasets.

SageMaker Pipelines Batch transforms

Are the most cost-effective inference method for models that are called only on a periodic basis.

SageMaker asynchronous endpoint

Can be configured with a connection to your VPC.
Processes requests in near real time, with up to 60 minutes of processing time per request.
Supports payloads up to 1 GB.
Has no idle cost when autoscaling is configured to scale the instance count down to zero.

SageMaker real-time endpoint

Can process each request for up to 60 seconds only.
Lets callers receive a response for each request in real time.
Can be configured with a VPC.

Use a SageMaker batch transform job to run inference when

you do not need a persistent endpoint
you need to run inference inside a VPC (you can configure a VPC for SageMaker batch transform).
you do not need to return an inference for each request to the model.
require a minimum size of 100 MB for the inference request dataset.

SageMaker serverless endpoint

Returns a response for each request in real time.
Does not support VPC configuration for the endpoint.
Supports processing times of up to 60 seconds.
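
The inference cards above can be condensed into a small decision helper. This is only a sketch that encodes the limits stated on these cards (60 seconds for real-time and serverless, 60 minutes and 1 GB for asynchronous). The function, its parameters, and the rule that intermittent traffic favors serverless are my own assumptions, not an AWS API or official guidance.

```python
REAL_TIME_LIMIT_S = 60          # real-time and serverless processing limit from the cards
ASYNC_LIMIT_S = 60 * 60         # asynchronous processing limit from the cards
ASYNC_MAX_PAYLOAD_GB = 1.0      # asynchronous payload limit from the cards


def choose_inference_option(per_request_response, processing_seconds,
                            payload_gb=0.0, needs_vpc=False,
                            intermittent_traffic=False):
    """Pick an inference option from the card facts (hypothetical helper)."""
    if not per_request_response:
        # No persistent endpoint needed, no response per request.
        return "batch transform"
    if processing_seconds <= REAL_TIME_LIMIT_S:
        # Assumption: serverless suits intermittent traffic; it has no VPC support.
        if intermittent_traffic and not needs_vpc:
            return "serverless endpoint"
        return "real-time endpoint"
    if processing_seconds <= ASYNC_LIMIT_S and payload_gb <= ASYNC_MAX_PAYLOAD_GB:
        return "asynchronous endpoint"
    # Beyond the asynchronous limits, fall back to offline processing.
    return "batch transform"


print(choose_inference_option(True, 600, payload_gb=0.5))
```

Real workloads involve more factors (cost, cold starts, payload limits not listed on these cards), so treat the helper as a memory aid only.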

SageMaker Canvas

no-code ML interface to create models.
does not provide a separate network protection mechanism.
import existing models to SageMaker
requires the

SageMaker network isolation.

Blocks internet access and access to external networks, including the customer’s VPC, for the training or inference container.

SageMaker input modes

S3 File: downloads the training data from the storage location to a local directory
S3 FastFile: file system access to an Amazon S3 data source. Training can start without waiting for the entire dataset to download
S3 Pipe: streams data directly from an Amazon S3 data source
FSx: Requires Amazon Virtual Private Cloud (VPC)
EFS: Requires Amazon Virtual Private Cloud (VPC)
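
For the three S3 modes, the choice is typically expressed as the `InputMode` of a training input channel. The sketch below builds one such channel as a plain dictionary. The helper name, channel name, and S3 URI are hypothetical, and the field names follow my reading of the `CreateTrainingJob` API, so confirm them against the API reference.

```python
S3_INPUT_MODES = {"File", "FastFile", "Pipe"}


def build_s3_channel(name, s3_uri, input_mode="File"):
    """Build an InputDataConfig-style channel dictionary (illustrative helper)."""
    if input_mode not in S3_INPUT_MODES:
        raise ValueError(f"input_mode must be one of {sorted(S3_INPUT_MODES)}")
    return {
        "ChannelName": name,
        # File downloads first, FastFile exposes file system access, Pipe streams.
        "InputMode": input_mode,
        "DataSource": {
            "S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": s3_uri,  # hypothetical location
                "S3DataDistributionType": "FullyReplicated",
            }
        },
    }


channel = build_s3_channel("train", "s3://example-bucket/train/", "FastFile")
```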

SageMaker FrameworkProcessor provides premade containers for the following machine learning frameworks:

Hugging Face
MXNet
PyTorch
TensorFlow
XGBoost.

SageMaker AMT

searches for the most suitable version of a model by running training jobs based on the algorithm and objective criteria.

You can use a SageMaker AMT warm start tuning job to

use the results from previous training jobs as a starting point.

SageMaker AMT can use early stopping to

compare the current objective metric (for example, accuracy) of a training job against the median of the running averages of the objective metric from previous training jobs up to the same epoch. If the current job is doing worse than that median, early stopping can stop it.
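
The median comparison can be illustrated with a few lines of plain Python. This is a simplified sketch of the idea, not SageMaker's implementation: the function names, the per-epoch history lists, and the sample accuracies are all made up for illustration.

```python
from statistics import median


def running_average(values):
    """Mean of the metric values observed so far."""
    return sum(values) / len(values)


def should_stop_early(current_metric, previous_job_histories, epoch, maximize=True):
    """Return True if the current job is worse than the median running average.

    previous_job_histories: one list of per-epoch metric values per earlier job.
    epoch: 1-based epoch at which the current job reported current_metric.
    """
    averages = [
        running_average(history[:epoch])
        for history in previous_job_histories
        if len(history) >= epoch  # only jobs that reached this epoch
    ]
    if not averages:
        return False  # nothing to compare against yet
    threshold = median(averages)
    return current_metric < threshold if maximize else current_metric > threshold


previous = [[0.60, 0.70, 0.80], [0.50, 0.60, 0.70], [0.70, 0.80, 0.90]]
print(should_stop_early(0.65, previous, epoch=3))  # below the median of ~0.70
print(should_stop_early(0.75, previous, epoch=3))  # above the median of ~0.70
```

With these made-up histories the running averages at epoch 3 are roughly 0.70, 0.60, and 0.80, so a job reporting 0.65 would be stopped and one reporting 0.75 would continue.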

SageMaker AMT IDENTICAL_DATA_AND_ALGORITHM setting

assumes the same input data and training image as the previous tuning jobs
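
In the low-level API, a warm start is, as far as I know, described by a `WarmStartConfig` block that names the parent tuning jobs and the warm start type. The sketch below builds that block as a dictionary only. The helper and the job name are hypothetical, and the string values are my understanding of the `CreateHyperParameterTuningJob` API (the SDK constant `IDENTICAL_DATA_AND_ALGORITHM` corresponds to `"IdenticalDataAndAlgorithm"`).

```python
WARM_START_TYPES = {"IdenticalDataAndAlgorithm", "TransferLearning"}


def build_warm_start_config(parent_job_names,
                            warm_start_type="IdenticalDataAndAlgorithm"):
    """Build a WarmStartConfig-style dictionary (illustrative helper)."""
    if warm_start_type not in WARM_START_TYPES:
        raise ValueError(f"warm_start_type must be one of {sorted(WARM_START_TYPES)}")
    if not parent_job_names:
        raise ValueError("at least one parent tuning job is required")
    return {
        # Earlier tuning jobs whose results seed the new job.
        "ParentHyperParameterTuningJobs": [
            {"HyperParameterTuningJobName": name} for name in parent_job_names
        ],
        "WarmStartType": warm_start_type,
    }


warm_start = build_warm_start_config(["example-tuning-job-1"])  # hypothetical job name
```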

Hyperparameter tuning can

accelerate your productivity by trying many variations of a model.

AMT MaxNumberOfTrainingJobs

The maximum number of training jobs to be run before tuning is stopped.

AMT MaxNumberOfTrainingJobsNotImproving

The maximum number of training jobs that do not improve performance against the objective metric from the current best training job. For example, if the best training job returned an accuracy of 90% and MaxNumberOfTrainingJobsNotImproving is set to 10, tuning stops after 10 training jobs fail to return an accuracy higher than 90%.
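
The interaction of the two completion criteria can be simulated in a few lines. This is a simplified, sequential sketch with made-up accuracies and a hypothetical function name. Real tuning jobs run training jobs in parallel, so the exact count at which tuning stops may differ.

```python
def jobs_run_before_stop(accuracies, max_jobs, max_not_improving):
    """Count how many jobs run before either stopping criterion triggers."""
    best = None
    not_improving = 0
    jobs_run = 0
    for accuracy in accuracies:
        if jobs_run >= max_jobs:
            break  # MaxNumberOfTrainingJobs reached
        jobs_run += 1
        if best is None or accuracy > best:
            best = accuracy
            not_improving = 0  # a new best resets the counter
        else:
            not_improving += 1
            if not_improving >= max_not_improving:
                break  # MaxNumberOfTrainingJobsNotImproving reached
    return jobs_run, best


# Two improving jobs, then a long run of jobs that never beat 90%.
accuracies = [0.85, 0.90] + [0.89] * 20
print(jobs_run_before_stop(accuracies, max_jobs=50, max_not_improving=10))  # (12, 0.9)
```

Here tuning stops after 12 jobs: two that set the best accuracy, followed by 10 that fail to beat 90%.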

SageMaker ModelBiasMonitor class

create a bias baseline and deploy a monitoring mechanism that evaluates whether the model bias deviates from the bias baseline.

SageMaker DefaultModelMonitor class

generate statistics and constraints around the data and deploy a monitoring mechanism that evaluates whether data drift has occurred.

SageMaker ModelExplainabilityMonitor class

generate a feature attribution SHAP baseline and deploy a monitoring mechanism that evaluates whether feature attribution drift has occurred.

SageMaker ModelQualityMonitor class

generate a model quality baseline and deploy a monitoring mechanism that evaluates whether model quality drift has occurred.

Random oversampling

Randomly duplicates samples in the minority category. For example, if you're trying to detect fraud, you might only have cases of fraud in 10% of your data. For an equal proportion of fraudulent and non-fraudulent cases, this operator randomly duplicates fraud cases within the dataset 8 times.
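
The fraud example works out as follows: with 10 fraud rows and 90 non-fraud rows, 80 duplicated fraud rows (8 extra copies per original, on average) bring both classes to 90. Below is a minimal standard-library sketch of that idea. The function name and row layout are hypothetical, and it is not the Data Wrangler operator itself.

```python
import random


def random_oversample(rows, label_of, minority_label, seed=0):
    """Duplicate random minority rows until both groups are the same size."""
    rng = random.Random(seed)  # fixed seed for repeatable sampling
    minority = [r for r in rows if label_of(r) == minority_label]
    majority = [r for r in rows if label_of(r) != minority_label]
    if not minority:
        raise ValueError("no rows with the minority label")
    extra_needed = max(len(majority) - len(minority), 0)
    # Sample with replacement from the minority rows.
    extras = [rng.choice(minority) for _ in range(extra_needed)]
    return list(rows) + extras


rows = [("fraud", i) for i in range(10)] + [("legit", i) for i in range(90)]
balanced = random_oversample(rows, label_of=lambda r: r[0], minority_label="fraud")
print(len(balanced))  # 180 rows: 90 fraud, 90 legit
```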

Random undersampling

Roughly equivalent to random oversampling. Randomly removes samples from the overrepresented category to get the proportion of samples that you desire.
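
A matching sketch for undersampling, again with only the standard library. The function name and row layout are hypothetical, and the target proportion is fixed at an even split for simplicity.

```python
import random


def random_undersample(rows, label_of, majority_label, seed=0):
    """Randomly drop majority rows until both groups are the same size."""
    rng = random.Random(seed)  # fixed seed for repeatable sampling
    majority = [r for r in rows if label_of(r) == majority_label]
    others = [r for r in rows if label_of(r) != majority_label]
    # Keep a random subset of the majority, sampled without replacement.
    kept = rng.sample(majority, min(len(majority), len(others)))
    return others + kept


rows = [("fraud", i) for i in range(10)] + [("legit", i) for i in range(90)]
reduced = random_undersample(rows, label_of=lambda r: r[0], majority_label="legit")
print(len(reduced))  # 20 rows: 10 fraud, 10 legit
```

Unlike oversampling, this discards data, which is the main trade-off between the two techniques.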

Synthetic Minority Oversampling Technique (SMOTE)

Uses samples from the underrepresented category to interpolate new synthetic minority samples.
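
The interpolation step can be sketched in plain Python: pick a minority point, pick one of its nearest minority neighbors, and create a new point somewhere on the line segment between them. This is a simplified illustration with a hypothetical function name and made-up points, not a full SMOTE implementation and not the one SageMaker uses.

```python
import random


def smote_sample(minority_points, k=2, n_new=5, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbors."""
    if len(minority_points) < 2:
        raise ValueError("need at least two minority points")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority_points)
        # k nearest minority neighbors of base, by squared Euclidean distance.
        neighbors = sorted(
            (p for p in minority_points if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        neighbor = rng.choice(neighbors)
        t = rng.random()  # position along the segment from base to neighbor
        synthetic.append(tuple(a + t * (b - a) for a, b in zip(base, neighbor)))
    return synthetic


points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_sample(points, k=2, n_new=10)
```

Because each synthetic point lies between two existing minority points, the new samples stay inside the region the minority class already occupies.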

Amazon SageMaker Neo

Automatically optimizes machine learning models for inference on cloud instances and edge devices to run faster with no loss in accuracy. Neo optimizes the trained model and compiles it into an executable.

SageMaker Debugger

- Provides tools to debug training jobs and resolve problems to improve the performance of your model.
- Helps to solve problems such as **overfitting**, **saturated activation functions**, and **vanishing gradients**.

SageMaker Flashcards

(32 cards)