SageMaker Flashcards

(32 cards)

1
Q

You can use SageMaker Model Registry to

A
  • create a catalog of models for production
  • manage the versions of a model
  • associate metadata to the model
  • manage approvals and automate model deployment for CI/CD
2
Q

SageMaker Experiments is

A

A feature of SageMaker Studio that you can use to create, track, and compare ML experiments that run different combinations of data, algorithms, and parameters.

3
Q

Amazon SageMaker Model Monitor

A

Monitors the quality of Amazon SageMaker AI machine learning models in production. With Model Monitor, you can set up:

  • Continuous monitoring with a real-time endpoint.
  • Continuous monitoring with a batch transform job that runs regularly.
  • On-schedule monitoring for asynchronous batch transform jobs.
4
Q

Amazon SageMaker Model Monitor Data Capture

A

A feature of SageMaker endpoints that:
- records inference data (requests and responses) that you can then use for training, debugging, and monitoring.
- lets you use the newly captured data to retrain the model.
- runs asynchronously without impacting production traffic.
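
As a rough sketch of how Data Capture is switched on, the helper below builds the `DataCaptureConfig` dictionary that the boto3 `create_endpoint_config` call accepts. The helper name and the bucket URI are hypothetical, and no AWS call is made here.

```python
def build_data_capture_config(s3_uri, sampling_percentage=100):
    """Return a DataCaptureConfig dict for an endpoint configuration request."""
    if not 0 <= sampling_percentage <= 100:
        raise ValueError("sampling_percentage must be between 0 and 100")
    return {
        "EnableCapture": True,
        "InitialSamplingPercentage": sampling_percentage,
        "DestinationS3Uri": s3_uri,
        # Capture both the request payload and the model response.
        "CaptureOptions": [{"CaptureMode": "Input"}, {"CaptureMode": "Output"}],
    }


# Hypothetical bucket; capture half of the traffic.
config = build_data_capture_config("s3://example-bucket/data-capture", 50)
```

The resulting dictionary would be passed as the `DataCaptureConfig` argument when the endpoint configuration is created.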

5
Q

SageMaker Clarify

A

Provides tools to help explain how machine learning (ML) models make predictions.
- Checks for bias and explainability in datasets and models.
- Can check for bias in the data before training and in model predictions after training or deployment.

6
Q

TensorBoard is a capability of SageMaker that you can use to

A
  • visualize and analyze intermediate tensors during model training.
  • gain full visibility into the model training process, including debugging and model optimization.
  • debug issues such as lower-than-expected precision for a specific class.
  • analyze the intermediate activations and gradients during training.
  • investigate why particular inputs, such as certain mobile phone images, are misclassified.
7
Q

SageMaker Pipelines

A
  • is a workflow orchestration service within SageMaker.
  • supports the use of batch transforms to run inference on entire datasets.
8
Q

SageMaker Pipelines Batch transforms

A

Are the most cost-effective inference method for models that are invoked only periodically, because no endpoint stays running between jobs.

9
Q

SageMaker asynchronous endpoint

A
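  • can be configured with a connection to your VPC.
  • processes requests in near real time, with up to 60 minutes of processing time per request.
  • accepts payloads up to 1 GB.
  • can scale to zero instances when there are no requests (with auto scaling configured), so an idle asynchronous endpoint incurs no instance cost.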
10
Q

SageMaker real-time endpoint

A
  • can process each request for up to 60 seconds only.
  • returns a response for each request in real time.
  • can be configured with a VPC.
11
Q

Use a SageMaker batch transform job to run inference when

A
  • you do not need a persistent endpoint.
  • you need to run inference inside a VPC (you can configure a VPC for batch transform).
  • you do not need an immediate response to each individual request.
  • the inference dataset is large, for example at least 100 MB.
12
Q

SageMaker serverless endpoint

A
  • returns a response for each request in real time.
  • does not support VPC configuration for the endpoint.
  • supports processing times of up to 60 seconds.
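
The limits on the four inference cards (real-time, serverless, asynchronous, batch transform) can be combined into a small decision helper. This is only a sketch built from the numbers on these cards. The function name and parameters are hypothetical, and payload limits for real-time and serverless endpoints are not covered by the cards, so they are not checked.

```python
def candidate_inference_options(per_request_response, processing_seconds,
                                payload_mb, needs_vpc):
    """List the inference options whose card limits fit the workload."""
    if not per_request_response:
        # No response per request is needed, so no persistent endpoint is needed.
        return ["batch transform"]

    options = []
    if processing_seconds <= 60:
        options.append("real-time endpoint")       # VPC can be configured
        if not needs_vpc:
            options.append("serverless endpoint")  # no VPC support
    if processing_seconds <= 60 * 60 and payload_mb <= 1024:
        options.append("asynchronous endpoint")    # up to 60 minutes, 1 GB
    return options


# A 30-minute job with a 500 MB payload only fits the asynchronous endpoint.
long_job = candidate_inference_options(True, 1800, 500, needs_vpc=True)
```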
13
Q

SageMaker Canvas

A
  • no-code ML interface to create models.
  • does not provide a separate network protection mechanism.
  • import existing models to SageMaker
  • requires the
14
Q

SageMaker network isolation.

A

Blocks internet access and access to external networks, including the customer’s VPC, for the training or inference container.

15
Q

SageMaker input modes

A
  • S3 File: downloads the training data from the storage location to a local directory
  • S3 FastFile: file system access to an Amazon S3 data source. Training can start without waiting for the entire dataset to download
  • S3 Pipe: streams data directly from an Amazon S3 data source
  • FSx: Requires Amazon Virtual Private Cloud (VPC)
  • EFS: Requires Amazon Virtual Private Cloud (VPC)
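
To make the input modes concrete, the sketch below builds one channel of the `InputDataConfig` list used in a boto3 `create_training_job` request. The helper names, bucket, and file system ID are hypothetical. The VPC check is enforced locally only to mirror the card and is not how the service itself validates a request.

```python
def s3_channel(name, s3_uri, input_mode="File"):
    """Build a training channel that reads from Amazon S3."""
    if input_mode not in ("File", "FastFile", "Pipe"):
        raise ValueError("input_mode must be File, FastFile, or Pipe")
    return {
        "ChannelName": name,
        "InputMode": input_mode,
        "DataSource": {
            "S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": s3_uri,
                "S3DataDistributionType": "FullyReplicated",
            }
        },
    }


def file_system_channel(name, file_system_id, file_system_type,
                        directory_path, vpc_configured):
    """Build a training channel that reads from Amazon EFS or FSx for Lustre."""
    if file_system_type not in ("EFS", "FSxLustre"):
        raise ValueError("file_system_type must be EFS or FSxLustre")
    if not vpc_configured:
        # Both file system sources require the training job to run in a VPC.
        raise ValueError("file system channels require a VPC configuration")
    return {
        "ChannelName": name,
        "DataSource": {
            "FileSystemDataSource": {
                "FileSystemId": file_system_id,
                "FileSystemAccessMode": "ro",
                "FileSystemType": file_system_type,
                "DirectoryPath": directory_path,
            }
        },
    }


# Hypothetical bucket; FastFile lets training start before the full download.
train_channel = s3_channel("train", "s3://example-bucket/train", "FastFile")
```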
16
Q

SageMaker FrameworkProcessor provides premade containers for the following machine learning frameworks:

A
  • Hugging Face
  • MXNet
  • PyTorch
  • TensorFlow
  • XGBoost
17
Q

SageMaker AMT

A

Automatic model tuning (AMT) searches for the best version of a model by running many training jobs over the hyperparameter ranges that you specify, using your algorithm and objective metric.

18
Q

You can use a SageMaker AMT warm start tuning job to

A

use the results of the training jobs from previous tuning jobs as a starting point.

19
Q

SageMaker AMT can use early stopping to

A

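stop training jobs that are unlikely to improve the result. After each epoch, it compares the current training job's objective metric (for example, accuracy) against the median of the running averages of the objective metric from previous training jobs up to the same epoch. If the current value is worse than that median, early stopping stops the current training job.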
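
The median comparison can be sketched in a few lines of plain Python. This is a simplified illustration of the rule as the card describes it, not the service's actual implementation. The function names and the metric histories are made up for the example.

```python
from statistics import median


def running_average(values):
    """Average of the metric values seen so far."""
    return sum(values) / len(values)


def should_stop_early(current_history, previous_histories, maximize=True):
    """Return True if the current job is worse than the median of the
    previous jobs' running averages at the same epoch."""
    epoch = len(current_history)
    if epoch == 0:
        return False
    averages = [running_average(history[:epoch])
                for history in previous_histories
                if len(history) >= epoch]
    if not averages:
        return False  # nothing to compare against yet
    threshold = median(averages)
    current = current_history[-1]
    return current < threshold if maximize else current > threshold


# Accuracy per epoch for three earlier training jobs (made-up numbers).
previous = [[0.60, 0.70, 0.80], [0.50, 0.65, 0.75], [0.55, 0.60, 0.70]]
weak_job_stops = should_stop_early([0.40, 0.50], previous)
strong_job_stops = should_stop_early([0.50, 0.70], previous)
```

With these numbers the weak job falls below the median running average at epoch 2 and would be stopped, while the strong job would continue.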

20
Q

SageMaker AMT IDENTICAL_DATA_AND_ALGORITHM setting

A

Assumes the same input data and training image as the previous tuning jobs.
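
As a sketch of where this setting goes, the helper below builds the `WarmStartConfig` dictionary for a boto3 `create_hyper_parameter_tuning_job` request. In the low-level API the value is spelled `IdenticalDataAndAlgorithm`, while the SageMaker Python SDK exposes it as `IDENTICAL_DATA_AND_ALGORITHM`. The helper name and the parent job name are hypothetical.

```python
def build_warm_start_config(parent_job_names,
                            warm_start_type="IdenticalDataAndAlgorithm"):
    """Return a WarmStartConfig dict that points at earlier tuning jobs."""
    allowed = {"IdenticalDataAndAlgorithm", "TransferLearning"}
    if warm_start_type not in allowed:
        raise ValueError(f"warm_start_type must be one of {sorted(allowed)}")
    if not parent_job_names:
        raise ValueError("at least one parent tuning job is required")
    return {
        "ParentHyperParameterTuningJobs": [
            {"HyperParameterTuningJobName": name} for name in parent_job_names
        ],
        "WarmStartType": warm_start_type,
    }


# Hypothetical name of a tuning job that has already finished.
warm_start = build_warm_start_config(["xgboost-tuning-2024-01"])
```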

21
Q

Hyperparameter tuning can

A

accelerate your productivity by trying many variations of a model.

22
Q

AMT MaxNumberOfTrainingJobs

A

The maximum number of training jobs to be run before tuning is stopped.

23
Q

AMT MaxNumberOfTrainingJobsNotImproving

A

The maximum number of training jobs that do not improve performance against the objective metric from the current best training job. For example, if the best training job returned an objective metric with an accuracy of 90% and MaxNumberOfTrainingJobsNotImproving is set to 10, tuning stops after 10 training jobs fail to return an accuracy higher than 90%.
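
The interaction between this limit and MaxNumberOfTrainingJobs can be sketched as a small simulation. It assumes the counter resets whenever a job beats the current best, which is one reading of the card rather than a documented detail. The function name and accuracy values are made up.

```python
def jobs_run_before_stop(accuracies, max_not_improving, max_jobs=None):
    """Count how many training jobs run before tuning stops."""
    best = None
    not_improving = 0
    for count, accuracy in enumerate(accuracies, start=1):
        if best is None or accuracy > best:
            best = accuracy
            not_improving = 0   # assumed: a new best resets the counter
        else:
            not_improving += 1
        if not_improving >= max_not_improving:
            return count        # stopped by MaxNumberOfTrainingJobsNotImproving
        if max_jobs is not None and count >= max_jobs:
            return count        # stopped by MaxNumberOfTrainingJobs
    return len(accuracies)


# The best job reaches 90%, then 15 jobs in a row fail to beat it.
history = [0.85, 0.90] + [0.89] * 15
stopped_after = jobs_run_before_stop(history, max_not_improving=10)
```

Here tuning stops after 12 jobs: two that set a new best, followed by 10 that do not improve on 90%.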

24
Q

SageMaker ModelBiasMonitor class

A

Creates a bias baseline and deploys a monitoring mechanism that evaluates whether the model bias deviates from the bias baseline.

25
Q

SageMaker DefaultModelMonitor class

A

Generates statistics and constraints around the data and deploys a monitoring mechanism that evaluates whether data drift has occurred.
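
The baseline-then-compare idea can be illustrated with a deliberately simplified drift check. This is a stand-in for the concept only. The statistics and constraints that Model Monitor actually generates are richer, and the function names and the one-standard-deviation threshold here are assumptions.

```python
from statistics import mean, stdev


def build_baseline(values):
    """Summarize a training-time feature as simple baseline statistics."""
    return {"mean": mean(values), "std": stdev(values)}


def has_drifted(baseline, new_values, max_std_shift=1.0):
    """Flag drift when the new mean moves too far from the baseline mean."""
    shift = abs(mean(new_values) - baseline["mean"])
    return shift > max_std_shift * baseline["std"]


baseline = build_baseline([10, 12, 11, 13, 9])   # made-up training values
stable = has_drifted(baseline, [11, 12, 10])     # similar distribution
shifted = has_drifted(baseline, [20, 21, 19])    # clearly shifted values
```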
26
Q

SageMaker ModelExplainabilityMonitor class

A

Generates a feature attribution (SHAP) baseline and deploys a monitoring mechanism that evaluates whether feature attribution drift has occurred.
27
Q

SageMaker ModelQualityMonitor class

A

Generates a model quality baseline and deploys a monitoring mechanism that evaluates whether model quality drift has occurred.
28
Q

Random oversampling

A

Randomly duplicates samples in the minority category. For example, if you're trying to detect fraud, you might have cases of fraud in only 10% of your data. To get an equal proportion of fraudulent and non-fraudulent cases, this operator randomly duplicates the fraud cases 8 times over, which raises them from 10 to 90 per 100 original rows and matches the 90 non-fraudulent cases.
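
The fraud example can be reproduced with a minimal sketch in plain Python. It assumes a binary label and balances the classes exactly. The function name and the data are made up, and this is not the code that any SageMaker operator runs.

```python
import random


def random_oversample(samples, labels, minority_label, seed=0):
    """Duplicate random minority samples until both classes are equal in size."""
    rng = random.Random(seed)
    minority = [s for s, y in zip(samples, labels) if y == minority_label]
    if not minority:
        raise ValueError("no samples carry the minority label")
    majority_count = len(labels) - len(minority)
    n_extra = max(0, majority_count - len(minority))
    # Sample with replacement from the minority class.
    extra = [rng.choice(minority) for _ in range(n_extra)]
    return list(samples) + extra, list(labels) + [minority_label] * n_extra


# 10% fraud, 90% legitimate, as in the card's example.
samples = list(range(100))
labels = ["fraud"] * 10 + ["legit"] * 90
new_samples, new_labels = random_oversample(samples, labels, "fraud")
```

After the call there are 90 fraud rows and 90 legitimate rows, so 80 duplicates were added to the original 10 fraud cases.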
29
Q

Random undersampling

A

The counterpart of random oversampling. Randomly removes samples from the overrepresented category until you get the proportion of samples that you desire.
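
A minimal sketch of the same idea in plain Python, assuming a binary label and a target of equal class sizes. The function name and the data are illustrative only.

```python
import random


def random_undersample(samples, labels, majority_label, seed=0):
    """Randomly drop majority samples until both classes are equal in size."""
    rng = random.Random(seed)
    majority_idx = [i for i, y in enumerate(labels) if y == majority_label]
    minority_count = len(labels) - len(majority_idx)
    n_keep = min(minority_count, len(majority_idx))
    keep = set(rng.sample(majority_idx, n_keep))
    # Keep every minority row plus the randomly chosen majority rows.
    kept = [i for i, y in enumerate(labels) if y != majority_label or i in keep]
    return [samples[i] for i in kept], [labels[i] for i in kept]


rows = list(range(100))
row_labels = ["fraud"] * 10 + ["legit"] * 90
small_rows, small_labels = random_undersample(rows, row_labels, "legit")
```

The trade-off compared with oversampling is that 80 of the 90 legitimate rows are discarded, so the balanced dataset is much smaller.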
30
Q

Synthetic Minority Oversampling Technique (SMOTE)

A

Uses samples from the underrepresented category to interpolate new synthetic minority samples.
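
The interpolation step can be sketched in plain Python as follows. This is a bare-bones version of the technique: it picks a minority sample, picks one of its k nearest minority neighbours, and places a new point on the line between them. The function name, the value of k, and the sample points are assumptions for illustration.

```python
import random


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority samples."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        # k nearest minority neighbours of the base point (squared Euclidean).
        neighbours = sorted(
            others,
            key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic


minority_points = [(1.0, 1.0), (2.0, 1.5), (1.5, 2.0), (2.5, 2.5)]
new_points = smote(minority_points, n_new=5)
```

Unlike random oversampling, the new points are not exact copies, but each one lies between two existing minority samples.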
31
Q

Amazon SageMaker Neo

A

Automatically optimizes machine learning models for inference on cloud instances and edge devices so that they run faster with no loss in accuracy. Neo optimizes the trained model and compiles it into an executable.
32
Q

SageMaker Debugger

A

- Provides tools to debug training jobs and resolve training problems to improve the performance of your model.
- Helps to solve problems such as overfitting, saturated activation functions, and vanishing gradients.
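
To give a feel for what a vanishing-gradient check looks at, the sketch below flags layers whose mean absolute gradient falls under a threshold. It is an illustration of the idea only and does not use the Debugger API. The function name, layer names, and threshold are hypothetical.

```python
def find_vanishing_gradients(gradients_by_layer, threshold=1e-7):
    """Return the layers whose mean absolute gradient is below the threshold."""
    flagged = []
    for layer, grads in gradients_by_layer.items():
        mean_abs = sum(abs(g) for g in grads) / len(grads)
        if mean_abs < threshold:
            flagged.append(layer)
    return flagged


# Made-up gradient values captured at one training step.
step_gradients = {
    "dense_1": [1e-9, -2e-9, 5e-10],
    "dense_2": [0.01, -0.02, 0.015],
}
suspect_layers = find_vanishing_gradients(step_gradients)
```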