MLOps w/ AWS Flashcards

(30 cards)

1
Q

What are SageMaker deployment guardrails?

A

A feature that lets you roll out updates to your SageMaker endpoints with blue/green deployments, shifting traffic to the new version gradually and rolling back automatically if problems appear before it takes full traffic

2
Q

What type of deployment does SageMaker deployment guardrails use?

A

Blue/Green

3
Q

What traffic shifting options does blue/green deployment support?

A
  • All-at-once
  • Canary
  • Linear
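As a sketch of where these options plug in: the boto3 `update_endpoint` call accepts a `DeploymentConfig` whose `TrafficRoutingConfiguration` selects the mode. The endpoint and alarm names below are hypothetical.

```python
# DeploymentConfig for a blue/green update with canary traffic shifting.
deployment_config = {
    "BlueGreenUpdatePolicy": {
        "TrafficRoutingConfiguration": {
            "Type": "CANARY",  # or "ALL_AT_ONCE" / "LINEAR"
            # Send 10% of traffic to the new (green) fleet first.
            "CanarySize": {"Type": "CAPACITY_PERCENT", "Value": 10},
            "WaitIntervalInSeconds": 300,  # bake time before shifting the rest
        },
        "TerminationWaitInSeconds": 600,   # keep the old (blue) fleet around briefly
    },
    "AutoRollbackConfiguration": {
        # Hypothetical CloudWatch alarm that triggers an automatic rollback.
        "Alarms": [{"AlarmName": "my-endpoint-5xx-alarm"}],
    },
}
# With credentials: boto3.client("sagemaker").update_endpoint(
#     EndpointName="my-endpoint", EndpointConfigName="new-config",
#     DeploymentConfig=deployment_config)
```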
4
Q

What is shadow testing?

A

You deploy a shadow variant that receives a copy of a percentage of the live traffic; its responses are logged but not returned to clients. You monitor its performance and decide when to promote it to production
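As a sketch, a shadow variant is declared alongside the production variant in `create_endpoint_config` via `ShadowProductionVariants`; the model and config names here are hypothetical.

```python
# Endpoint config with a shadow variant mirroring part of the live traffic.
shadow_config = {
    "EndpointConfigName": "shadow-test-config",  # hypothetical name
    "ProductionVariants": [{
        "VariantName": "production",
        "ModelName": "model-v1",                 # hypothetical current model
        "InstanceType": "ml.m5.large",
        "InitialInstanceCount": 1,
        "InitialVariantWeight": 1.0,
    }],
    "ShadowProductionVariants": [{
        "VariantName": "shadow",
        "ModelName": "model-v2",                 # hypothetical candidate model
        "InstanceType": "ml.m5.large",
        "InitialInstanceCount": 1,
        # Weight relative to the production variant sets the fraction of
        # traffic copied to the shadow (0.5 -> roughly half here).
        "InitialVariantWeight": 0.5,
    }],
}
# With credentials: boto3.client("sagemaker").create_endpoint_config(**shadow_config)
```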

5
Q

What are SageMaker production variants? When would you use them?

A

They let you run multiple models behind one endpoint and split live traffic between them. Use them when offline testing with old or synthetic data is not representative, e.g. recommendation algorithms
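A minimal sketch of an A/B split between two variants via `create_endpoint_config`; the config and model names are hypothetical.

```python
# Two production variants sharing one endpoint, weighted 90/10.
ab_config = {
    "EndpointConfigName": "ab-test-config",  # hypothetical name
    "ProductionVariants": [
        {
            "VariantName": "model-a",
            "ModelName": "recommender-v1",   # hypothetical incumbent model
            "InstanceType": "ml.m5.large",
            "InitialInstanceCount": 1,
            "InitialVariantWeight": 0.9,     # ~90% of live traffic
        },
        {
            "VariantName": "model-b",
            "ModelName": "recommender-v2",   # hypothetical challenger model
            "InstanceType": "ml.m5.large",
            "InitialInstanceCount": 1,
            "InitialVariantWeight": 0.1,     # ~10% of live traffic
        },
    ],
}
# Traffic share = variant weight / sum of all weights.
```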

6
Q

How would you get your SageMaker model ready to deploy at the edge?

A

Use SageMaker Neo: it compiles your model for the specified edge device, and provides a runtime that executes the compiled model on the device
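A compilation is started with the boto3 `create_compilation_job` call; a minimal request sketch might look like this (bucket, role and job names are placeholders, and the input name/shape depend on your model):

```python
# Sketch of a Neo compilation job request targeting an edge device.
compilation_job = dict(
    CompilationJobName="my-neo-job",              # hypothetical job name
    RoleArn="<execution-role-arn>",               # placeholder
    InputConfig={
        "S3Uri": "s3://my-bucket/model.tar.gz",   # trained model artifact
        "DataInputConfig": '{"input0": [1, 3, 224, 224]}',  # model input shape
        "Framework": "PYTORCH",
    },
    OutputConfig={
        "S3OutputLocation": "s3://my-bucket/compiled/",
        "TargetDevice": "jetson_nano",            # the edge device to compile for
    },
    StoppingCondition={"MaxRuntimeInSeconds": 900},
)
# With credentials: boto3.client("sagemaker").create_compilation_job(**compilation_job)
```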

7
Q

How does SageMaker Neo synergise with AWS IoT Greengrass?

A

AWS IoT Greengrass is the service that allows you to actually push your compiled SageMaker Neo code to the edge devices

8
Q

What tends to be more expensive in raw terms for training - CPU or GPU?

A

GPU

9
Q

What tends to be cheaper - inference or training?

A

Inference

10
Q

What are 2 downsides of training on spot instances to save money?

A

1. You need to checkpoint to S3 to save training progress, since a spot instance can be interrupted at almost any time
2. Training can take longer, since you may have to wait for spot capacity to become available
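Both downsides show up as arguments on the SageMaker Python SDK `Estimator`; a sketch, with bucket, image and role values as placeholders:

```python
# Sketch of Estimator arguments for managed spot training.
estimator_kwargs = dict(
    image_uri="<training-image-uri>",   # placeholder
    role="<execution-role-arn>",        # placeholder
    instance_count=1,
    instance_type="ml.p3.2xlarge",
    use_spot_instances=True,
    max_run=3600,    # cap on actual training seconds
    max_wait=7200,   # must be >= max_run; includes time waiting for spot capacity
    # Checkpoints let training resume after a spot interruption.
    checkpoint_s3_uri="s3://my-bucket/checkpoints/",  # hypothetical bucket
)
# With the sagemaker package installed:
# from sagemaker.estimator import Estimator
# Estimator(**estimator_kwargs).fit("s3://my-bucket/training-data/")
```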

11
Q

How do you set up auto scaling w/ SM?

A

Much the same as with EC2: you register the endpoint variant as a scalable target, set a target metric (e.g. invocations per instance) and cooldown periods, and Application Auto Scaling adjusts the number of instances to match the specification at any given time
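A sketch of the two Application Auto Scaling requests involved; endpoint and variant names are hypothetical.

```python
# 1. Register the endpoint variant's instance count as a scalable target.
scalable_target = dict(
    ServiceNamespace="sagemaker",
    ResourceId="endpoint/my-endpoint/variant/AllTraffic",  # hypothetical names
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    MinCapacity=1,
    MaxCapacity=4,
)

# 2. Attach a target-tracking policy on invocations per instance.
scaling_policy = dict(
    PolicyName="invocations-target-tracking",
    ServiceNamespace="sagemaker",
    ResourceId=scalable_target["ResourceId"],
    ScalableDimension=scalable_target["ScalableDimension"],
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 100.0,  # desired invocations per instance per minute
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
        },
        "ScaleInCooldown": 300,  # seconds to wait between scale-in steps
        "ScaleOutCooldown": 60,  # seconds to wait between scale-out steps
    },
)
# With credentials:
# aas = boto3.client("application-autoscaling")
# aas.register_scalable_target(**scalable_target)
# aas.put_scaling_policy(**scaling_policy)
```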

12
Q

Does auto scaling for SageMaker try to balance across AZs automatically for the endpoints?

A

Yes, but the endpoint needs more than one instance for them to be spread across AZs

13
Q

When would you use the serverless deployment type in SageMaker?

A

When there is uneven/unpredictable traffic
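A serverless endpoint is configured by putting a `ServerlessConfig` (instead of instance type and count) on the variant in `create_endpoint_config`; a sketch with hypothetical names:

```python
# Endpoint config for a serverless deployment.
serverless_config = {
    "EndpointConfigName": "serverless-config",  # hypothetical name
    "ProductionVariants": [{
        "VariantName": "AllTraffic",
        "ModelName": "my-model",                # hypothetical model
        "ServerlessConfig": {
            "MemorySizeInMB": 2048,   # 1024-6144, in 1 GB increments
            "MaxConcurrency": 20,     # concurrent invocations before throttling
        },
    }],
}
# With credentials: boto3.client("sagemaker").create_endpoint_config(**serverless_config)
```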

14
Q

When would you use the real-time Sagemaker deployment type?

A

For interactive workloads that need low latency

15
Q

When would you use SageMaker JumpStart?

A

When you can solve your problem with a pre-made model and/or don’t have ML expertise/want the easiest option

16
Q

What is SageMaker Inference Recommender?

A

A service that recommends the best instance type and configuration for your models through automated load testing

17
Q

What 2 recommendation types can SageMaker Inference Recommender give you?

A

  • Instance recommendation (takes ~45 minutes)
  • Endpoint recommendation (takes ~2 hours)

18
Q

What is SageMaker Inference Pipelines?

A

The ability to chain together 2-15 containers, each with its own model or processing step. A request to the endpoint flows through the containers in order, e.g. pre-processing, inference, then post-processing.
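At the API level an inference pipeline is just a model created with a list of `Containers` instead of a single container; a `create_model` request sketch, with image URIs and names as placeholders:

```python
# Sketch of an inference pipeline: containers run in order per request.
pipeline_model = dict(
    ModelName="preprocess-then-predict",        # hypothetical name
    ExecutionRoleArn="<execution-role-arn>",    # placeholder
    Containers=[  # 2-15 containers, executed in sequence
        {
            "Image": "<preprocessing-image-uri>",            # placeholder
            "ModelDataUrl": "s3://my-bucket/preproc.tar.gz", # hypothetical
        },
        {
            "Image": "<inference-image-uri>",                # placeholder
            "ModelDataUrl": "s3://my-bucket/model.tar.gz",   # hypothetical
        },
    ],
)
# With credentials: boto3.client("sagemaker").create_model(**pipeline_model)
```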

19
Q

What is SageMaker Model Monitor?

A

A service that alerts you to quality deviations in your deployed models, helping you counteract drift and bias that can develop in your model over time

20
Q

Do you need a monitoring schedule in order to use SM Model Monitor effectively?

A

Yes. Model Monitor runs its analysis jobs on a monitoring schedule that you define (e.g. hourly), so without one no monitoring jobs run
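Model Monitor analyses run on a schedule you define; a `create_monitoring_schedule` request sketch, with schedule and job definition names as hypothetical placeholders:

```python
# Sketch of an hourly Model Monitor schedule.
schedule = dict(
    MonitoringScheduleName="hourly-data-quality",   # hypothetical name
    MonitoringScheduleConfig={
        # Hourly cron expression (minute 0 of every hour).
        "ScheduleConfig": {"ScheduleExpression": "cron(0 * ? * * *)"},
        "MonitoringJobDefinitionName": "my-data-quality-job",  # hypothetical
        "MonitoringType": "DataQuality",
    },
)
# With credentials: boto3.client("sagemaker").create_monitoring_schedule(**schedule)
```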

21
Q

What data can Model Monitor capture with regards to your endpoint?

A

The inputs and corresponding inference outputs. The inference data can be encrypted
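Capture is enabled via a `DataCaptureConfig` on the endpoint config; a sketch, with the S3 destination and KMS key as placeholders:

```python
# Sketch of a DataCaptureConfig for create_endpoint_config.
data_capture_config = {
    "EnableCapture": True,
    "InitialSamplingPercentage": 100,  # capture every request
    "DestinationS3Uri": "s3://my-bucket/captured/",  # hypothetical bucket
    # Capture both the request payloads and the inference responses.
    "CaptureOptions": [{"CaptureMode": "Input"}, {"CaptureMode": "Output"}],
    "KmsKeyId": "<kms-key-id>",  # optional: encrypt the captured inference data
}
```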

22
Q

What is SageMaker Projects?

A

SageMaker Studio’s native MLOps solution with CI/CD. Uses SageMaker Pipelines in the backend

23
Q

What can you use to integrate an existing Kubernetes pipeline with SageMaker?

A

SageMaker Operators for Kubernetes
Components for Kubeflow Pipelines

24
Q

Per server, as a general rule, can you have more VMs or containers?

A

Containers. They share the host OS kernel and are much lighter weight than VMs, so a single server can run many more of them

25
Q

What is AWS Batch?

A

A serverless service that lets you run batch jobs packaged as Docker images

26
Q

What kind of jobs can you run with AWS Batch?

A

Anything that can be packaged as a Docker image, not necessarily ETL

27
Q

What is GitHub Flow?

A

A development approach with two kinds of branches: the main branch and short-lived feature branches

28
Q

What is GitHub Flow particularly useful for?

A

Environments where you need to be able to release quickly, e.g. even multiple times a day

29
Q

What is Amazon Managed Workflows for Apache Airflow used for?

A

Writing Python code to develop, schedule and monitor your batch workflows

30
Q

Does Amazon Managed Workflows for Apache Airflow have to run within a VPC?

A

Yes