MLOps w/ AWS Flashcards
(30 cards)
What are SageMaker deployment guardrails?
A feature that rolls out new versions of your SageMaker endpoints using blue/green deployments, shifting traffic gradually and rolling back automatically on alarms so you can verify everything is OK before the new version takes full traffic
What type of deployment does SageMaker's deployment guardrails use?
Blue/Green
What are the traffic-shifting modes for blue/green deployments?
- All-at-once
- Canary
- Linear
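The canary mode above can be sketched as the deployment config passed to `UpdateEndpoint`; the endpoint and alarm names here are placeholders:

```python
# Minimal sketch of a canary blue/green deployment config, as passed to
# the SageMaker UpdateEndpoint API. Names are placeholders.
deployment_config = {
    "BlueGreenUpdatePolicy": {
        "TrafficRoutingConfiguration": {
            "Type": "CANARY",                 # or "ALL_AT_ONCE" / "LINEAR"
            "CanarySize": {"Type": "CAPACITY_PERCENT", "Value": 10},
            "WaitIntervalInSeconds": 300,     # bake time before full shift
        },
        "TerminationWaitInSeconds": 600,      # keep old (blue) fleet briefly
    },
    "AutoRollbackConfiguration": {
        "Alarms": [{"AlarmName": "my-endpoint-5xx-alarm"}]  # placeholder
    },
}
# An actual call would look like:
# boto3.client("sagemaker").update_endpoint(
#     EndpointName="my-endpoint",
#     EndpointConfigName="my-new-config",
#     DeploymentConfig=deployment_config,
# )
```

If any listed CloudWatch alarm fires during the canary bake time, SageMaker rolls traffic back to the old (blue) fleet.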
What is shadow testing?
Where you deploy a shadow variant that receives a copy of a percentage of production traffic; its responses are logged but not returned to callers. You monitor its performance and decide when to promote it to production
What are SageMaker production variants? When would you use it?
They let you run multiple models behind one endpoint and split live traffic between them. Use them when offline testing with old or synthetic data is not representative, e.g. recommendation algorithms
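A traffic split between two variants can be sketched as the `ProductionVariants` list of an endpoint config; model and variant names are placeholders:

```python
# Sketch: two production variants behind one endpoint, splitting live
# traffic 80/20 for an A/B test. Model/variant names are placeholders.
production_variants = [
    {
        "VariantName": "model-a",
        "ModelName": "recs-model-v1",        # placeholder
        "InstanceType": "ml.m5.large",
        "InitialInstanceCount": 1,
        "InitialVariantWeight": 0.8,         # ~80% of traffic
    },
    {
        "VariantName": "model-b",
        "ModelName": "recs-model-v2",        # placeholder
        "InstanceType": "ml.m5.large",
        "InitialInstanceCount": 1,
        "InitialVariantWeight": 0.2,         # ~20% of traffic
    },
]
# boto3.client("sagemaker").create_endpoint_config(
#     EndpointConfigName="recs-ab-test",
#     ProductionVariants=production_variants)
```

Weights are relative, so each variant receives its weight divided by the sum of all weights.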
How would you get your SageMaker model ready to deploy at the edge?
Use SageMaker Neo - it compiles your trained model for your specified edge device and provides a lightweight runtime that runs the compiled model
How does SageMaker Neo synergise with AWS IoT Greengrass?
AWS IoT Greengrass is the service that actually deploys your Neo-compiled model to the edge devices
What tends to be more expensive in raw terms for training - CPU or GPU?
GPU
What tends to be cheaper - inference or training?
Inference
What are 2 downsides of training on spot instances to save money?
You need to checkpoint to S3 to save training progress, since spot instances can be interrupted at almost any time. Training can also take longer because you may have to wait for spot capacity to become available
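The checkpointing setup above maps to a few Estimator parameters in the SageMaker Python SDK; a minimal sketch, with a placeholder S3 bucket:

```python
# Sketch of Estimator kwargs for managed spot training with the SageMaker
# Python SDK. The S3 path is a placeholder.
spot_kwargs = dict(
    use_spot_instances=True,
    max_run=3600,        # max training seconds
    max_wait=7200,       # must be >= max_run; includes time waiting for spot
    checkpoint_s3_uri="s3://my-bucket/checkpoints/",  # placeholder bucket
)
# est = sagemaker.estimator.Estimator(..., **spot_kwargs)
# est.fit(...)
```

Your training script must also write checkpoints to the local checkpoint directory (by default `/opt/ml/checkpoints`) and resume from them on restart.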
How do you set up auto scaling w/ SM?
Much like EC2: you register the endpoint variant as a scalable target, choose a target metric (typically invocations per instance), set cooldown periods, and the instance count scales to track the target at any given time
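The target-tracking setup can be sketched with the Application Auto Scaling API; endpoint and variant names here are placeholders:

```python
# Sketch of target-tracking auto scaling for an endpoint variant via the
# Application Auto Scaling API. Endpoint/variant names are placeholders.
resource_id = "endpoint/my-endpoint/variant/AllTraffic"
scaling_policy = {
    "TargetValue": 70.0,  # target invocations per instance per minute
    "PredefinedMetricSpecification": {
        "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
    },
    "ScaleInCooldown": 300,   # seconds to wait between scale-in steps
    "ScaleOutCooldown": 60,   # seconds to wait between scale-out steps
}
# client = boto3.client("application-autoscaling")
# client.register_scalable_target(
#     ServiceNamespace="sagemaker", ResourceId=resource_id,
#     ScalableDimension="sagemaker:variant:DesiredInstanceCount",
#     MinCapacity=1, MaxCapacity=4)
# client.put_scaling_policy(
#     PolicyName="invocations-target", ServiceNamespace="sagemaker",
#     ResourceId=resource_id,
#     ScalableDimension="sagemaker:variant:DesiredInstanceCount",
#     PolicyType="TargetTrackingScaling",
#     TargetTrackingScalingPolicyConfiguration=scaling_policy)
```

The longer scale-in cooldown is a common choice: scale out quickly to absorb traffic, scale in conservatively to avoid thrashing.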
Does auto scaling for SageMaker try to balance across AZs automatically for the endpoints?
Yes, but you need more than 1 instance in each endpoint
When would you use the serverless deployment type in SageMaker?
When there is uneven/unpredictable traffic
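A serverless variant replaces the instance type/count with a `ServerlessConfig`; a minimal sketch with a placeholder model name:

```python
# Sketch: a serverless production variant for CreateEndpointConfig.
# The model name is a placeholder; no instance type or count is needed.
serverless_variant = {
    "VariantName": "serverless",
    "ModelName": "my-model",                 # placeholder
    "ServerlessConfig": {
        "MemorySizeInMB": 2048,              # 1024-6144, in 1 GB steps
        "MaxConcurrency": 5,                 # concurrent-invocation cap
    },
}
```

You pay per invocation and compute time rather than for always-on instances, at the cost of possible cold starts.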
When would you use the real-time SageMaker deployment type?
For interactive workloads that need low latency
When would you use SageMaker Jumpstart?
When you can solve your problem with a pre-made model and/or don’t have ML expertise/want the easiest option
What is SageMaker Inference Recommender?
A service that recommends the best instance type and configuration for your models through automated load testing
What 2 recommendation types can SageMaker Inference Recommender give you?
- Instance recommendations (default job, takes ~45 minutes)
- Endpoint recommendations (advanced job with a custom load test, takes ~2 hours)
What is SageMaker Inference Pipelines?
The ability to chain together 2-15 containers, each with its own model, behind a single endpoint. A request flows through the containers in sequence, e.g. pre-processing, then inference, then post-processing.
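The chained containers are expressed as the `Containers` list of a single model; a sketch with placeholder image URIs and S3 paths:

```python
# Sketch: an inference pipeline model whose containers run in sequence
# (preprocess -> predict -> postprocess). Image URIs and S3 paths are
# placeholders; SageMaker allows 2-15 containers per pipeline model.
containers = [
    {"Image": "1234.dkr.ecr.us-east-1.amazonaws.com/preprocess:latest",
     "ModelDataUrl": "s3://my-bucket/preprocess/model.tar.gz"},
    {"Image": "1234.dkr.ecr.us-east-1.amazonaws.com/xgboost:latest",
     "ModelDataUrl": "s3://my-bucket/xgb/model.tar.gz"},
    {"Image": "1234.dkr.ecr.us-east-1.amazonaws.com/postprocess:latest",
     "ModelDataUrl": "s3://my-bucket/postprocess/model.tar.gz"},
]
# boto3.client("sagemaker").create_model(
#     ModelName="inference-pipeline", Containers=containers,
#     ExecutionRoleArn="arn:aws:iam::1234:role/SageMakerRole")
```

Each container's output becomes the next container's input, and the whole pipeline is invoked as one endpoint.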
What is SageMaker Model Monitor?
A service that allows you to get alerts on quality deviations on your deployed models and helps you counteract drifts and biases that could occur in your model over time
Do you need a monitoring schedule in order to use SM Model Monitor effectively?
Yes
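A monitoring schedule can be sketched as the config for `CreateMonitoringSchedule`; the job definition name is a placeholder:

```python
# Sketch: an hourly data-quality monitoring schedule for Model Monitor.
# The job definition name is a placeholder; the schedule uses cron syntax.
monitoring_schedule = {
    "MonitoringScheduleConfig": {
        "ScheduleConfig": {
            "ScheduleExpression": "cron(0 * ? * * *)"  # run hourly
        },
        "MonitoringJobDefinitionName": "my-data-quality-job",  # placeholder
        "MonitoringType": "DataQuality",
    }
}
# boto3.client("sagemaker").create_monitoring_schedule(
#     MonitoringScheduleName="my-schedule", **monitoring_schedule)
```

Each scheduled run compares captured endpoint data against a baseline and emits violations you can alarm on.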
What data can Model Monitor capture with regards to your endpoint?
The request inputs and corresponding inference outputs. The captured data can be encrypted with KMS
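Data capture is enabled via a `DataCaptureConfig` on the endpoint config; a sketch with placeholder bucket and KMS key:

```python
# Sketch of a DataCaptureConfig for CreateEndpointConfig: capture request
# inputs and inference outputs to S3, optionally KMS-encrypted.
# Bucket and key are placeholders.
data_capture_config = {
    "EnableCapture": True,
    "InitialSamplingPercentage": 100,        # capture all requests
    "DestinationS3Uri": "s3://my-bucket/capture/",   # placeholder
    "CaptureOptions": [
        {"CaptureMode": "Input"},
        {"CaptureMode": "Output"},
    ],
    "KmsKeyId": "alias/my-capture-key",      # placeholder; enables encryption
}
```

The captured records land in S3 as JSON lines, which Model Monitor then reads on its schedule.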
What is SageMaker Projects?
SageMaker Studio’s native MLOps solution with CI/CD. Uses SageMaker Pipelines in the backend
What can you use to integrate an existing Kubernetes pipeline with SageMaker?
- SageMaker Operators for Kubernetes
- SageMaker Components for Kubeflow Pipelines
Per server, as a general rule, can you have more VMs or containers?
Containers, since they are much lighter weight than full VMs