aws-ml-speciality Flashcards

Question 1

Q

Main phases in an ML project

Answer

A

Data engineering
EDA (Feature engineering)
Modeling
MLOps

Question 2

Q

How do you tell whether a given RMSE value is good or bad?

Answer

A

Compare it with the baseline model’s RMSE. The value is only good if it is clearly lower than the baseline’s.

The baseline model always predicts the mean of the target as its output.
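
As a rough illustration of the comparison, the sketch below computes RMSE for a model and for a mean-predicting baseline. All numbers and the helper names (`rmse`, `baseline_rmse`) are made up for the example; the baseline mean is taken from the training targets, which is one common convention.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def baseline_rmse(y_train, y_test):
    """RMSE of a baseline that always predicts the training-set mean."""
    mean_target = sum(y_train) / len(y_train)
    return rmse(y_test, [mean_target] * len(y_test))


# Hypothetical data, chosen only to make the arithmetic easy to follow.
y_train = [10.0, 12.0, 14.0, 16.0]  # mean is 13.0
y_test = [11.0, 15.0]
model_preds = [11.5, 14.0]

model_score = rmse(y_test, model_preds)
base_score = baseline_rmse(y_train, y_test)

# Fraction by which the model reduces RMSE relative to the baseline.
improvement = 1 - model_score / base_score
print(f"model={model_score:.3f} baseline={base_score:.3f} improvement={improvement:.1%}")
```

A model whose RMSE is close to or above the baseline's has learned little beyond the average of the target.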

Question 3

Q

A/B testing of models

Answer

A

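Ans: Deploy the candidate models as production variants (ProductionVariants) behind a single SageMaker endpoint and split traffic between them by variant weight. This is a separate feature from multi-model endpoints, which the background below describes.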
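
A minimal sketch of how an A/B traffic split could be expressed with production variants. The variant names, model names, and instance type are hypothetical, and the helper functions are illustrative only; the dictionary keys follow the `ProductionVariants` parameter of the SageMaker `create_endpoint_config` API, where each variant receives a share of traffic equal to its weight divided by the sum of all weights.

```python
def build_production_variants(models, instance_type="ml.m5.large"):
    """Build a ProductionVariants list from (variant_name, model_name, weight) tuples.

    The instance type default is an assumption for this example.
    """
    return [
        {
            "VariantName": variant_name,
            "ModelName": model_name,
            "InstanceType": instance_type,
            "InitialInstanceCount": 1,
            "InitialVariantWeight": float(weight),
        }
        for variant_name, model_name, weight in models
    ]


def traffic_shares(variants):
    """Fraction of requests each variant receives: weight / sum of weights."""
    total = sum(v["InitialVariantWeight"] for v in variants)
    return {v["VariantName"]: v["InitialVariantWeight"] / total for v in variants}


# Hypothetical 90/10 split between an existing model and a challenger.
variants = build_production_variants(
    [("variant-a", "model-a", 9), ("variant-b", "model-b", 1)]
)
shares = traffic_shares(variants)
print(shares)

# With AWS credentials and existing SageMaker models, the list could be
# passed to boto3 roughly like this (names are placeholders):
#   import boto3
#   sm = boto3.client("sagemaker")
#   sm.create_endpoint_config(
#       EndpointConfigName="ab-test-config", ProductionVariants=variants
#   )
#   sm.create_endpoint(EndpointName="ab-test", EndpointConfigName="ab-test-config")
```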

Background: Multiple EC2 instances can be deployed behind a single endpoint, so that the endpoint is served by several instances and stays highly available. A load balancer routes requests to these instances.

In a multi-model endpoint, multiple models share one serving container, and the container runs on an EC2 instance.
When an inference request is made to the endpoint, the load balancer routes the request to one of the instances. (Each instance has an EBS volume and some memory.)
The selected instance downloads the model artefact from S3 onto the EBS volume and loads it into the container’s memory. If the model is already loaded in memory, invocation is faster because SageMaker doesn’t need to download and load it.
SageMaker continues to route requests for a model to the instance where the model is already loaded. However, if the model receives many invocation requests, and there are additional instances for the multi-model endpoint, SageMaker routes some requests to another instance to accommodate the traffic. If the model isn’t already loaded on the second instance, the model is downloaded to that instance’s storage volume and loaded into the container’s memory.
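
The load-on-first-use behaviour described above can be sketched as a toy simulation. This is not SageMaker code: the `Instance` class, `fake_fetch`, and the model name are hypothetical, and the sketch ignores routing decisions and eviction of models under memory pressure.

```python
class Instance:
    """Toy stand-in for one instance behind a multi-model endpoint."""

    def __init__(self, name):
        self.name = name
        self.loaded = {}  # model name -> model object held in memory

    def invoke(self, model_name, fetch):
        """Return (model, cold), where cold is True if the model had to be fetched."""
        cold = model_name not in self.loaded
        if cold:
            # First request for this model on this instance: download and load.
            self.loaded[model_name] = fetch(model_name)
        return self.loaded[model_name], cold


downloads = []  # records every simulated download from S3


def fake_fetch(model_name):
    """Pretend to download a model artefact; stands in for the S3 download."""
    downloads.append(model_name)
    return f"<weights of {model_name}>"


first = Instance("instance-1")
_, first_cold = first.invoke("model-a.tar.gz", fake_fetch)   # cold: downloads
_, second_cold = first.invoke("model-a.tar.gz", fake_fetch)  # warm: reuses memory

# Under heavy traffic a second instance may receive requests for the same
# model, and it has to load its own copy the first time.
second = Instance("instance-2")
_, other_cold = second.invoke("model-a.tar.gz", fake_fetch)

print(first_cold, second_cold, other_cold, len(downloads))
```

For a real multi-model endpoint, the model to invoke is selected per request with the `TargetModel` parameter of `invoke_endpoint`.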
