10. Scaling Models in Production Flashcards

1
Q

How do you deploy a model trained using TensorFlow?

A

Export it as a SavedModel, which contains a complete TensorFlow program, including weights and the computation graph. It does not require the original training code to run. You can then deploy the model with TensorFlow Lite, TensorFlow.js, TensorFlow Serving, or TensorFlow Hub.

2
Q

What two types of API endpoints does TensorFlow Serving support?

A

REST and gRPC
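The REST path can be sketched with the standard library alone. This is a minimal sketch of the request body TensorFlow Serving's REST predict API expects; the model name `my_model`, the port, and the feature values are placeholder assumptions.

```python
import json

def build_predict_request(instances):
    """Build the JSON body for TensorFlow Serving's REST predict API.

    TF Serving expects a top-level "instances" list, one entry per
    example, POSTed to /v1/models/<name>:predict.
    """
    return json.dumps({"instances": instances})

# Hypothetical request for a model taking two numeric features.
body = build_predict_request([[1.0, 2.0], [3.0, 4.0]])
url = "http://localhost:8501/v1/models/my_model:predict"  # 8501 is the default REST port
```

In production you would POST `body` to `url` with an HTTP client; the gRPC endpoint (default port 8500) uses protocol buffers instead of JSON.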

3
Q

What does TensorFlow Serving handle?

A

It handles model serving and version management.

4
Q

What are the steps to set up TF Serving?

A

Install TensorFlow Serving with Docker
Train and save a model with TensorFlow
Serve the model using TensorFlow Serving

5
Q

How do you manage TensorFlow Serving?

A

You can use a managed, prebuilt TensorFlow Serving container on Vertex AI.

6
Q

What is SignatureDef in TensorFlow Serving?

A

It defines how a saved model expects its inputs and provides its outputs.

7
Q

How does TF Serving respond to a new version of a model?

A

It automatically unloads the old model and loads the newer version.
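TF Serving watches the model's base directory and serves the highest-numbered version subdirectory. The policy can be sketched with the standard library; the directory layout here is a throwaway stand-in for a real model repository.

```python
from pathlib import Path
import tempfile

def latest_version(base_dir):
    """Return the highest numeric version subdirectory, mimicking
    TF Serving's default policy of serving the newest version."""
    versions = [int(p.name) for p in Path(base_dir).iterdir()
                if p.is_dir() and p.name.isdigit()]
    return max(versions) if versions else None

# Demo layout: <base>/1 and <base>/2, as in models/my_model/1, /2.
with tempfile.TemporaryDirectory() as base:
    for v in ("1", "2"):
        (Path(base) / v).mkdir()
    newest = latest_version(base)  # version 2; version 1 would be unloaded
```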

8
Q

What are the two types of input features that are fetched in real time to invoke the model for prediction?

A

Static and dynamic reference features

8
Q

What are the two scenarios for setting up a real-time prediction endpoint?

A

Models trained in Vertex AI (AutoML or custom training), where deployment is integrated.
Models trained elsewhere, which you import into GCP before deploying.

9
Q

How can you improve prediction latency?

A

Pre-compute predictions in an offline batch scoring job and store them in a low-latency read data store like Memorystore or Datastore for online serving.
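The lookup-with-fallback pattern can be sketched as follows; a plain dict stands in for a low-latency store like Memorystore, and the key and score values are hypothetical.

```python
# A dict stands in for a low-latency store such as Memorystore; in
# production this lookup would be a Redis or Datastore read instead.
precomputed = {"customer_42": 0.87}

def get_prediction(key, fallback_model=None):
    """Serve a precomputed score if present; otherwise fall back to
    invoking the model online (here a stub callable)."""
    if key in precomputed:
        return precomputed[key]
    return fallback_model(key) if fallback_model else None

score = get_prediction("customer_42")               # cache hit, no model call
miss = get_prediction("customer_7", lambda k: 0.5)  # online fallback
```

Serving from the store avoids invoking the model on the hot path, trading freshness for latency.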

9
Q

Compare Static and Dynamic Reference Features

A

Static:
Values don't change in real time
Available in a data warehouse
Example: estimating the price of a house based on its location
Stored in a NoSQL database

Dynamic:
Values are computed on the fly, e.g., aggregated values over a particular time window
Example: predicting an engine failure in the next hour
Computed with a Dataflow streaming pipeline and stored in a low-latency database, e.g., Bigtable

10
Q

What are the two categories of lookup keys for prediction requests?

A

Specific entity, e.g., a customer ID
Specific combination of input features, e.g., a hash of the input feature values
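The feature-combination key can be sketched with a deterministic hash; the feature names here are illustrative, not from any particular model.

```python
import hashlib
import json

def feature_key(features):
    """Derive a deterministic lookup key from a specific combination of
    input features by hashing their canonical JSON form."""
    canonical = json.dumps(features, sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()

# Entity-based key: just the customer ID.
entity_key = "customer_42"
# Feature-combination key: the same features always map to the same key,
# regardless of insertion order.
k1 = feature_key({"country": "DE", "device": "mobile"})
k2 = feature_key({"device": "mobile", "country": "DE"})  # equals k1
```

Sorting the keys before hashing is what makes the key order-independent, so any request with identical feature values hits the same precomputed row.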

11
Q

What filenames do prebuilt containers expect for imported model artifacts?

A

TensorFlow SavedModel: saved_model.pb
scikit-learn: model.joblib or model.pkl
XGBoost: model.bst, model.joblib or model.pkl

12
Q

How do you import a custom container?

A

Create a container image
Use Cloud Build to push the image to Artifact Registry

13
Q

How do you set up autoscaling for your endpoint in Vertex AI?

A

Specify the autoscaling settings for your endpoint's container. Vertex AI automatically provisions the container resources and scales the endpoint.

14
Q

What do you need to specify when deploying a model in Vertex AI?

A

The deploy method creates an endpoint and deploys your model.
You need to provide:
model name, traffic split, machine type, accelerator type, accelerator count, starting replica count, and max replica count

Hints: Never Stop Making Amazing Apple Raspberry Rolls.

15
Q

Can you deploy models from Model Registry?

A

Yes, Model Registry is a centralized place to track versions of both AutoML and custom models.

16
Q

What is the input format if you use prebuilt or custom containers to serve predictions?

A

JSON

17
Q

What is A/B testing used for?

A

A/B testing compares the performance of two versions of a model to see which one performs better with users.

18
Q

What is the strategy to replace one model with another one?

A

Add a new model to the same endpoint and gradually increase the traffic split for the new model until 100%.
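The gradual ramp can be sketched with deterministic hash-based routing; the request IDs and percentages are illustrative, and on Vertex AI the split is configured on the endpoint rather than in application code.

```python
import zlib

def route(request_id, new_model_pct):
    """Deterministically route a request to the new or old model based
    on a traffic split, as when ramping a new model on one endpoint."""
    bucket = zlib.crc32(request_id.encode()) % 100
    return "new" if bucket < new_model_pct else "old"

# Ramp sketch: raise the split from 10% toward 100% of traffic.
at_10 = sum(route(f"req-{i}", 10) == "new" for i in range(1000))
at_100 = sum(route(f"req-{i}", 100) == "new" for i in range(1000))
```

Hashing the request ID keeps routing sticky: the same request ID always lands on the same model at a given split, which simplifies debugging during the rollout.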

19
Q

What is Vertex AI model evaluation used for?

A

It runs model evaluation jobs regardless of which Vertex AI service was used to train the model.
It also stores and visualizes evaluation results across multiple models in the Model Registry.

20
Q

What do you get from online explanation requests?

A

You get both predictions and feature attributions.

20
Q

Why do you need to undeploy models?

A

Deployed models incur charges even when not in use, so undeploy models you no longer need.

21
Q

What are the input data options for batch prediction in Vertex AI?

A

JSON, TFRecord, CSV files, File list, BigQuery

Hints: Lions Jump Carefully Through Bushes.
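The JSON option is JSON Lines: one instance per line. A minimal sketch, with hypothetical feature names:

```python
import json

# Vertex AI batch prediction reads JSON Lines input: one JSON object
# per line, each representing a single instance.
instances = [
    {"sq_ft": 1200, "bedrooms": 3},
    {"sq_ft": 800, "bedrooms": 2},
]
jsonl = "\n".join(json.dumps(inst) for inst in instances)
# Each line parses back to exactly one instance.
parsed = [json.loads(line) for line in jsonl.splitlines()]
```

In practice the JSONL file would be uploaded to Cloud Storage and referenced when creating the batch prediction job.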

21
Q

What output options do you have for batch prediction?

A

BigQuery table or Cloud Storage

22
Q

What are the four primary functions for MLflow?

A

MLflow Tracking: experiment tracking; record and compare parameters and results
MLflow Projects: package ML code in a reusable, reproducible form
MLflow Models: manage and deploy models to different platforms
MLflow Model Registry: centralized model store (versioning, stage transitions, annotations)

Hints: Tomatoes Provide Marvelous Redness.

23
Q

Which Google Cloud services can help detect skew and monitor model performance over time?

A

Vertex AI Model Monitoring and Feature Store

23
Q

How do you run MLflow on Google Cloud?

A

Create a PostgreSQL database for storing model metadata
Create a Cloud Storage bucket for storing artifacts
Create a Compute Engine instance and install MLflow on it

24
Q

What do you test for target performance?

A

Monitor training-serving skew and model quality with real-time data
Monitor the model's age and performance
Test that model weights and outputs are numerically stable

25
Q

How do you create a trigger and schedule a training or prediction job on Vertex AI?

A

Cloud Scheduler: set up a cron job to schedule a training or prediction job.
Vertex AI managed notebooks: execute and schedule a training or prediction job.
Cloud Build: rebuild a model container (Dockerfile) and kick off training; Cloud Run is a managed offering to deploy containers.
Cloud Pub/Sub with Cloud Functions, or Cloud Storage event-based triggers.

Hints: Snakes Never Bother Porcupines.

26
Q

What is Cloud Workflows?

A

Orchestrate multiple HTTP-based services into a workflow, i.e., chain microservices together.

27
Q

What is Vertex AI Pipelines?

A

Automates, monitors, and governs ML systems by orchestrating your ML workflow in a serverless manner and storing your artifacts using Vertex ML Metadata, which lets you analyze lineage.

28
Q

What is Cloud Composer?

A

Orchestrates data-driven workflows
Built on Apache Airflow
Fully managed
Supports on-premises and multiple cloud platforms
Workflows are defined as directed acyclic graphs (DAGs)