4. Choosing the Right ML Infrastructure Flashcards

Question 1

Q

ML models for video and images require a large number of computations for each loop of training. What mathematical calculations are involved?

Answer

A

Matrix multiplications, additions, subtractions, and differentiation

Question 2

Q

When you have an ML problem to solve, say image classification, you have three ways of approaching it. What are they?

Answer

A

Pretrained (fastest to develop and least expertise needed)
AutoML (in between)
Custom (Most flexible but expertise needed)

Question 3

Q

Pretrained models are already deployed and can be readily used via …

Question 4

Q

What is the biggest advantage of using pretrained models?

Answer

A

Ease of use and speed

Question 5

Q

How can developers use pre-trained models?

Answer

A

Using CLI, Python, Java, or Node.js SDK.

Question 6

Q

Are pre-trained models serverless?

Question 7

Q

What is the biggest disadvantage of using pretrained models?

Answer

A

Less customizable

Question 8

Q

What can Vertex AI AutoML do for you?

Answer

A

Build your own model using your own data.

Question 9

Q

AutoML chooses the best ML algorithm, and the only thing that it needs is the data. What do you need to do?

Answer

A

Format the data and work on quality control

Question 10

Q

Unlike with pretrained models, you have to provision cloud resources for training and deploying the model on instances. What do you have to decide on?

Answer

A

Number of hours of instance time
Devices the model needs to be deployed on (cloud, phone, IoT)

Question 11

Q

What options do you have if pre-trained models and AutoML do not fit your need?

Answer

A

Use custom models in Vertex AI

Question 12

Q

What pre-trained models does Google have?

Answer

A

Vision AI
Video AI
Natural Language AI
Translation AI
Speech‐to‐Text
Text‐to‐Speech

Question 13

Q

What solutions does Google have in addition to pretrained models?

Answer

A

Document AI
Contact Center AI

Question 14

Q

What can Vision AI do for you?

Answer

A

perform image classification, detect objects and faces, and read handwriting (through optical character recognition)

Question 15

Q

What steps does Vision AI perform for image classification?

Answer

A

Detect objects in the photo
Get a set of labels for your image (e.g., table, plant, chair)
Get the dominant colors for these images
Categorize (e.g., Adult, Spoof, Medical, Violence)

Question 16

Q

What can Video AI do?

Answer

A

Recognize objects, places, and actions in videos.

Question 17

Q

Give 3 use cases for Video AI.

Answer

A

Build a video recommendation system
Create an index of your video archives
Map advertisements to your content

Question 18

Q

What does Natural Language AI do?

Answer

A

Provides insights from unstructured text using pretrained machine learning models including entity extraction, sentiment analysis, syntax analysis, and general categorization.

Question 19

Q

What does entity extraction do?

Answer

A

Identifies entities such as the names of people, organizations, products, events, locations, and so on.

Question 20

Q

What does sentiment analysis do?

Answer

A

Provides you a positive, negative, or neutral score with magnitude for each sentence, for each entity, and the whole text.
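
As a rough illustration of how a score and a magnitude can be read together, here is a minimal sketch. It assumes the score lies between -1.0 and 1.0 and the magnitude is non-negative; the function name and both cutoff values are hypothetical and are not defined by the Natural Language API.

```python
def label_sentiment(score: float, magnitude: float,
                    score_cutoff: float = 0.25,
                    magnitude_cutoff: float = 1.0) -> str:
    """Turn a (score, magnitude) pair into a coarse label.

    score_cutoff and magnitude_cutoff are illustrative assumptions,
    not values taken from the API.
    """
    if score >= score_cutoff:
        return "positive"
    if score <= -score_cutoff:
        return "negative"
    # Score near zero: a high magnitude suggests mixed feelings,
    # a low magnitude suggests genuinely neutral text.
    return "mixed" if magnitude >= magnitude_cutoff else "neutral"
```

The same helper could be applied per sentence, per entity, or to the whole text, since each level comes with its own score and magnitude.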

Question 21

Q

What does syntax analysis do?

Answer

A

Identifies the part of speech, dependency between words, lemma, and morphology of the text.

Question 22

Q

Give 2 use cases for Natural Language AI

Answer

A

Measure the customer sentiment toward a particular product.
Use Healthcare Natural Language API to understand details specific to healthcare text like clinical notes or healthcare research documents.

Question 23

Q

Translation AI has 2 levels, Basic and Advanced. What are the main differences?

Answer

A

Advanced version can use a glossary (a dictionary of terms mapped from source language to target language) and also can translate entire documents (PDFs, DOCs, etc.).

Question 24

Q

What does the Media Translation API do?

Answer

A

Translates audio in source language into audio in target languages.

Question 25

Q

What is a use case for the Speech-to-Text service, which converts recorded or streaming audio into text?

Answer

A

Creating subtitles for video recordings and streaming video as well.

Question 26

Q

There is AutoML training available for many data types and use cases. We can broadly categorize them into four categories. What are the categories?

Answer

A

Structured data
Images/video
Natural language
Recommendations AI/Retail AI

Question 27

Q

What are the two methods of training models for AutoML for Tables?

Answer

A

BigQuery ML and Vertex AI Tables

Question 28

Q

How do you deploy ML models trained by Vertex AI Tables?

Answer

A

Deploy the model on an endpoint and serve predictions through REST API.
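
A minimal sketch of what such a REST call might look like, assuming the standard Vertex AI `:predict` URL pattern. The project ID, endpoint ID, and instance fields are hypothetical placeholders, and the request is only built here, not sent (a real call also needs an OAuth bearer token).

```python
import json

def build_predict_request(project: str, region: str, endpoint_id: str,
                          instances: list) -> tuple:
    """Return the (url, json_body) pair for an online prediction request."""
    url = (
        f"https://{region}-aiplatform.googleapis.com/v1/"
        f"projects/{project}/locations/{region}/"
        f"endpoints/{endpoint_id}:predict"
    )
    body = json.dumps({"instances": instances})
    return url, body

# Hypothetical values for illustration only.
url, body = build_predict_request(
    "my-project", "us-central1", "1234567890",
    [{"age": "42", "plan": "basic"}],
)
print(url)
print(body)
```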

Question 29

Q

What metrics do you use for ML classification?

Answer

A

AUC ROC, AUC PR, Log loss, Precision at Recall, Recall at Precision
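
To make two of these metrics concrete, the following sketch computes log loss and precision at a given recall for binary labels in plain Python. The function names and the sample data are made up for illustration, and the code assumes at least one positive label; it is not how Vertex AI computes the metrics internally.

```python
import math

def log_loss(labels, probs, eps=1e-15):
    """Mean negative log-likelihood for binary labels (0 or 1)."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(labels)

def precision_at_recall(labels, probs, target_recall):
    """Best precision over all thresholds whose recall >= target_recall."""
    positives = sum(labels)
    best = 0.0
    for threshold in sorted(set(probs)):
        predicted = [p >= threshold for p in probs]
        tp = sum(1 for y, hit in zip(labels, predicted) if y == 1 and hit)
        fp = sum(1 for y, hit in zip(labels, predicted) if y == 0 and hit)
        recall = tp / positives
        if recall >= target_recall and tp + fp > 0:
            best = max(best, tp / (tp + fp))
    return best

# Hypothetical labels and predicted probabilities.
labels = [1, 0, 1, 1, 0]
probs = [0.9, 0.4, 0.8, 0.3, 0.2]
print(log_loss(labels, probs))
print(precision_at_recall(labels, probs, 1.0))
```

Recall at Precision can be computed the same way with the roles of the two quantities swapped.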

Question 30

Q

What metrics do you use for ML regression?

Answer

A

RMSE, RMSLE, MAE
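
For reference, the three regression metrics can be written in a few lines of plain Python. This is a sketch using the standard textbook definitions; the sample values are made up, and RMSLE assumes non-negative inputs.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )

def rmsle(actual, predicted):
    """Root mean squared logarithmic error (inputs must be >= 0)."""
    return math.sqrt(
        sum((math.log1p(a) - math.log1p(p)) ** 2
            for a, p in zip(actual, predicted)) / len(actual)
    )

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical values for illustration.
actual = [3.0, 5.0, 2.0]
predicted = [2.0, 5.0, 4.0]
print(rmse(actual, predicted), rmsle(actual, predicted), mae(actual, predicted))
```

RMSE penalizes large errors more heavily than MAE, while RMSLE measures relative rather than absolute differences.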

Question 31

Q

What metrics do you use for ML time-series forecasting?

Answer

A

RMSE, RMSLE, MAPE, Quantile loss
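
MAPE and quantile loss are the two metrics here that do not appear in the regression card, so a small sketch may help. It uses the standard definitions (quantile loss as the mean pinball loss); Vertex AI's exact scaling may differ, and the sample values are made up.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes no actual value is zero.
    """
    return 100.0 * sum(
        abs((a - p) / a) for a, p in zip(actual, predicted)
    ) / len(actual)

def quantile_loss(actual, predicted, q):
    """Mean pinball loss for a quantile q between 0 and 1."""
    total = 0.0
    for a, p in zip(actual, predicted):
        diff = a - p
        # Under-prediction is weighted by q, over-prediction by (1 - q).
        total += q * diff if diff >= 0 else (q - 1) * diff
    return total / len(actual)

# Hypothetical values for illustration.
actual = [100.0, 200.0]
predicted = [110.0, 180.0]
print(mape(actual, predicted))
print(quantile_loss(actual, predicted, 0.9))
```

With q = 0.5 the quantile loss reduces to half the mean absolute error, which is why it is often described as a generalization of MAE.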

Question 32

Q

Why do prices for node hours differ for the different types of AutoML jobs?

Answer

A

Because the hardware used for different AutoML jobs is different.

Question 33

Q

Vertex AI offers three model training methods for forecasting. What are they?

Answer

A

AutoML
Seq2seq+ (takes in a sequence and produces another sequence)
Temporal Fusion Transformer (a deep neural network model that also uses the attention mechanism)

Question 34

Q

Summarize the available AutoML algorithms for images and video

Answer

A

Image classification (single-label): Predict the one correct label from a list of labels
Image classification (multi-label): Predict all the correct labels
Image object detection: Predict all the locations of objects
Image segmentation: Predict per‐pixel areas of an image with a label.
Video classification: Get label predictions for entire videos, shots, and frames.
Video action recognition: Identify the action moments in video.
Video object tracking: Get labels, tracks, and time stamps for objects

Question 35

Q

What are the considerations for deploying AutoML models to edge devices (iPhones, Android phones, and Edge TPU devices)?

Answer

A

Less memory and low latency

Question 36

Q

What are the four popular models for AutoML for Text?

Answer

A

Text classification: Predict the one correct label
Multi-label classification: Predict all the correct labels
Entity extraction: Identify entities within your text items.
Translation: Convert text from source language to target language.

Question 37

Q

How do you use Retail AI?

Answer

A

Upload the product catalog (product, photos, and other metadata)
Define “user events” (what the customer clicks, views, and buys)
Recommendations AI uses this data to create models for giving recommendations.

Question 38

Q

What does Vision API Product Search do?

Answer

A

It is trained on reference images of products in your catalog, which can then be searched using an image.

Question 39

Q

Summarize the recommendation types available in Retail AI

Answer

A

Others you may like (product page, customer behavior and product relevance, click-through rate)
Frequently bought together (checkout page, shopping cart items, revenue per order)
Recommended for you (home page, user viewing history, click-through rate)
Similar items (product page, product catalog, click-through rate)

Question 40

Q

Document AI has two important concepts: processors and Document AI Warehouse. What are they used for?

Answer

A

A Document AI processor is an interface that performs general processing, specialized processing (procurement, identity, lending, and contract documents), or custom processing (you provide your own labeled set of documents).
Document AI Warehouse is a platform to store, search, organize, govern, and analyze documents along with their structured metadata.

Question 41

Q

What can Document AI do?

Answer

A

Detect document quality
Deskew
Extract text and layout information
Identify and extract key/value pairs
Extract and normalize entities
Split and classify documents
Review documents (human in the loop)
Store, search, and organize documents (Document AI Warehouse)

Question 42

Q

What does Dialogflow do?

Answer

A

Dialogflow is a conversational AI offering from Google Cloud that provides chatbots and voicebots.

Question 43

Q

What does Agent Assist do?

Answer

A

Agent Assist supports human agents by identifying intent, providing ready-to-send responses and answers from a centralized knowledge base, and transcribing calls in real time.

Question 44

Q

What does Insights do?

Answer

A

Uses natural language processing to identify call drivers (the reasons customers call)
Measures sentiment to help leadership understand the call center operations

Question 45

Q

What does Contact Center AI do?

Answer

A

Support multichannel communications between customers and agents.

Question 46

Q

Why is a GPU faster than a CPU?

Answer

A

A GPU loads a block of memory and applies some operation using the thousands of ALUs in parallel, thereby making it faster.

Question 47

Q

What machine series can use GPUs?

Answer

A

A2 and N1 Machine series

Question 48

Q

Which one is the fastest GPU?
NVIDIA_TESLA_T4
NVIDIA_TESLA_K80
NVIDIA_TESLA_P4
NVIDIA_TESLA_P100
NVIDIA_TESLA_V100
NVIDIA_TESLA_A100

Answer

A

NVIDIA_TESLA_A100

Question 49

Q

Where do you specify the type of GPU and its number?

Answer

A

Type of GPU: machineSpec.acceleratorType field in WorkerPoolSpec
Number of GPUs: machineSpec.acceleratorCount field, which applies to each VM in the worker pool.
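
A minimal sketch of where these two fields sit, written as a Python dictionary that mirrors the JSON shape of a worker pool spec. The machine type, GPU choice, and container image URI are hypothetical values chosen for illustration.

```python
import json

# One worker pool for a custom training job (illustrative values only).
worker_pool_spec = {
    "machineSpec": {
        "machineType": "n1-standard-8",
        "acceleratorType": "NVIDIA_TESLA_T4",  # type of GPU
        "acceleratorCount": 2,                 # number of GPUs per VM
    },
    "replicaCount": 1,
    "containerSpec": {
        # Hypothetical training image; replace with your own.
        "imageUri": "gcr.io/my-project/my-trainer:latest",
    },
}

print(json.dumps(worker_pool_spec, indent=2))
```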

Question 50

Q

What are the restrictions for using GPUs?

Answer

A

Not all types of GPUs are available in all regions.
The number of GPUs must be valid for the GPU type. For example, you can use two or four NVIDIA_TESLA_T4 GPUs on a VM but not three.
The machine type must have sufficient virtual CPUs and memory for the GPU configuration attached to it.
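
The count restriction can be checked before submitting a job. The sketch below uses a tiny, illustrative lookup table: the NVIDIA_TESLA_T4 entry reflects the "two or four but not three" rule above (a single GPU is assumed to be valid as well), and the table and function name are hypothetical. Real limits should be taken from the current Google Cloud documentation.

```python
# Illustrative table only: allowed GPU counts per accelerator type.
ALLOWED_GPU_COUNTS = {
    "NVIDIA_TESLA_T4": {1, 2, 4},
}

def is_valid_gpu_count(accelerator_type: str, count: int) -> bool:
    """Return True if `count` GPUs of this type can be attached to one VM."""
    allowed = ALLOWED_GPU_COUNTS.get(accelerator_type)
    if allowed is None:
        raise ValueError(f"No entry for {accelerator_type}")
    return count in allowed

print(is_valid_gpu_count("NVIDIA_TESLA_T4", 2))
print(is_valid_gpu_count("NVIDIA_TESLA_T4", 3))
```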

Question 51

Q

What are the characteristics of a TPU?

Answer

A

Each TPU has multiple matrix multiply units (MXUs). Each MXU has 128 × 128 multiply/accumulators. Each MXU is capable of performing 16,000 multiply‐accumulate operations in each cycle using the bfloat16 number format.
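
The "16,000" figure is a rounded form of 128 × 128, as this short calculation shows (the variable names are just for illustration).

```python
mxu_rows = 128
mxu_cols = 128

# One multiply-accumulator per cell of the 128 x 128 array,
# each performing one multiply-accumulate per cycle.
macs_per_cycle = mxu_rows * mxu_cols
print(macs_per_cycle)  # 16384, roughly 16K
```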

Question 52

Q

Cloud TPU provides the following TPU configurations:

Answer

A

A single TPU device
A TPU Pod (a group of TPU devices connected by high‐speed interconnects)
A TPU slice (a subdivision of a TPU Pod)
A TPU VM

Question 53

Q

When to use CPUs?

Answer

A

Rapid prototyping that needs flexibility
Models that train fast
Small models that work with small batch size
Custom TensorFlow operations written in C++
Models limited by available I/O or the networking bandwidth of the host

Question 54

Q

When to Use GPUs?

Answer

A

Models for which source code does not exist or is too tedious to change
Models with a significant number of custom TensorFlow operations so they need to run at least partially on a CPU
Models with TensorFlow ops that are not available on TPUs
Medium‐to‐large models with medium‐sized batch

Question 55

Q

When to Use TPUs?

Answer

A

Models that have a majority of matrix computations
Models that have no custom TensorFlow operations
Models that train for weeks or months
Large and very large models with very large effective batch sizes

Question 56

Q

Cloud TPUs are not suited to the following workloads:

Answer

A

Programs that require frequent branching (conditionals) or are dominated by element-wise algebra
Sparse data (data that has a lot of zeros)
Workloads that require high-precision arithmetic
Deep neural networks that contain custom TensorFlow operations written in C++, especially if the custom operations are in the main training loop
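
The CPU, GPU, and TPU guidance in the preceding cards can be condensed into a rough decision helper. This is a deliberately simplified sketch: the function name, the three-level `model_size` scale, and the choice to fall back to a GPU for high-precision workloads are assumptions made for illustration, not rules from Google Cloud.

```python
def suggest_accelerator(matrix_dominated: bool,
                        has_custom_tf_ops: bool,
                        model_size: str,
                        needs_high_precision: bool = False) -> str:
    """Suggest 'CPU', 'GPU', or 'TPU' from a few coarse workload traits.

    model_size is 'small', 'medium', or 'large' (a hypothetical scale).
    """
    # Small, fast-training models and prototypes fit CPUs.
    if model_size == "small":
        return "CPU"
    # Custom ops, high precision, or non-matrix workloads rule out TPUs.
    if has_custom_tf_ops or needs_high_precision or not matrix_dominated:
        return "GPU"
    # Large, matrix-heavy models without custom ops suit TPUs.
    if model_size == "large":
        return "TPU"
    return "GPU"

print(suggest_accelerator(True, False, "large"))
```

In practice other factors from the cards, such as I/O limits, batch size, and training duration, would also feed into the decision.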

Question 57

Q

What is the main bottleneck for TPU?

Answer

A

The main bottleneck when using TPUs is the data transfer between the Cloud TPU and host memory.

Question 58

Q

Prediction can happen in two ways. What are they?

Answer

A

Online (real-time) and batch (within a reasonable time)

Question 59

Q

What are the 2 important considerations when provisioning for predictions?

Answer

A

Scaling behavior and machine type

Question 60

Q

What are the three resources you need to consider for configuring a GPU node?

Answer

A

CPU, GPU and memory

Question 61

Q

You have the option of using GPUs to accelerate predictions, but there are some restrictions. What are the restrictions?

Answer

A

GPUs can be used only with a TensorFlow SavedModel or a custom container designed to take advantage of GPUs, not with scikit-learn or XGBoost models.
GPUs are not available in some regions.
You can use only one type of GPU for a DeployedModel resource or BatchPredictionJob.
The number of GPUs you can add is limited by the machine type.

Question 62

Q

How to find the ideal machine type for a custom prediction container?

Answer

A

Deploy that container as a docker container to a Compute Engine instance directly
Benchmark the instance by sending prediction requests until the instance hits 90+ percent CPU utilization.
Determine the cost per queries per second (QPS) per hour for the different machine types.
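
The final comparison step can be sketched as a small calculation: divide each machine type's hourly price by the QPS it sustained in the benchmark and pick the lowest result. The machine names, prices, and QPS figures below are hypothetical placeholders, not real benchmark data.

```python
def cost_per_qps_hour(hourly_price_usd: float, max_qps: float) -> float:
    """Hourly cost of serving one query per second on this machine type."""
    return hourly_price_usd / max_qps

# Hypothetical results: (hourly price in USD, QPS at ~90% CPU utilization).
benchmarks = {
    "machine-a": (0.20, 50.0),
    "machine-b": (0.40, 120.0),
}

best = min(benchmarks, key=lambda name: cost_per_qps_hour(*benchmarks[name]))
print(best)
```

A more expensive machine can still win this comparison if it serves proportionally more queries per second.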

Question 63

Q

What is the Edge TPU for?

Answer

A

The Google-designed Edge TPU coprocessor accelerates ML inference on edge devices. A single Edge TPU can perform 4 trillion operations per second (4 TOPS) on just 2 watts of power. It is sold under the brand name Coral (coral.ai).
Edge TPU is for running ML inferences on edge devices which usually have limited bandwidth and may operate offline.

Question 64

Q

What is ML Kit for?

Answer

A

You can train your ML model on Google Cloud (AutoML or a custom model) and deploy the model into your Android or iOS app. The prediction happens on the device, which gives low response times and enables offline prediction.
