MLE Flashcards
What are the two types of Quantization in TFX?
Post Training Quantization and Quantization Aware Training.
What is Post Training Quantization?
Post-training quantization is a conversion technique that can reduce model size while also improving CPU and hardware accelerator latency, with little degradation in model accuracy. You can quantize an already-trained float TensorFlow model when you convert it to TensorFlow Lite format using the TensorFlow Lite Converter. It reduces model size through integer quantization, reduced float precision (such as float16), or dynamic range quantization.
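A minimal sketch of post-training dynamic range quantization with the TensorFlow Lite Converter; the SavedModel path is a placeholder:

    import tensorflow as tf

    # Convert an already-trained SavedModel to TensorFlow Lite (path is illustrative).
    converter = tf.lite.TFLiteConverter.from_saved_model("path/to/saved_model")

    # Optimize.DEFAULT enables dynamic range quantization of the weights.
    converter.optimizations = [tf.lite.Optimize.DEFAULT]

    tflite_model = converter.convert()
    with open("model.tflite", "wb") as f:
        f.write(tflite_model)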
What is Quantization Aware Training?
Training that simulates reduced bit-precision weights during training, so the model can later be quantized to a smaller size with minimal accuracy loss. It is more accurate than post-training quantization, but it is harder to use and requires full retraining.
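A minimal quantization-aware training sketch using the TensorFlow Model Optimization Toolkit; the tiny model and random data are stand-ins just to make it runnable:

    import numpy as np
    import tensorflow as tf
    import tensorflow_model_optimization as tfmot

    # Stand-in Keras model and data for illustration only.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(16, activation="relu", input_shape=(8,)),
        tf.keras.layers.Dense(3, activation="softmax"),
    ])
    x_train = np.random.rand(64, 8).astype("float32")
    y_train = np.random.randint(0, 3, size=(64,))

    # quantize_model rewrites the model so training simulates low-precision
    # weights and activations (quantization-aware training).
    q_aware_model = tfmot.quantization.keras.quantize_model(model)
    q_aware_model.compile(optimizer="adam",
                          loss="sparse_categorical_crossentropy",
                          metrics=["accuracy"])
    q_aware_model.fit(x_train, y_train, epochs=1)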
What are the four available options for model training on Vertex AI?
1.) AutoML
2.) Custom Training
3.) Model Garden
4.) Generative AI
What is AutoML? When is it recommended?
AutoML is a no-code solution for training models on tabular, image, text, or video data without preparing data splits. It is best for:
- Automatically tuning a model with some input data.
- Teams that have little to no coding experience.
- Teams that want to quickly get a model running.
- Teams that do not want control of hyperparameter tuning aside from early stopping.
- Teams that are solving a problem within the defined problem types offered.
- Models served on an edge device or on Google Cloud.
- Models that can tolerate serving latency greater than 100 ms.
What is BigQueryML? When is it recommended?
BigQuery ML is Google's built-in set of SQL statements for creating and running ML models directly in BigQuery. It is recommended for:
- Those comfortable in SQL.
- Those with data already in BigQuery.
- Those whose problems are covered by BigQuery ML's supported model types.
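A minimal sketch of training a BigQuery ML model from Python; the dataset, table, and column names are hypothetical:

    from google.cloud import bigquery

    client = bigquery.Client()

    # CREATE MODEL trains a logistic regression model directly in BigQuery.
    client.query("""
        CREATE OR REPLACE MODEL `my_dataset.churn_model`
        OPTIONS (model_type = 'logistic_reg', input_label_cols = ['churned']) AS
        SELECT * FROM `my_dataset.customer_training_data`
    """).result()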
What is custom training? When is it recommended?
Custom training is complete freedom to optimize all aspects of an ML pipeline. It is recommended for:
- Problems outside the scope of BQML and AutoML.
- Problems with existing training code written on-premises or on another platform.
What are the three custom training methods on Vertex AI?
1.) Custom Jobs
2.) Hyper-parameter tuning jobs
3.) Training pipelines
What are custom training custom jobs?
The basic way to run custom training code on Vertex AI. It needs a pre-built or custom container to run in.
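A minimal custom job sketch with the Vertex AI SDK for Python; the project, bucket, and container image URI are placeholders:

    from google.cloud import aiplatform

    aiplatform.init(project="my-project", location="us-central1",
                    staging_bucket="gs://my-staging-bucket")

    # Package a local task.py to run in a pre-built training container.
    job = aiplatform.CustomJob.from_local_script(
        display_name="example-custom-job",
        script_path="task.py",
        container_uri="us-docker.pkg.dev/vertex-ai/training/tf-cpu.2-8:latest",
        machine_type="n1-standard-4",
        replica_count=1,
    )
    job.run()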
What are custom training hyperparameter tuning jobs?
This runs multiple trials of custom jobs to tune hyperparameters. It requires: a metric to evaluate performance against, a maximum number of trials to perform, a maximum number of parallel trials, the maximum number of trials that can fail, the machine type and any accelerators (GPUs/TPUs) it uses, and the pre-built or custom container to run in.
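A hedged sketch of a hyperparameter tuning job with the Vertex AI SDK for Python; the metric name, parameter range, and container URI are illustrative, and the training code is expected to report the metric for each trial:

    from google.cloud import aiplatform
    from google.cloud.aiplatform import hyperparameter_tuning as hpt

    aiplatform.init(project="my-project", location="us-central1",
                    staging_bucket="gs://my-staging-bucket")

    # The custom job that each trial will run (same placeholders as above).
    trial_job = aiplatform.CustomJob.from_local_script(
        display_name="trial-job",
        script_path="task.py",
        container_uri="us-docker.pkg.dev/vertex-ai/training/tf-cpu.2-8:latest",
        machine_type="n1-standard-4",
    )

    # Metric, search space, trial counts, parallelism, and failure budget.
    hp_job = aiplatform.HyperparameterTuningJob(
        display_name="example-hp-tuning-job",
        custom_job=trial_job,
        metric_spec={"accuracy": "maximize"},
        parameter_spec={
            "learning_rate": hpt.DoubleParameterSpec(min=1e-4, max=1e-1, scale="log"),
        },
        max_trial_count=20,
        parallel_trial_count=4,
        max_failed_trial_count=2,
    )
    hp_job.run()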
What is a custom training training pipeline?
A training pipeline can run a custom job or hyperparameter tuning job and outputs your model artifacts to a Cloud Storage bucket.
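A minimal training pipeline sketch using aiplatform.CustomTrainingJob, which runs the training script and registers the resulting artifacts as a model; the URIs and names are placeholders:

    from google.cloud import aiplatform

    aiplatform.init(project="my-project", location="us-central1",
                    staging_bucket="gs://my-staging-bucket")

    # The training pipeline runs task.py, then uploads the model artifacts
    # written to Cloud Storage as a Vertex AI Model resource.
    pipeline = aiplatform.CustomTrainingJob(
        display_name="example-training-pipeline",
        script_path="task.py",
        container_uri="us-docker.pkg.dev/vertex-ai/training/tf-cpu.2-8:latest",
        model_serving_container_image_uri=(
            "us-docker.pkg.dev/vertex-ai/prediction/tf2-cpu.2-8:latest"),
    )
    model = pipeline.run(
        model_display_name="example-model",
        machine_type="n1-standard-4",
    )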
What frameworks have pre-built containers for training?
TensorFlow, XGBoost, scikit-learn, PyTorch
What is model garden?
Model Garden in the Google Cloud console is an ML model library that helps you discover, test, customize, and deploy Google proprietary and select OSS models and assets. Many of these are pretrained and allow fine-tuning/transfer learning to customize them to your own use case.
What is fine tuning and transfer learning? When should one be used over the other?
Transfer learning is the process of retraining only the final layers of a pre-trained model. Fine-tuning is an extension of transfer learning that also retrains the weights of earlier layers. Fine-tuning is recommended for larger datasets, while transfer learning suits smaller ones, since fine-tuning on a small dataset is more likely to overfit.
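A Keras sketch contrasting the two approaches, using a pretrained MobileNetV2 base; the head size and dataset are assumptions:

    import tensorflow as tf

    # Transfer learning: freeze the pretrained base and train only a new head.
    base = tf.keras.applications.MobileNetV2(
        input_shape=(224, 224, 3), include_top=False, weights="imagenet")
    base.trainable = False

    model = tf.keras.Sequential([
        base,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(5, activation="softmax"),  # 5 classes is illustrative
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
    # model.fit(train_ds, epochs=5)   # train_ds is an assumed tf.data.Dataset

    # Fine-tuning: also unfreeze the base and retrain with a low learning rate
    # (best with a larger dataset, since this is more likely to overfit).
    base.trainable = True
    model.compile(optimizer=tf.keras.optimizers.Adam(1e-5),
                  loss="sparse_categorical_crossentropy")
    # model.fit(train_ds, epochs=5)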
What is the AutoML workflow?
1.) Prepare your training data.
2.) Create a dataset.
3.) Train a model.
4.) Evaluate and iterate on your model.
5.) Get predictions from your model.
6.) Interpret prediction results.
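A hedged sketch of those steps with the Vertex AI SDK for Python, using AutoML tabular classification; the bucket, CSV, and column names are placeholders:

    from google.cloud import aiplatform

    aiplatform.init(project="my-project", location="us-central1")

    # Steps 1-2: prepare training data and create a managed dataset.
    dataset = aiplatform.TabularDataset.create(
        display_name="example-dataset",
        gcs_source="gs://my-bucket/training_data.csv",
    )

    # Step 3: train an AutoML tabular classification model.
    job = aiplatform.AutoMLTabularTrainingJob(
        display_name="example-automl-job",
        optimization_prediction_type="classification",
    )
    model = job.run(
        dataset=dataset,
        target_column="label",
        budget_milli_node_hours=1000,
    )

    # Steps 4-6: evaluate the model, then deploy it and get predictions.
    endpoint = model.deploy(machine_type="n1-standard-4")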
What are best practices for ML Environment Setup in custom training?
1.) Use Vertex AI Workbench notebooks for experimentation and development.
2.) Create notebook instances for each team member.
3.) Store ML resources and artifacts the same way as datasets, managing access with IAM permissions.
4.) Use Vertex AI SDK for Python.
What are best practices for ML Development in custom training?
1.) Store structured and semi-structured data in BigQuery.
2.) Store image, video, audio and unstructured data on Cloud Storage.
3.) Use Vertex AI Data Labeling for unstructured data.
4.) Use Vertex AI Feature Store with structured data.
5.) Avoid storing data in block storage.
6.) Use Vertex AI TensorBoard and Vertex AI Experiments for analyzing experiments (see the sketch after this list).
7.) Train a model within a notebook instance for small datasets.
8.) Maximize your model’s predictive accuracy with hyperparameter tuning.
9.) Use feature attributions (importances) to gain insights into model predictions.
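A hedged sketch of tracking runs with Vertex AI Experiments via the Vertex AI SDK for Python; the experiment name, parameters, and metric values are placeholders:

    from google.cloud import aiplatform

    # Initialize against an experiment; a Vertex AI TensorBoard instance can
    # be attached to the experiment for richer visualizations.
    aiplatform.init(project="my-project", location="us-central1",
                    experiment="example-experiment")

    aiplatform.start_run("run-1")
    aiplatform.log_params({"learning_rate": 0.01, "batch_size": 64})
    aiplatform.log_metrics({"accuracy": 0.93, "loss": 0.21})
    aiplatform.end_run()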
What are best practices for Data Processing in custom training?
1.) Use BigQuery to process structured and semi-structured data or if data is in BQ already.
2.) Use Dataflow to process data.
3.) Use Dataproc for serverless Spark data processing.
What are best practices for operationalized training in custom training?
1.) Run code in a managed service like Vertex AI training (container-based solutions with a task.py file) or Vertex AI Pipelines.
2.) Operationalize job execution with training pipelines.
3.) Use training checkpoints to save the current state of your experiment (see the sketch after this list).
4.) Prepare model artifacts for serving in Cloud Storage.
5.) Regularly compute new feature values and push them to feature store.
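A minimal Keras checkpointing sketch for the checkpoint best practice above; the bucket path and toy model are placeholders:

    import tensorflow as tf

    # Toy model just to make the sketch self-contained.
    model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])
    model.compile(optimizer="adam", loss="mse")

    # Write checkpoints to Cloud Storage so an interrupted training job can
    # resume from the latest saved state.
    checkpoint_cb = tf.keras.callbacks.ModelCheckpoint(
        filepath="gs://my-bucket/checkpoints/ckpt-{epoch:02d}",
        save_weights_only=True,
    )
    # model.fit(x_train, y_train, epochs=10, callbacks=[checkpoint_cb])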
What is operationalized training?
Operationalized training refers to the process of making model training repeatable, tracking repetitions, and managing performance.
What is Dataproc?
A managed Apache Spark/Hadoop service that allows batch processing, querying, streaming and ML.
What is Dataflow?
Dataflow is a serverless service built on Apache Beam for setting up automated data processing pipelines. It can be used with TFX and Kubeflow Pipelines, both of which have integrated Dataflow runners; since Vertex AI Pipelines supports both, it can be used there as well.
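A minimal Apache Beam pipeline sketch; with the DataflowRunner option it runs as a managed Dataflow job (project, bucket, and paths are placeholders):

    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    options = PipelineOptions(
        runner="DataflowRunner",
        project="my-project",
        region="us-central1",
        temp_location="gs://my-bucket/temp",
    )

    # Read text files, apply a simple transform, and write the results.
    with beam.Pipeline(options=options) as p:
        (p
         | "Read" >> beam.io.ReadFromText("gs://my-bucket/raw/*.csv")
         | "ToUpper" >> beam.Map(str.upper)
         | "Write" >> beam.io.WriteToText("gs://my-bucket/processed/out"))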
What is Vertex AI TensorBoard?
A tool for measuring and visualizing aspects of a TF ML workflow.
What is a Vertex AI Managed Dataset?
Vertex AI offers a central repository for datasets that can be used for both AutoML and custom models on Vertex AI. It accepts image, tabular, text, and video data.