13. Maintaining ML Solutions Flashcards

1
Q

What are the steps in ML?

A

Data:
Extraction (from sources)
Analysis (EDA)
Preparation (transform and feature engineering)
Model:
Training (get the best model)
Evaluation (assess the model quality)
Validation (meet predefined performance metrics)
Deployment (online & batch):
Serving (RESTful endpoint)
Monitoring (detect anomalies, drift, and skew)

Hints:
Data: Elephants Are Playful
Model: Tigers Enjoy Vegetation During Sunny Mornings
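
The data and model steps above can be sketched end to end as a toy pipeline. This is a minimal illustration in plain Python, assuming a tiny in-memory dataset and a one-variable least-squares model; every function name, the sample records, and the 0.9 threshold are hypothetical and not part of the flashcard.

```python
from statistics import mean

def extract():
    # Extraction: pull raw (x, y) records from a source; hard-coded here.
    return [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 8.1), (5.0, 9.8), (None, 4.0)]

def analyze(rows):
    # Analysis (EDA): summarise size and missing inputs before changing the data.
    return {"rows": len(rows), "missing_x": sum(1 for x, _ in rows if x is None)}

def prepare(rows):
    # Preparation: drop records with a missing input.
    return [(x, y) for x, y in rows if x is not None]

def train(rows):
    # Training: fit y = slope * x + intercept by ordinary least squares.
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in rows) / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx

def evaluate(model, rows):
    # Evaluation: R^2, computed on the training rows for brevity
    # (a real pipeline would use held-out data).
    slope, intercept = model
    my = mean(y for _, y in rows)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in rows)
    ss_tot = sum((y - my) ** 2 for _, y in rows)
    return 1 - ss_res / ss_tot

def validate(score, threshold=0.9):
    # Validation: compare the score with a predefined threshold (hypothetical value).
    return score >= threshold

raw = extract()
report = analyze(raw)
clean = prepare(raw)
model = train(clean)
score = evaluate(model, clean)
ready_to_deploy = validate(score)
print(report, model, round(score, 3), ready_to_deploy)
```

Only a model that passes `validate` would move on to the deployment steps (serving and monitoring), which are outside this sketch.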

2
Q

What are the three levels of MLOps?

A

Level 0: Manual Phase
Level 1: Strategic automation phase
Level 2: CI/CD automation, transformational phase

3
Q

What are the key features of Level 0?

A

Manual
ML (data science) and operations are separate teams
No CI, CD, or CT
Only the trained model is deployed as a prediction service, not an entire ML system

3
Q

What are the key features of Level 1?

A

Orchestrated experimentation
CT (continuous training) of the model in production
Experimental-operational symmetry
Modular components
CD (continuous delivery) of models
Pipeline deployment

4
Q

What are the considerations for triggering retraining?

A

Training costs
Training time
Delayed training
Scheduled training

4
Q

What are the key features of Level 2?

A

Pipeline (the ML pipeline itself is built, tested, and deployed automatically)
CI/CD

5
Q

What are the triggers for retraining?

A

Absolute threshold
Rate of degradation
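
The two triggers can be sketched as a single check over a history of evaluation scores. This is an illustrative sketch, not a Vertex AI API: the function name, the 0.80 floor, and the 0.02 maximum drop per evaluation period are assumptions chosen for the example.

```python
def should_retrain(metric_history, absolute_threshold=0.80, max_drop_per_period=0.02):
    """Return the name of the trigger that fired, or None if no retraining is needed.

    metric_history holds one score per evaluation period, oldest first
    (higher is better, e.g. accuracy).
    """
    if not metric_history:
        raise ValueError("metric_history must contain at least one score")

    latest = metric_history[-1]

    # Trigger 1: absolute threshold. The metric fell below a fixed floor.
    if latest < absolute_threshold:
        return "absolute_threshold"

    # Trigger 2: rate of degradation. The metric dropped too quickly
    # between the last two periods, even though it is still above the floor.
    if len(metric_history) >= 2:
        drop = metric_history[-2] - latest
        if drop > max_drop_per_period:
            return "rate_of_degradation"

    return None

print(should_retrain([0.92, 0.91, 0.79]))  # below the floor
print(should_retrain([0.95, 0.90]))        # sudden drop, still above the floor
print(should_retrain([0.92, 0.915]))       # slow change, no trigger
```

Checking the rate as well as the floor catches a sudden drop early, before the model reaches the absolute limit.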

6
Q

What are the problems of not having a centralised feature store?

A

Non-reusable: Features are created ad hoc and are not shared.
Governance: Features created from different sources are not governed.
Cross-collaboration: Teams that do not share features keep working separately.
Training and serving differences: Training data and serving data may differ (training-serving skew).
Productizing features: Features used in experimentation are not automated for production use.

6
Q

What is model versioning for?

A

Deploying an additional version of a model alongside the existing one.

7
Q

What are the two key features of Feature Store?

A

Process large feature sets quickly
Access the features with low latency for real-time and batch predictions.

8
Q

Is Vertex AI Feature Store a managed service, and does it scale dynamically?

A

Yes

9
Q

What data model does Feature Store use to store its data?

A

Time-series
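
A time-series data model means each feature value is stored with a timestamp, so the store can return either the latest value or the value that was current at a past moment. The sketch below illustrates that idea in plain Python; the `FeatureTimeline` class and its methods are hypothetical and are not the Feature Store API.

```python
from bisect import bisect_right

class FeatureTimeline:
    """Timestamped values for one feature of one entity (illustrative only)."""

    def __init__(self):
        self._times = []   # timestamps, kept sorted
        self._values = []  # values, aligned with self._times

    def write(self, timestamp, value):
        # Keep both lists sorted by timestamp, even for late-arriving writes.
        idx = bisect_right(self._times, timestamp)
        self._times.insert(idx, timestamp)
        self._values.insert(idx, value)

    def latest(self):
        # Latest value: what a low-latency, real-time lookup would want.
        return self._values[-1] if self._values else None

    def as_of(self, timestamp):
        # Point-in-time lookup: the most recent value at or before `timestamp`.
        idx = bisect_right(self._times, timestamp)
        return self._values[idx - 1] if idx else None

timeline = FeatureTimeline()
timeline.write(100, 3)
timeline.write(200, 5)
timeline.write(150, 4)  # late-arriving value

print(timeline.latest())    # 5
print(timeline.as_of(175))  # 4
print(timeline.as_of(50))   # None
```

Point-in-time lookups matter for training data: reading the value as of each label's timestamp avoids leaking feature values from the future.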

10
Q

What is the resource hierarchy of a featurestore?

A

Featurestore > EntityType > Feature
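
The three-level hierarchy can be sketched as nested data classes, where each level builds its name from its parent. The classes are illustrative, the IDs (`my-project`, `movies_fs`, `users`, `age`) are made up, and the path layout is an assumption based on the usual Vertex AI resource-name pattern rather than something stated in the flashcard.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Featurestore:
    project: str
    location: str
    featurestore_id: str

    def resource_name(self):
        return (f"projects/{self.project}/locations/{self.location}"
                f"/featurestores/{self.featurestore_id}")

@dataclass(frozen=True)
class EntityType:
    featurestore: Featurestore  # parent level
    entity_type_id: str

    def resource_name(self):
        return f"{self.featurestore.resource_name()}/entityTypes/{self.entity_type_id}"

@dataclass(frozen=True)
class Feature:
    entity_type: EntityType  # parent level
    feature_id: str
    value_type: str

    def resource_name(self):
        return f"{self.entity_type.resource_name()}/features/{self.feature_id}"

# Hypothetical example: a "users" entity type with an "age" feature.
store = Featurestore("my-project", "us-central1", "movies_fs")
users = EntityType(store, "users")
age = Feature(users, "age", "INT64")
print(age.resource_name())
```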

11
Q

What are the two types of ingestion supported by Feature Store?

A

Batch and streaming ingestion, e.g., batch ingestion from BigQuery into Feature Store.

12
Q

What are the two types of retrieval supported by Feature Store?

A

Batch and online.

13
Q

What are the best practices for IAM security?

A

Least privilege
Actively manage service accounts and service account keys
Enable auditing
Check policy management
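
As one way to check least privilege, the sketch below scans an IAM policy for service accounts that hold broad basic roles. It assumes the policy is a dictionary in the `bindings` / `role` / `members` shape that `gcloud projects get-iam-policy --format=json` prints; the function name, the choice of "broad" roles, and the sample accounts are hypothetical.

```python
# Basic roles that usually grant far more than a workload needs (an assumption
# for this example; adjust the set to your own policy).
BROAD_ROLES = {"roles/owner", "roles/editor"}

def find_broad_service_account_bindings(policy):
    """Return (member, role) pairs where a service account holds a broad role."""
    findings = []
    for binding in policy.get("bindings", []):
        role = binding.get("role")
        if role not in BROAD_ROLES:
            continue
        for member in binding.get("members", []):
            # Only service accounts are flagged here; human users are out of scope.
            if member.startswith("serviceAccount:"):
                findings.append((member, role))
    return findings

# Made-up policy: a default service account with Editor, and a custom
# service account limited to a Vertex AI role.
sample_policy = {
    "bindings": [
        {
            "role": "roles/editor",
            "members": ["serviceAccount:123456789-compute@developer.gserviceaccount.com"],
        },
        {
            "role": "roles/aiplatform.user",
            "members": ["serviceAccount:trainer@my-project.iam.gserviceaccount.com"],
        },
    ]
}

findings = find_broad_service_account_bindings(sample_policy)
print(findings)
```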

13
Q

What service do you use to manage permissions to perform various operations?

A

Identity and Access Management (IAM)

14
Q

What are the specific uses of IAM in Vertex AI?

A

Google automatically creates several service accounts for each Google Cloud project. These may have more permissions than required, so use custom service accounts that have only the permissions needed.

15
Q

What is Access Transparency in Vertex AI?

A

You need logs that track which content is accessed and by whom. These logs may be required for legal and compliance reasons.
There are two types of access logs: Cloud Audit Logs record actions by users from your organisation, and Access Transparency logs record actions by Google personnel.

16
Q

What are the common training errors?

A

Input data not transformed or encoded
Tensor shape mismatch
Out-of-memory errors because the instance is too small

17
Q

What are the common serving errors?

A

Input data not transformed or encoded
Signature mismatch
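
A signature mismatch can often be caught before the request reaches the model by checking each instance against the inputs the model expects. The sketch below is a simplified illustration: the signature format (input name mapped to element type and length), the input names, and `check_instance` are all hypothetical and do not correspond to a specific serving framework.

```python
# Hypothetical signature: input name -> (element type, expected length).
EXPECTED_SIGNATURE = {
    "age": (float, 1),
    "embedding": (float, 3),
}

def check_instance(instance, signature):
    """Return a list of human-readable mismatches; an empty list means the instance fits."""
    errors = []
    for name, (dtype, length) in signature.items():
        if name not in instance:
            errors.append(f"missing input '{name}'")
            continue
        values = instance[name]
        if len(values) != length:
            errors.append(f"input '{name}' has length {len(values)}, expected {length}")
        elif not all(isinstance(v, dtype) for v in values):
            errors.append(f"input '{name}' has values that are not {dtype.__name__}")
    for name in instance:
        if name not in signature:
            errors.append(f"unexpected input '{name}'")
    return errors

good = {"age": [42.0], "embedding": [0.1, 0.2, 0.3]}
bad = {"age": ["42"], "embedding": [0.1, 0.2]}  # unencoded string, wrong length

print(check_instance(good, EXPECTED_SIGNATURE))
print(check_instance(bad, EXPECTED_SIGNATURE))
```

The `bad` instance shows both serving errors from the card: input data that was not encoded (a string where a float is expected) and a shape that does not match the signature.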

18
Q

What are the ways to prevent and reduce training and serving errors?

A

Compute statistics
Infer schema
Detect anomalies
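
The three steps above can be sketched in plain Python: compute statistics on the training data, infer a schema from them, then compare serving data against that schema. Libraries such as TensorFlow Data Validation provide this workflow; the sketch below does not use any library, and its function names, schema fields, and sample rows are hypothetical.

```python
def compute_statistics(rows):
    """Per-feature counts, missing values, observed types, and numeric range."""
    stats = {}
    for row in rows:
        for name, value in row.items():
            s = stats.setdefault(
                name, {"count": 0, "missing": 0, "types": set(), "min": None, "max": None}
            )
            s["count"] += 1
            if value is None:
                s["missing"] += 1
                continue
            s["types"].add(type(value).__name__)
            if isinstance(value, (int, float)):
                s["min"] = value if s["min"] is None else min(s["min"], value)
                s["max"] = value if s["max"] is None else max(s["max"], value)
    return stats

def infer_schema(stats):
    """Turn training statistics into expectations for future data."""
    return {
        name: {"types": set(s["types"]), "min": s["min"], "max": s["max"],
               "required": s["missing"] == 0}
        for name, s in stats.items()
    }

def detect_anomalies(schema, stats):
    """Compare new statistics with the schema and list what does not fit."""
    anomalies = []
    for name, s in stats.items():
        if name not in schema:
            anomalies.append(f"{name}: unexpected feature")
            continue
        rule = schema[name]
        if not s["types"] <= rule["types"]:
            anomalies.append(f"{name}: unexpected type")
        if rule["required"] and s["missing"]:
            anomalies.append(f"{name}: missing values")
        if rule["min"] is not None and s["min"] is not None:
            if s["min"] < rule["min"] or s["max"] > rule["max"]:
                anomalies.append(f"{name}: value outside training range")
    for name in schema:
        if name not in stats:
            anomalies.append(f"{name}: feature absent")
    return anomalies

training_rows = [{"age": 34, "country": "DE"}, {"age": 51, "country": "FR"}]
serving_rows = [{"age": "29", "country": "DE"}, {"age": 40, "country": None}]

schema = infer_schema(compute_statistics(training_rows))
anomalies = detect_anomalies(schema, compute_statistics(serving_rows))
print(anomalies)
```

Here the serving data contains an unencoded string for `age` and a missing `country`, the kind of input problem listed under common training and serving errors.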

19
Q

What does Vertex AI provide to debug training for both pre-built and custom containers?

A

Interactive shell

20
Q

What can you inspect with interactive shell during training?

A

Run tracing and profiling tools
Analyze GPU utilization
Validate IAM permissions for the container