Vertex AI Feature Store Flashcards

1
Q

Describe three key challenges of ML feature management.

A
  • Features are hard to share and reuse.
  • Serving features reliably in production with low latency is difficult.
  • Inadvertent skew in feature values between training and serving is common.
2
Q

What is the Vertex AI Feature Store service?

A

Vertex AI Feature Store is a fully managed solution that provides a centralized repository for
- organizing,
- storing, and
- serving
machine learning features.

3
Q

What is the main benefit of using Vertex AI Feature Store?

A

With a central feature store, you can efficiently
- share,
- discover, and
- reuse

ML features at scale, which lets the team develop and deploy new ML applications faster.

4
Q

How does Vertex AI Feature Store enable sharing and reuse of ML features across use cases?

A

Feature Store has a centralized feature repository with easy-to-use APIs to search for and discover features, fetch them for training and serving, and manage permissions.

5
Q

How does Vertex AI Feature Store alleviate training-serving skew?

A

It lets you compute feature values once and reuse them for both training and serving.
You can also track and monitor for drift and other quality issues.

6
Q

How does Vertex AI Feature Store serve ML features at scale with low latency?

A

Feature Store handles the operational overhead of serving.
The team stores feature values through the batch and streaming import APIs and registers each feature in the feature registry.

7
Q

Which Vertex AI Feature Store API lets you easily find a feature?

A

Discovery API

8
Q

What is a feature store resource in Vertex AI Feature Store?

A

A feature store is a top-level container for your features and their values.

9
Q

What is a Feature Store entity type?

A
  • An entity type is a collection of semantically related features.
  • You define your own entity types based on the concepts that are relevant to your use case.
  • An entity is an instance of an entity type.
  • Each entity must have a unique ID, and the ID must be of type STRING.
10
Q

How many entity types can you get for a serving request?

A

For online serving requests, you can get all or a subset of features for a particular entity type.
For batch serving requests, you can get all or a subset of features for one or more entity types.

11
Q

How does Feature Store identify feature values for lookup at serving time?

A

Feature Store associates a tuple identifier (entity_id, feature_id, timestamp) with each feature value, which it then uses to look up values at serving time.
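The tuple-keyed lookup can be pictured with a toy in-memory dictionary. This is only a sketch of the idea, not how Feature Store is implemented; the names `values` and `latest_value` and the sample data are made up for illustration.

```python
from typing import Dict, Optional, Tuple

# Hypothetical in-memory store keyed by (entity_id, feature_id, timestamp).
FeatureKey = Tuple[str, str, int]

values: Dict[FeatureKey, float] = {
    ("user_1", "avg_rating", 100): 4.1,
    ("user_1", "avg_rating", 200): 4.5,
    ("user_2", "avg_rating", 150): 3.0,
}


def latest_value(entity_id: str, feature_id: str) -> Optional[float]:
    """Return the value with the newest timestamp for one entity and feature."""
    matches = [
        (ts, value)
        for (e_id, f_id, ts), value in values.items()
        if e_id == entity_id and f_id == feature_id
    ]
    if not matches:
        return None
    # Pick the entry whose timestamp is largest.
    return max(matches, key=lambda match: match[0])[1]
```

Calling `latest_value("user_1", "avg_rating")` picks the entry stamped 200 over the one stamped 100.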

12
Q

What does the feature timestamp in Feature Store indicate?

A

The timestamp column indicates when the feature values were generated. In Feature Store, timestamps are an attribute of the feature values, not a separate resource type.

13
Q

How long are feature values kept in Feature Store?

A

Feature Store keeps feature values up to the data retention limit. This limit is based on the timestamp associated with the feature values, not on when the values were imported.
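A small sketch of what "based on the feature timestamp" means in practice. The 30-day limit, the function name `is_retained`, and the inclusive boundary are assumptions chosen for illustration, not actual service values.

```python
from datetime import datetime, timedelta, timezone


def is_retained(feature_timestamp: datetime, now: datetime, retention: timedelta) -> bool:
    """True if the value's own timestamp falls within the retention window."""
    return now - feature_timestamp <= retention


now = datetime(2024, 1, 31, tzinfo=timezone.utc)
retention = timedelta(days=30)  # hypothetical limit, for illustration only

# Both values could be imported today; only their feature timestamps matter.
old_value_ts = now - timedelta(days=45)
recent_value_ts = now - timedelta(days=5)
```

Here the value generated 45 days ago falls outside the window even if it was imported a minute ago, while the 5-day-old value is kept.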

14
Q

What is feature ingestion?

A

Feature ingestion is the process of importing feature values computed by your feature engineering jobs into a feature store.

15
Q

What is feature serving?

A

Feature serving is the process of exporting stored feature values for training or inference.

16
Q

List and describe methods of feature serving in Feature Store.

A

Feature Store offers two methods for serving features: batch and online.
- Batch serving is for high throughput and serves large volumes of data for offline processing, such as model training or batch predictions.
- Online serving is for low-latency retrieval of small batches of data for real-time processing, such as online predictions.

17
Q

What needs to be done before creating a feature store?

A

Before creating a feature store, you’ll need to preprocess your data. Ensure that your features are clean and tidy, which means that there are no missing values, data types are correct, and any one-hot encoding of categorical values has already been done.
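One way to picture this preprocessing for a single record, using only the standard library. The column names (`movie_id`, `avg_rating`, `genre`) and both helper functions are hypothetical; a real pipeline would more likely use a dataframe library.

```python
def one_hot(value, categories):
    """Encode one categorical value as 0/1 indicator columns."""
    return {f"is_{category}": int(value == category) for category in categories}


def prepare_row(row, categories):
    """Reject rows with missing values, fix types, and one-hot encode 'genre'."""
    if any(v is None or v == "" for v in row.values()):
        raise ValueError(f"missing value in row: {row}")
    prepared = {k: v for k, v in row.items() if k != "genre"}
    prepared["avg_rating"] = float(prepared["avg_rating"])  # enforce the data type
    prepared.update(one_hot(row["genre"], categories))
    return prepared
```

For example, `prepare_row({"movie_id": "m1", "avg_rating": "4.5", "genre": "drama"}, ["drama", "comedy"])` yields a row with a float rating and `is_drama` / `is_comedy` indicator columns.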

18
Q

What are the requirements for source data to be ingested?

A

Vertex AI Feature Store can ingest data from tables in BigQuery or files in Cloud Storage.
- Files in Cloud Storage must be in Avro or CSV format.
- You must have a column for entity IDs, and its values must be of type STRING. (This column contains the entity IDs that the feature values are for.)
- The value types of your source data must match the value types of the destination features in the feature store.
- All columns must have a header that is of type STRING. There are no restrictions on the names of the headers.
– For BigQuery tables, the column header is the column name.
– For Avro, the column header is defined by the Avro schema that is associated with the binary data.
– For CSV files, the column header is the first row.
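A rough pre-ingestion check for a CSV source, covering only the header and entity ID rules listed above. The function name `check_csv_source` and its error messages are invented for this sketch; it is not part of any Vertex AI library.

```python
import csv
import io


def check_csv_source(text, entity_id_column):
    """Return a list of problems found in CSV source data (empty if none)."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ["file is empty"]
    header = rows[0]  # for CSV files, the first row is the column header
    problems = []
    if any(name.strip() == "" for name in header):
        problems.append("every column needs a header")
    if entity_id_column not in header:
        problems.append(f"missing entity ID column {entity_id_column!r}")
        return problems
    idx = header.index(entity_id_column)
    for line_no, row in enumerate(rows[1:], start=2):
        if idx >= len(row) or row[idx] == "":
            problems.append(f"line {line_no}: empty entity ID")
    return problems
```

An empty result means the file passed these particular checks; value types still need to match the destination features.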

19
Q

What are the requirements for timestamp columns?

A

Use one of the following timestamp formats.
- For BigQuery tables, timestamps must be in a TIMESTAMP column.
- For Avro, timestamps must be of type long with logical type timestamp-micros.
- For CSV files, timestamps must be in the RFC 3339 format.
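To make the CSV and Avro formats concrete, here is one moment in time expressed both ways. The variable names are illustrative, and the sketch assumes the timestamp is in UTC.

```python
from datetime import datetime, timedelta, timezone

generated_at = datetime(2024, 1, 15, 12, 30, 0, tzinfo=timezone.utc)

# CSV: an RFC 3339 string, here with the "Z" suffix for UTC.
csv_timestamp = generated_at.isoformat().replace("+00:00", "Z")

# Avro: a long with logical type timestamp-micros,
# i.e. microseconds since the Unix epoch.
epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
avro_timestamp_micros = (generated_at - epoch) // timedelta(microseconds=1)
```

`csv_timestamp` comes out as `2024-01-15T12:30:00Z`, and `avro_timestamp_micros` is a plain integer suitable for an Avro long field.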

20
Q

CSV files cannot include array data types. Use Avro or BigQuery instead.

A

True

21
Q

For array types, you cannot include a null value in the array although you can include an empty array.

A

True
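The array rule can be written as a one-line check. The function name is hypothetical, and Python's `None` stands in for a null element.

```python
def is_valid_array_value(value):
    """An array feature value may be empty but may not contain None."""
    return isinstance(value, list) and all(item is not None for item in value)
```

So `[]` and `[1, 2]` pass, while `[1, None]` does not.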

22
Q

What is the minimum number of rows required for a dataset to be uploaded into Vertex AI?

A

1,000 rows

23
Q

Where is data saved when a user ingests feature values via the batch ingestion API?

A

When a user ingests feature values via the batch ingestion API, the data is reliably written both to an offline store and to an online store.
- The offline store will retain feature values for a long time so that they can later be retrieved for training.
- The online store will contain the latest feature values for online predictions.
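A toy model of the dual write, assuming the online store keeps only the newest value per entity and feature. The class `TinyFeatureStore` is invented for illustration and says nothing about the real storage back ends.

```python
class TinyFeatureStore:
    """Toy model: the offline store keeps history, the online store keeps the latest value."""

    def __init__(self):
        self.offline = []  # every (entity_id, feature_id, timestamp, value) row
        self.online = {}   # (entity_id, feature_id) -> (timestamp, value)

    def ingest(self, entity_id, feature_id, timestamp, value):
        # Always append to the offline history.
        self.offline.append((entity_id, feature_id, timestamp, value))
        # Only overwrite the online value if this one is at least as new.
        key = (entity_id, feature_id)
        current = self.online.get(key)
        if current is None or timestamp >= current[0]:
            self.online[key] = (timestamp, value)


store = TinyFeatureStore()
store.ingest("user_1", "avg_rating", 100, 4.1)
store.ingest("user_1", "avg_rating", 200, 4.5)
store.ingest("user_1", "avg_rating", 150, 4.3)  # late-arriving older value
```

After these three calls the offline list holds all three rows, while the online entry still points at the value stamped 200.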

24
Q

What is the online serving API used for?

A

The online serving API is used by client applications to fetch feature values to perform online predictions.

25
Q

What is the batch serving API used for?

A

The batch serving API is used to fetch data from the offline store for training a model or for performing batch predictions.

26
Q

To fetch the appropriate feature values for training, the batch serving API performs point-in-time lookups.

A

True
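A point-in-time lookup returns the value that was current at a given read time, so training rows never see data from the future. The sketch below shows the idea for one entity and one feature; `history` and `value_as_of` are illustrative names, not the batch serving API.

```python
import bisect

# Hypothetical history for one entity and one feature:
# (timestamp, value) pairs sorted by timestamp.
history = [(100, 4.1), (200, 4.5), (300, 4.8)]


def value_as_of(history, read_time):
    """Return the latest value with timestamp <= read_time, or None if there is none."""
    timestamps = [ts for ts, _ in history]
    idx = bisect.bisect_right(timestamps, read_time)
    if idx == 0:
        return None
    return history[idx - 1][1]
```

A training example labeled at time 250 would get the value written at 200, not the later one written at 300.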

27
Q

Batch serving jobs must be created in the Feature Store API.

A

True