14. BigQuery ML Flashcards

1
Q

Who is the target audience of BigQuery ML?

A

Data analysts and other users who are already comfortable with SQL and want to train and use models without leaving BigQuery.

2
Q

What are the three ways to access data in BigQuery?

A

Write a SQL query in the web console
Use the %%bigquery magic command in a Jupyter Notebook
Use the Python API to run the same query from a Jupyter Notebook
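
The third route can be sketched in Python as follows. This is a minimal sketch, assuming the google-cloud-bigquery package is installed and credentials are configured; the helper name query_to_rows and its optional client argument are illustrative and not part of the library.

```python
def query_to_rows(sql, client=None):
    """Run a SQL query in BigQuery and return the rows as plain dicts."""
    if client is None:
        # Imported lazily so the helper can be defined without the package.
        from google.cloud import bigquery
        client = bigquery.Client()  # uses the default project and credentials
    job = client.query(sql)  # starts the query job
    # result() waits for the job to finish and yields Row objects.
    return [dict(row.items()) for row in job.result()]
```

The same SQL text could equally be pasted into the web console or placed in a notebook cell under %%bigquery; only the way the query is submitted changes.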

3
Q

Is BigQuery ML serverless?

A

Yes. BigQuery ML is a completely serverless method to train and predict.

4
Q

What are the keywords for creating models in BigQuery ML?

A

CREATE MODEL, CREATE MODEL IF NOT EXISTS, CREATE OR REPLACE MODEL
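
How the three forms differ can be sketched with a small Python helper that only assembles the statement text. The helper name and the dataset, table, and column names are hypothetical, and submitting the statement to BigQuery is not shown.

```python
def create_model_sql(model, model_type, label, source, mode="create"):
    """Assemble a BigQuery ML CREATE MODEL statement as text.

    mode: "create"        -> CREATE MODEL (errors if the model already exists)
          "if_not_exists" -> CREATE MODEL IF NOT EXISTS (does nothing if it exists)
          "replace"       -> CREATE OR REPLACE MODEL (overwrites an existing model)
    """
    heads = {
        "create": "CREATE MODEL",
        "if_not_exists": "CREATE MODEL IF NOT EXISTS",
        "replace": "CREATE OR REPLACE MODEL",
    }
    return (
        f"{heads[mode]} `{model}`\n"
        f"OPTIONS (model_type = '{model_type}', input_label_cols = ['{label}'])\n"
        f"AS SELECT * FROM `{source}`"
    )


# Hypothetical names, for illustration only.
sql = create_model_sql(
    "mydataset.model1", "LINEAR_REG", "fare", "mydataset.trips", mode="replace"
)
print(sql)
```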

5
Q

Which two options are commonly set in the OPTIONS clause that follows CREATE MODEL?

A

model_type, input_label_cols

6
Q

What model categories does BigQuery ML support?

A

Regression: LINEAR_REG, BOOSTED_TREE_REGRESSOR, DNN_REGRESSOR, AUTOML_REGRESSOR
Classification: LOGISTIC_REG, BOOSTED_TREE_CLASSIFIER, DNN_CLASSIFIER, DNN_LINEAR_COMBINED_CLASSIFIER, AUTOML_CLASSIFIER
Wide-and-deep neural network (recommendation systems and personalization): DNN_LINEAR_COMBINED_REGRESSOR, DNN_LINEAR_COMBINED_CLASSIFIER
Clustering: KMEANS
Collaborative filtering: MATRIX_FACTORIZATION
Dimensionality reduction: PCA, AUTOENCODER
Time-series forecasting: ARIMA_PLUS
General: TENSORFLOW (imported TensorFlow models)

Hints: Curious Cat Discovers Really Cool Treasures Near Trees

7
Q

What is the function for model evaluation in BigQuery ML?

A

ML.EVALUATE

8
Q

What are the two levels of BigQuery ML explainability?

A

Model level and individual prediction level

9
Q

What is the function for prediction in BigQuery ML?

A

ML.PREDICT

10
Q

What is the function for querying global (model-level) explanations?

A

ML.GLOBAL_EXPLAIN(MODEL `model1`)

11
Q

What is the option for enabling global explanations during model training?

A

enable_global_explain=TRUE

12
Q

What are the explainability methods for different model types?

A

Linear and logistic regression: Shapley values, standard errors, and p-values
Boosted trees: Tree SHAP, Gini-based feature importance
Deep neural network and wide-and-deep: Integrated gradients
ARIMA_PLUS: Time-series decomposition

Hints: Railroad Tales Nurturing Towns

13
Q

Compare BigQuery ML and Vertex AI Tables

A

BigQuery ML:
Runs in BigQuery, a serverless data warehouse.
Users are SQL experts.
Use BigQuery scheduled queries for automation.
Use Looker for visualization.

Vertex AI:
Users are data scientists.
Use Jupyter Notebooks and Pandas DataFrames.
Suits users who need fine-grained control over the workflow.

14
Q

What are the integration points between Vertex AI and BigQuery?

A

Access BigQuery public datasets from Vertex AI.
Import BigQuery data into Vertex AI.
Access BigQuery data from Vertex AI Workbench notebooks (browse your BigQuery datasets directly).
Export batch prediction data to BigQuery for further analysis.
Export BigQuery models into Vertex AI (through GCS or the Model Registry).

Hints: Mysterious Island Whispers: Pirates Never Mined Patiently

15
Q

What is a hashed feature?

A

It addresses three problems:
Incomplete vocabulary (values that do not fall into any existing category)
High cardinality (e.g., zip code)
Cold start: new categories appear after training (e.g., a new staff ID).
Hashing maps the high-cardinality variable into a low-cardinality domain, e.g., with FarmHash: ABS(MOD(FARM_FINGERPRINT(zipcode), numbuckets))
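
The bucketing idea can be sketched in Python. Note the substitution: FARM_FINGERPRINT is not available in the Python standard library, so this sketch assumes SHA-256 as a stand-in hash, and its bucket numbers will not match BigQuery's. The function name hash_bucket is hypothetical.

```python
import hashlib


def hash_bucket(value: str, num_buckets: int) -> int:
    """Map a categorical value to a bucket index in [0, num_buckets)."""
    # Stand-in for FARM_FINGERPRINT: any stable hash illustrates the idea.
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    # Interpret 8 bytes as a signed 64-bit integer, like BigQuery's INT64.
    fingerprint = int.from_bytes(digest[:8], "big", signed=True)
    # Python's % is already non-negative for a positive modulus; abs() is
    # kept only to mirror the ABS(MOD(...)) shape of the SQL expression.
    return abs(fingerprint % num_buckets)


# A value never seen in training still lands in a valid bucket.
print(hash_bucket("94043", 10))
print(hash_bucket("some-new-zip", 10))
```

Because the hash is deterministic, the same value always maps to the same bucket, while distinct values may collide; choosing the number of buckets trades collisions against cardinality.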

16
Q

Give a few examples of data transformation in BigQuery.

A

POLYNOMIAL_EXPAND, FEATURE_CROSS, NGRAMS, QUANTILE_BUCKETIZE, HASH_BUCKETIZE, MIN_MAX_SCALER, and STANDARD_SCALER
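
To make two of these concrete, here is a minimal Python sketch of what min-max scaling and standard scaling compute. It illustrates the arithmetic only and is not BigQuery's implementation; the function names are hypothetical, and the use of the population standard deviation is an assumption.

```python
import statistics


def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    # Assumes at least two distinct values; otherwise hi - lo is zero.
    return [(v - lo) / (hi - lo) for v in values]


def standard_scale(values):
    """Shift and scale values to zero mean and unit standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population std dev (an assumption)
    return [(v - mean) / std for v in values]


print(min_max_scale([10.0, 20.0, 30.0]))  # [0.0, 0.5, 1.0]
print(standard_scale([10.0, 20.0, 30.0]))
```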