Quizlet Flashcards

https://quizlet.com/618014981/gcp-flash-cards/

1
Q

You are building an ML model to detect anomalies in real-time sensor data. You will use Pub/Sub to handle incoming requests. You want to store the results for analytics and visualization. How should you configure the pipeline?
A. 1 = Dataflow, 2 = AI Platform, 3 = BigQuery
B. 1 = DataProc, 2 = AutoML, 3 = Cloud Bigtable
C. 1 = BigQuery, 2 = AutoML, 3 = Cloud Functions
D. 1 = BigQuery, 2 = AI Platform, 3 = Cloud Storage

A

A

Dataflow integrates natively with Pub/Sub for streaming data processing (and is preferred over Dataproc here).
AI Platform allows custom ML model building for the anomaly detection step.
BigQuery is the best option for storing the results for analytics and visualization.
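
For illustration only, a minimal Apache Beam (Dataflow) sketch of this flow, reading sensor messages from Pub/Sub and writing results to BigQuery; the project, topic, and table names are hypothetical, and the call to the deployed AI Platform model is left as a comment:

import json
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(streaming=True)
with beam.Pipeline(options=options) as p:
    (p
     | "ReadSensorData" >> beam.io.ReadFromPubSub(topic="projects/my-project/topics/sensor-data")
     | "Parse" >> beam.Map(lambda msg: json.loads(msg.decode("utf-8")))
     # A transform calling the deployed AI Platform model for anomaly scores would go here.
     | "WriteResults" >> beam.io.WriteToBigQuery(
         "my-project:analytics.anomaly_results",                       # hypothetical table
         create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,  # table assumed to exist
         write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND))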

2
Q

Your organization wants to make its internal shuttle service route more efficient. The shuttles currently stop at all pick-up points across the city every 30 minutes between 7 am and 10 am. The development team has already built an application on Google Kubernetes Engine that requires users to confirm their presence and shuttle station one day in advance. What approach should you take?
A. 1. Build a tree-based regression model that predicts how many passengers will be picked up at each shuttle station. 2. Dispatch an appropriately sized shuttle and provide the map with the required stops based on the prediction.
B. 1. Build a tree-based classification model that predicts whether the shuttle should pick up passengers at each shuttle station. 2. Dispatch an available shuttle and provide the map with the required stops based on the prediction.
C. 1. Define the optimal route as the shortest route that passes by all shuttle stations with confirmed attendance at the given time under capacity constraints. 2. Dispatch an appropriately sized shuttle and indicate the required stops on the map.
D. 1. Build a reinforcement learning model with tree-based classification models that predict the presence of passengers at shuttle stops as agents and a reward function around a distance-based metric. 2. Dispatch an appropriately sized shuttle and provide the map with the required stops based on the simulated outcome.

A

C

Since everyone confirms their presence one day in advance, there is no need for an ML prediction; the shuttle only needs to cover the stops with confirmed passengers.

3
Q

You were asked to investigate failures of a production line component based on sensor readings. After receiving the dataset, you discover that less than 1% of the readings are positive examples representing failure incidents. You have tried to train several classification models, but none of them converge. How should you resolve the class imbalance problem?
A. Use the class distribution to generate 10% positive examples.
B. Use a convolutional neural network with max pooling and softmax activation.
C. Downsample the data with upweighting to create a sample with 10% positive examples.
D. Remove negative examples until the numbers of positive and negative examples are equal.

A

C

Downsampling the majority class with upweighting is a recommended technique for handling class imbalance.
Choice A is wrong because generating examples from the existing class distribution reproduces the same imbalanced distribution.
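
A minimal sketch of downsampling with upweighting in pandas, using a synthetic stand-in for the sensor data (column names and sizes are illustrative assumptions):

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame({"reading": rng.normal(size=10_000),
                   "label": (rng.random(10_000) < 0.01).astype(int)})  # ~1% positives

pos = df[df.label == 1]
neg_all = df[df.label == 0]
neg = neg_all.sample(n=len(pos) * 9, random_state=42)   # keep ~9 negatives per positive -> ~10% positives
downsample_factor = len(neg_all) / len(neg)

sample = pd.concat([pos, neg])
sample["weight"] = 1.0
sample.loc[sample.label == 0, "weight"] = downsample_factor  # upweight the downsampled negatives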

4
Q

You want to rebuild your ML pipeline for structured data on Google Cloud. You are using PySpark to conduct data transformations at scale, but your pipelines are taking over 12 hours to run. To speed up development and pipeline run time, you want to use a serverless tool and SQL syntax. You have already moved your raw data into Cloud Storage. How should you build the pipeline on Google Cloud while meeting the speed and processing requirements?
A. Use Data Fusion's GUI to build the transformation pipelines, and then write the data into BigQuery.
B. Convert your PySpark into SparkSQL queries to transform the data, and then run your pipeline on Dataproc to write the data into BigQuery.
C. Ingest your data into Cloud SQL, convert your PySpark commands into SQL queries to transform the data, and then use federated queries from BigQuery for machine learning.
D. Ingest your data into BigQuery using BigQuery Load, convert your PySpark commands into BigQuery SQL queries to transform the data, and then write the transformations to a new table.

A

D

Choice A is wrong because Data Fusion provides a GUI, not SQL syntax.
The PySpark pipeline already fails to perform at scale, so a serverless SQL engine (BigQuery) is needed.

5
Q

You manage a team of data scientists who use a cloud-based backend system to submit training jobs. This system has become very difficult to administer, and you want to use a managed service instead. The data scientists you work with use many different frameworks, including Keras, PyTorch, theano, Scikit-learn, and custom libraries. What should you do?
A. Use the AI Platform custom containers feature to receive training jobs using any framework.
B. Configure Kubeflow to run on Google Kubernetes Engine and receive training jobs through TF Job.
C. Create a library of VM images on Compute Engine, and publish these images on a centralized repository.
D. Set up Slurm workload manager to receive jobs that can be scheduled to run on your cloud infrastructure.

A

A

Because AI Platform custom containers support training jobs written in any of the frameworks mentioned.

Choice B is wrong because Kubeflow on GKE is not a managed service in GCP.
Choice D is wrong because Slurm is not a managed service.

6
Q

You work for an online retail company that is creating a visual search engine. You have set up an end-to-end ML pipeline on Google Cloud to classify whether an image contains your company's product. Expecting the release of new products in the near future, you configured a retraining functionality in the pipeline so that new data can be fed into your ML models. You also want to use AI Platform's continuous evaluation service to ensure that the models have high accuracy on your test dataset. What should you do?
A. Keep the original test dataset unchanged even if newer products are incorporated into retraining.
B. Extend your test dataset with images of the newer products when they are introduced to retraining.
C. Replace your test dataset with images of the newer products when they are introduced to retraining.
D. Update your test dataset with images of the newer products when your evaluation metrics drop below a pre-decided threshold.

A

B

The test dataset should be extended so that it contains images of both the old and the new products.

7
Q

You need to build classification workflows over several structured datasets currently stored in BigQuery. Because you will be performing the classification several times, you want to complete the following steps without writing code: exploratory data analysis, feature selection, model building, training, and hyperparameter tuning and serving. What should you do?
A. Configure AutoML Tables to perform the classification task.
B. Run a BigQuery ML task to perform logistic regression for the classification.
C. Use AI Platform Notebooks to run the classification model with pandas library.
D. Use AI Platform to run the classification model job configured for hyperparameter tuning.

A

A

Because of the no-code requirement, AutoML Tables covers all of the listed steps without writing code.

Choice B is wrong because BigQuery ML requires writing SQL code.

8
Q

You work for a public transportation company and need to build a model to estimate delay times for multiple transportation routes. Predictions are served directly to users in an app in real time. Because different seasons and population increases impact the data relevance, you will retrain the model every month. You want to follow Google-recommended best practices. How should you configure the end-to-end architecture of the predictive model?
A. Configure Kubeflow Pipelines to schedule your multi-step workflow from training to deploying your model.
B. Use a model trained and deployed on BigQuery ML, and trigger retraining with the scheduled query feature in BigQuery.
C. Write a Cloud Functions script that launches a training and deploying job on AI Platform that is triggered by Cloud Scheduler.
D. Use Cloud Composer to programmatically schedule a Dataflow job that executes the workflow from training to deploying your model.

A

A
Because Kubeflow Pipelines is a complete, Google-recommended solution for scheduling the multi-step workflow from training to deployment.

Choice B is wrong because BigQuery ML is geared toward batch prediction and would need additional infrastructure for real-time serving.
Choice C is wrong because Cloud Scheduler plus Cloud Functions is not robust enough to orchestrate the end-to-end pipeline.
Choice D is wrong because Dataflow is for data pipelines only.

9
Q

You are developing ML models with AI Platform for image segmentation on CT scans. You frequently update your model architectures based on the newest available research papers, and have to rerun training on the same dataset to benchmark their performance. You want to minimize computation costs and manual intervention while having version control for your code. What should you do?
A. Use Cloud Functions to identify changes to your code in Cloud Storage and trigger a retraining job.
B. Use the gcloud command-line tool to submit training jobs on AI Platform when you update your code.
C. Use Cloud Build linked with Cloud Source Repositories to trigger retraining when new code is pushed to the repository.
D. Create an automated workflow in Cloud Composer that runs daily and looks for changes in code in Cloud Storage using a sensor.

A

C

Because Cloud Build, linked with Cloud Source Repositories, can be triggered directly by a Git push/commit, which provides version control and automated retraining with minimal manual intervention.

10
Q

Your team needs to build a model that predicts whether images contain a driver's license, passport, or credit card. The data engineering team already built the pipeline and generated a dataset composed of 10,000 images with driver's licenses, 1,000 images with passports, and 1,000 images with credit cards. You now have to train a model with the following label map: ['drivers_license', 'passport', 'credit_card']. Which loss function should you use?
A. Categorical hinge
B. Binary cross-entropy
C. Categorical cross-entropy
D. Sparse categorical cross-entropy

A

C

Because categorical cross-entropy is used when each sample belongs to exactly one class and the labels are one-hot encoded (or soft probabilities such as [0.5, 0.3, 0.2]), which matches the given label map.

Choice B is wrong because there are more than two categories.

Choice D is wrong because sparse categorical cross-entropy expects integer class indices rather than one-hot label vectors.
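
A minimal Keras sketch (with illustrative values) showing the distinction: categorical cross-entropy takes one-hot label vectors, while sparse categorical cross-entropy takes integer class indices; both give the same loss here because each sample belongs to exactly one class:

import tensorflow as tf

logits = tf.constant([[2.0, 0.5, 0.1], [0.3, 1.8, 0.2]])
one_hot_labels = tf.constant([[1., 0., 0.], [0., 1., 0.]])   # drivers_license, passport as one-hot vectors
integer_labels = tf.constant([0, 1])                         # the same labels as class indices

cce = tf.keras.losses.CategoricalCrossentropy(from_logits=True)
scce = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
print(cce(one_hot_labels, logits).numpy(), scce(integer_labels, logits).numpy())  # identical values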

11
Q

You are designing an ML recommendation model for shoppers on your company's ecommerce website. You will use Recommendations AI to build, test, and deploy your system. How should you develop recommendations that increase revenue while following best practices?
A. Use the Other Products You May Like recommendation type to increase the click-through rate.
B. Use the Frequently Bought Together recommendation type to increase the shopping cart size for each order.
C. Import your user events and then your product catalog to make sure you have the highest quality event stream.
D. Because it will take time to collect and record product data, use placeholder values for the product catalog to test the viability of the model.

A

B

Because the Frequently Bought Together recommendation type increases the shopping cart size for each order, which directly increases revenue.
https://cloud.google.com/retail/docs/models#fbt

Choice A is wrong because increasing the click-through rate does not directly increase revenue per order.

12
Q

You are designing an architecture with a serverless ML system to enrich customer support tickets with informative metadata before they are routed to a support agent. You need a set of models to predict ticket priority, predict ticket resolution time, and perform sentiment analysis to help agents make strategic decisions when they process support requests. Tickets are not expected to have any domain-specific terms or jargon.
The proposed architecture has the following flow:

Which endpoints should the Enrichment Cloud Functions call?
A. 1 = AI Platform, 2 = AI Platform, 3 = AutoML Vision
B. 1 = AI Platform, 2 = AI Platform, 3 = AutoML Natural Language
C. 1 = AI Platform, 2 = AI Platform, 3 = Cloud Natural Language API
D. 1 = Cloud Natural Language API, 2 = AI Platform, 3 = Cloud Vision API

A

C

Since the tickets don't contain any domain-specific terms, the pretrained Cloud Natural Language API can perform the sentiment analysis, while AI Platform serves the custom priority and resolution-time models.

13
Q

You have trained a deep neural network model on Google Cloud. The model has low loss on the training data, but is performing worse on the validation data. You want the model to be resilient to overfitting. Which strategy should you use when retraining the model?
A. Apply a dropout parameter of 0.2, and decrease the learning rate by a factor of 10.
B. Apply a L2 regularization parameter of 0.4, and decrease the learning rate by a factor of 10.
C. Run a hyperparameter tuning job on AI Platform to optimize for the L2 regularization and dropout parameters.
D. Run a hyperparameter tuning job on AI Platform to optimize for the learning rate, and increase the number of neurons by a factor of 2.

A

C

Choices A and B are wrong because those regularization and learning-rate values are chosen arbitrarily.

Choice D is wrong because optimizing the learning rate and adding neurons does not address overfitting.
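
A minimal sketch of how the trainer could expose the dropout and L2 parameters as command-line flags for the hyperparameter tuning job to search over (the flag names and layer sizes are hypothetical):

import argparse
import tensorflow as tf

parser = argparse.ArgumentParser()
parser.add_argument("--dropout_rate", type=float, default=0.2)  # supplied by the tuning service
parser.add_argument("--l2", type=float, default=1e-4)           # supplied by the tuning service
args = parser.parse_args()

reg = tf.keras.regularizers.l2(args.l2)
model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu", kernel_regularizer=reg),
    tf.keras.layers.Dropout(args.dropout_rate),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])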

14
Q

You built and manage a production system that is responsible for predicting sales numbers. Model accuracy is crucial, because the production model is required to keep up with market changes. Since being deployed to production, the model hasn’t changed; however the accuracy of the model has steadily deteriorated.
What issue is most likely causing the steady decline in model accuracy?
A. Poor data quality
B. Lack of model retraining
C. Too few layers in the model for capturing information
D. Incorrect data split ratio during model training, evaluation, validation, and test

A

B

Choices A, C, and D are wrong because they would affect model quality from the start rather than cause a steady decline; the deterioration is due to data drift, which retraining addresses.

15
Q

You have been asked to develop an input pipeline for an ML training model that processes images from disparate sources at a low latency. You discover that your input data does not fit in memory. How should you create a dataset following Google-recommended best practices?
A. Create a tf.data.Dataset.prefetch transformation.
B. Convert the images to tf.Tensor objects, and then run Dataset.from_tensor_slices().
C. Convert the images to tf.Tensor objects, and then run tf.data.Dataset.from_tensors().
D. Convert the images into TFRecords, store the images in Cloud Storage, and then use the tf.data API to read the images for training.

A

D

Because converting the images to TFRecords in Cloud Storage and reading them with the tf.data API handles data that doesn't fit into memory, which is the Google-recommended practice.

Choice A is wrong because prefetch only speeds up loading of data that can already be read.

Choices B and C are wrong because they still load the entire input data into memory.
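
A minimal tf.data sketch of reading TFRecords from Cloud Storage (the bucket path, feature spec, and image size are illustrative assumptions):

import tensorflow as tf

feature_spec = {
    "image_raw": tf.io.FixedLenFeature([], tf.string),
    "label": tf.io.FixedLenFeature([], tf.int64),
}

def parse_example(serialized):
    parsed = tf.io.parse_single_example(serialized, feature_spec)
    image = tf.io.decode_jpeg(parsed["image_raw"], channels=3)  # assumes JPEG-encoded images
    image = tf.image.resize(image, [224, 224])
    return image, parsed["label"]

files = tf.data.Dataset.list_files("gs://my-bucket/images/*.tfrecord")  # hypothetical bucket
dataset = (tf.data.TFRecordDataset(files)
           .map(parse_example, num_parallel_calls=tf.data.AUTOTUNE)
           .batch(32)
           .prefetch(tf.data.AUTOTUNE))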

16
Q

You are an ML engineer at a large grocery retailer with stores in multiple regions. You have been asked to create an inventory prediction model. Your model’s features include region, location, historical demand, and seasonal popularity. You want the algorithm to learn from new inventory data on a daily basis. Which algorithms should you use to build the model?
A. Classification
B. Reinforcement Learning
C. Recurrent Neural Networks (RNN)
D. Convolutional Neural Networks (CNN)

A

C

Because RNNs are designed to interpret temporal or sequential information, which suits learning from new inventory data on a daily basis.

Choice B is wrong because reinforcement learning is for an agent learning through actions within an environment.

Choice D is wrong because CNNs use convolutional filters to transform data and are suited to computer vision and pattern recognition.

17
Q

You are building a real-time prediction engine that streams files which may contain Personally Identifiable Information (PII) to Google Cloud. You want to use the Cloud Data Loss Prevention (DLP) API to scan the files. How should you ensure that the PII is not accessible by unauthorized individuals?
A. Stream all files to Google Cloud, and then write the data to BigQuery. Periodically conduct a bulk scan of the table using the DLP API.
B. Stream all files to Google Cloud, and write batches of the data to BigQuery. While the data is being written to BigQuery, conduct a bulk scan of the data using the DLP API.
C. Create two buckets of data: Sensitive and Non-sensitive. Write all data to the Non-sensitive bucket. Periodically conduct a bulk scan of that bucket using the DLP API, and move the sensitive data to the Sensitive bucket.
D. Create three buckets of data: Quarantine, Sensitive, and Non-sensitive. Write all data to the Quarantine bucket. Periodically conduct a bulk scan of that bucket using the DLP API, and move the data to either the Sensitive or Non-Sensitive bucket.

A

D

Choices A, B, and C are wrong because, between bulk scans, unauthorized individuals could still access PII that has not yet been classified; the Quarantine bucket prevents that.

18
Q

You work for a large hotel chain and have been asked to assist the marketing team in gathering predictions for a targeted marketing strategy. You need to make predictions about user lifetime value (LTV) over the next 20 days so that marketing can be adjusted accordingly. The customer dataset is in BigQuery, and you are preparing the tabular data for training with AutoML Tables. This data has a time signal that is spread across multiple columns. How should you ensure that AutoML fits the best model to your data?
A. Manually combine all columns that contain a time signal into an array. Allow AutoML to interpret this array appropriately. Choose an automatic data split across the training, validation, and testing sets.
B. Submit the data for training without performing any manual transformations. Allow AutoML to handle the appropriate transformations. Choose an automatic data split across the training, validation, and testing sets.
C. Submit the data for training without performing any manual transformations, and indicate an appropriate column as the Time column. Allow AutoML to split your data based on the time signal provided, and reserve the more recent data for the validation and testing sets.
D. Submit the data for training without performing any manual transformations. Use the columns that have a time signal to manually split your data. Ensure that the data in your validation set is from 30 days after the data in your training set and that the data in your testing sets from 30 days after your validation set.

A

D

Because the time signal is spread across multiple columns, AutoML cannot rely on a single Time column, so a manual, time-based split is required.

Choice A is incorrect because AutoML cannot reliably interpret a manually combined array, and an automatic split would ignore the 20-day LTV prediction horizon.

19
Q

You have written unit tests for a Kubeflow Pipeline that require custom libraries. You want to automate the execution of unit tests with each new push to your development branch in Cloud Source Repositories. What should you do?
A. Write a script that sequentially performs the push to your development branch and executes the unit tests on Cloud Run.
B. Using Cloud Build, set an automated trigger to execute the unit tests when changes are pushed to your development branch.
C. Set up a Cloud Logging sink to a Pub/Sub topic that captures interactions with Cloud Source Repositories. Configure a Pub/Sub trigger for Cloud Run, and execute the unit tests on Cloud Run.
D. Set up a Cloud Logging sink to a Pub/Sub topic that captures interactions with Cloud Source Repositories. Execute the unit tests using a Cloud Function that is triggered when messages are sent to the Pub/Sub topic.

A

B

Because Cloud Build triggers can run the unit tests automatically on each push to the development branch in Cloud Source Repositories, and the custom libraries can be installed in the build steps.

20
Q

You are training an LSTM-based model on AI Platform to summarize text using the following job submission script:

gcloud ai-platform jobs submit training $JOB_NAME \
--package-path $TRAINER_PACKAGE_PATH \
--module-name $MAIN_TRAINER_MODULE \
--job-dir $JOB_DIR \
--region $REGION \
--scale-tier basic \
-- \
--epochs 20 \
--batch_size=32 \
--learning_rate=0.001
You want to ensure that training time is minimized without significantly compromising the accuracy of your model. What should you do?
A. Modify the 'epochs' parameter.
B. Modify the 'scale-tier' parameter.
C. Modify the 'batch size' parameter.
D. Modify the 'learning rate' parameter.

A

B

Because scale-tier determines the underlying compute resources for the training job; using a higher tier reduces training time without compromising accuracy.
https://cloud.google.com/ai-platform/training/docs/reference/rest/v1/projects.jobs#scaletier

Choice A is wrong. One epoch is one complete pass over the training data. More epochs increase accuracy (up to a point) but also increase training time.

Choice C is wrong. Batch size is the number of training samples processed before the model parameters are updated. Increasing the batch size can reduce training time but could impact accuracy.

Choice D is wrong for the same reason: changing the learning rate could impact accuracy.

21
Q

You have deployed multiple versions of an image classification model on AI Platform. You want to monitor the performance of the model versions over time. How should you perform this comparison?
A. Compare the loss performance for each model on a held-out dataset.
B. Compare the loss performance for each model on the validation data.
C. Compare the receiver operating characteristic (ROC) curve for each model using the What-If Tool.
D. Compare the mean average precision across the models using the Continuous Evaluation feature.

A

D

https://cloud.google.com/ai-platform/prediction/docs/continuous-evaluation

Choices A, B, and C evaluate the model during development only; Continuous Evaluation compares deployed model versions over time.

22
Q

You trained a text classification model. You have the following SignatureDefs:

inputs[‘text’] shape (-1, 2)

You started a TensorFlow Serving component server and tried to send an HTTP request to get a prediction using: headers = {"content-type": "application/json"} json_response = requests.post('http://localhost:8501/v1/models/text_model:predict', data=data, headers=headers)
What is the correct way to write the predict request?
A. data = json.dumps({'signature_name': 'seving_default', 'instances': [['ab', 'bc', 'cd']]})
B. data = json.dumps({'signature_name': 'serving_default', 'instances': [['a', 'b', 'c', 'd', 'e', 'f']]})
C. data = json.dumps({'signature_name': 'serving_default', 'instances': [['a', 'b', 'c'], ['d', 'e', 'f']]})
D. data = json.dumps({'signature_name': 'serving_default', 'instances': [['a', 'b'], ['c', 'd'], ['e', 'f']]})

A

D

With input shape (-1, 2):
-1 means the outer (batch) dimension can expand to any length;
2 means each inner array must have length 2.
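
A minimal sketch of the full predict call with the correctly shaped payload (the endpoint and values are taken from the question):

import json
import requests

data = json.dumps({
    "signature_name": "serving_default",
    "instances": [["a", "b"], ["c", "d"], ["e", "f"]],  # every inner array has length 2
})
headers = {"content-type": "application/json"}
json_response = requests.post(
    "http://localhost:8501/v1/models/text_model:predict",
    data=data, headers=headers)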

23
Q

Your organization’s call center has asked you to develop a model that analyzes customer sentiments in each call. The call center receives over one million calls daily, and data is stored in Cloud Storage. The data collected must not leave the region in which the call originated, and no Personally Identifiable Information (PII) can be stored or analyzed. The data science team has a third-party tool for visualization and access which requires a SQL ANSI-2011 compliant interface. You need to select components for data processing and for analytics. How should the data pipeline be designed?

A. 1= Dataflow, 2= BigQuery
B. 1 = Pub/Sub, 2= Datastore
C. 1 = Dataflow, 2 = Cloud SQL
D. 1 = Cloud Function, 2= Cloud SQL

A

A

Dataflow for data processing
BigQuery for analytics, which also satisfies the SQL ANSI-2011 compliant interface requirement.

24
Q

You are an ML engineer at a global shoe store. You manage the ML models for the company’s website. You are asked to build a model that will recommend new products to the user based on their purchase behavior and similarity with other users. What should you do?
A. Build a classification model
B. Build a knowledge-based filtering model
C. Build a collaborative-based filtering model
D. Build a regression model using the features as predictors

A

C

https://developers.google.com/machine-learning/recommendation/overview/candidate-generation
Collaborative filtering uses similarities between queries (users) and items simultaneously to provide recommendations, which matches "purchase behavior and similarity with other users."

Choice B is incorrect because knowledge-based and content-based filtering use similarity between items to recommend items similar to what the user already likes, without drawing on other users' behavior.

25
Q

You work for a social media company. You need to detect whether posted images contain cars. Each training example is a member of exactly one class. You have trained an object detection neural network and deployed the model version to AI Platform Prediction for evaluation. Before deployment, you created an evaluation job and attached it to the AI Platform Prediction model version. You notice that the precision is lower than your business requirements allow. How should you adjust the model’s final layer softmax threshold to increase precision?
A. Increase the recall.
B. Decrease the recall.
C. Increase the number of false positives.
D. Decrease the number of false negatives.

A

B

Because precision and recall are in tension: raising the softmax threshold decreases recall and increases precision.
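
A small worked example (with made-up scores) showing the trade-off: raising the decision threshold raises precision and lowers recall:

import numpy as np
from sklearn.metrics import precision_score, recall_score

y_true = np.array([0, 0, 1, 1, 0, 1, 0, 1])
scores = np.array([0.2, 0.4, 0.45, 0.6, 0.55, 0.8, 0.3, 0.5])

for threshold in (0.4, 0.6):
    y_pred = (scores >= threshold).astype(int)
    print(threshold, precision_score(y_true, y_pred), recall_score(y_true, y_pred))
# threshold 0.4 -> precision 0.67, recall 1.0; threshold 0.6 -> precision 1.0, recall 0.5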

26
Q

You are responsible for building a unified analytics environment across a variety of on-premises data marts. Your company is experiencing data quality and security challenges when integrating data across the servers, caused by the use of a wide range of disconnected tools and temporary solutions. You need a fully managed, cloud-native data integration service that will lower the total cost of work and reduce repetitive work. Some members on your team prefer a codeless interface for building Extract, Transform, Load (ETL) process. Which service should you use?
A. Dataflow
B. Dataprep
C. Apache Flink
D. Cloud Data Fusion

A

D

https://cloud.google.com/data-fusion?hl=en
Cloud Data Fusion is a fully managed, codeless ETL platform.

27
Q

You are an ML engineer at a regulated insurance company. You are asked to develop an insurance approval model that accepts or rejects insurance applications from potential customers. What factors should you consider before building the model?
A. Redaction, reproducibility, and explainability
B. Traceability, reproducibility, and explainability
C. Federated learning, reproducibility, and explainability
D. Differential privacy, federated learning, and explainability

A

B

Traceability, reproducibility, and explainability are required for audit purposes in a regulated insurance company.

28
Q

You are training a Resnet model on AI Platform using TPUs to visually categorize types of defects in automobile engines. You capture the training profile using the Cloud TPU profiler plugin and observe that it is highly input-bound. You want to reduce the bottleneck and speed up your model training process. Which modifications should you make to the tf.data dataset? (Choose two.)
A. Use the interleave option for reading data.
B. Reduce the value of the repeat parameter.
C. Increase the buffer size for the shuffle option.
D. Set the prefetch option equal to the training batch size.
E. Decrease the batch size argument in your transformation.

A

A and D

The interleave and prefetch options parallelize reads and overlap input with training, which reduces the input bottleneck and improves training time.

Choice E is wrong because decreasing the batch size would increase training time.
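
A minimal tf.data sketch applying both chosen modifications (the file pattern and batch size are illustrative; prefetch is set to the batch size as choice D suggests):

import tensorflow as tf

BATCH_SIZE = 128
files = tf.data.Dataset.list_files("gs://my-bucket/defects/*.tfrecord")  # hypothetical bucket
dataset = (files
           .interleave(tf.data.TFRecordDataset,
                       num_parallel_calls=tf.data.AUTOTUNE)  # choice A: interleaved, parallel reads
           .batch(BATCH_SIZE)
           .prefetch(BATCH_SIZE))                            # choice D: prefetch equal to the batch size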

29
Q

You have trained a model on a dataset that required computationally expensive preprocessing operations. You need to execute the same preprocessing at prediction time. You deployed the model on AI Platform for high-throughput online prediction. Which architecture should you use?
A. Validate the accuracy of the model that you trained on preprocessed data. Create a new model that uses the raw data and is available in real time. Deploy the new model onto AI Platform for online prediction.
B. Send incoming prediction requests to a Pub/Sub topic. Transform the incoming data using a Dataflow job. Submit a prediction request to AI Platform using the transformed data. Write the predictions to an outbound Pub/Sub queue.
C. Stream incoming prediction request data into Cloud Spanner. Create a view to abstract your preprocessing logic. Query the view every second for new records. Submit a prediction request to AI Platform using the transformed data. Write the predictions to an outbound Pub/Sub queue.
D. Send incoming prediction requests to a Pub/Sub topic. Set up a Cloud Function that is triggered when messages are published to the Pub/Sub topic. Implement your preprocessing logic in the Cloud Function. Submit a prediction request to AI Platform using the transformed data. Write the predictions to an outbound Pub/Sub queue.

A

B

Because Dataflow is designed for scalable data transformation and can apply the same expensive preprocessing at prediction time.

Choice A is wrong because the preprocessing is expensive and the model was trained on preprocessed data.
Choice D is wrong because running the preprocessing in a Cloud Function is no better than running it inside the model service itself.

30
Q

Your team trained and tested a DNN regression model with good results. Six months after deployment, the model is performing poorly due to a change in the distribution of the input data. How should you address the input differences in production?
A. Create alerts to monitor for skew, and retrain the model.
B. Perform feature selection on the model, and retrain the model with fewer features.
C. Retrain the model, and select an L2 regularization parameter with a hyperparameter tuning service.
D. Perform feature selection on the model, and retrain the model on a monthly basis with fewer features.

A

A

Choices B, C, and D are wrong because they cannot anticipate future changes in the input data distribution; monitoring for skew and retraining addresses the drift.

31
Q

You need to train a computer vision model that predicts the type of government ID present in a given image using a GPU-powered virtual machine on Compute Engine. You use the following parameters:
- Optimizer: SGD
- Image shape = 224×224
- Batch size = 64
- Epochs = 10
- Verbose = 2
During training you encounter the following error: ResourceExhaustedError: Out Of Memory (OOM) when allocating tensor. What should you do?
A. Change the optimizer.
B. Reduce the batch size.
C. Change the learning rate.
D. Reduce the image shape.

A

B

Reducing the batch size reduces the memory required per training step, which resolves the OOM error.

32
Q

You developed an ML model with AI Platform, and you want to move it to production. You serve a few thousand queries per second and are experiencing latency issues. Incoming requests are served by a load balancer that distributes them across multiple Kubeflow CPU-only pods running on Google Kubernetes Engine (GKE). Your goal is to improve the serving latency without changing the underlying infrastructure. What should you do?
A. Significantly increase the max_batch_size TensorFlow Serving parameter.
B. Switch to the tensorflow-model-server-universal version of TensorFlow Serving.
C. Significantly increase the max_enqueued_batches TensorFlow Serving parameter.
D. Recompile TensorFlow Serving using the source to support CPU-specific optimizations. Instruct GKE to choose an appropriate baseline minimum CPU platform for serving nodes.

A

A

https://github.com/tensorflow/serving/blob/master/tensorflow_serving/batching/README.md#cpu-only-one-approach

If your system is CPU-only (no GPU), then consider starting with the following values: num_batch_threads equal to the number of CPU cores; max_batch_size to a really high value; batch_timeout_micros to 0. Then experiment with batch_timeout_micros values in the 1-10 millisecond (1000-10000 microsecond) range, while keeping in mind that 0 may be the optimal value.