
Professional-Machine-Learning-Engineer Google Professional Machine Learning Engineer Questions and Answers

Questions 4

You work for a magazine distributor and need to build a model that predicts which customers will renew their subscriptions for the upcoming year. Using your company’s historical data as your training set, you created a TensorFlow model and deployed it to AI Platform. You need to determine which customer attribute has the most predictive power for each prediction served by the model. What should you do?

Options:

A.

Use AI Platform notebooks to perform a Lasso regression analysis on your model, which will eliminate features that do not provide a strong signal.

B.

Stream prediction results to BigQuery. Use BigQuery’s CORR(X1, X2) function to calculate the Pearson correlation coefficient between each feature and the target variable.

C.

Use the AI Explanations feature on AI Platform. Submit each prediction request with the ‘explain’ keyword to retrieve feature attributions using the sampled Shapley method.

D.

Use the What-If tool in Google Cloud to determine how your model will perform when individual features are excluded. Rank the feature importance in order of those that caused the most significant performance drop when removed from the model.
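
Option C refers to feature attributions computed with the sampled Shapley method. As a rough illustration of the idea only (not AI Platform's actual implementation), the sketch below estimates per-prediction attributions by averaging each feature's marginal contribution over randomly sampled feature orderings. The model `toy_predict`, its weights, and the all-zero baseline are hypothetical.

```python
import random

def sampled_shapley(predict, instance, baseline, num_samples=200, seed=0):
    """Estimate per-feature attributions by averaging marginal contributions
    over randomly sampled feature orderings."""
    rng = random.Random(seed)
    n = len(instance)
    attributions = [0.0] * n
    for _ in range(num_samples):
        order = list(range(n))
        rng.shuffle(order)
        current = list(baseline)          # start from the baseline input
        previous_output = predict(current)
        for i in order:
            current[i] = instance[i]      # switch feature i to its real value
            output = predict(current)
            attributions[i] += output - previous_output
            previous_output = output
    return [a / num_samples for a in attributions]

# Hypothetical renewal-score model: a weighted sum of three customer attributes.
WEIGHTS = [0.5, -0.2, 0.1]

def toy_predict(features):
    return sum(w * x for w, x in zip(WEIGHTS, features))

attributions = sampled_shapley(
    toy_predict, instance=[2.0, 1.0, 4.0], baseline=[0.0, 0.0, 0.0]
)
# Index of the attribute with the largest absolute attribution for this prediction.
top_feature = max(range(len(attributions)), key=lambda i: abs(attributions[i]))
```

For an additive model like this one, every ordering yields the same marginal contributions, so the estimate matches weight times (value minus baseline) for each feature; for non-additive models the sampling only approximates the exact Shapley values.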

Questions 5

You have deployed multiple versions of an image classification model on AI Platform. You want to monitor the performance of the model versions over time. How should you perform this comparison?

Options:

A.

Compare the loss performance for each model on a held-out dataset.

B.

Compare the loss performance for each model on the validation data

C.

Compare the receiver operating characteristic (ROC) curve for each model using the What-If Tool

D.

Compare the mean average precision across the models using the Continuous Evaluation feature

Questions 6

You work for a food product company. Your company's historical sales data is stored in BigQuery. You need to use Vertex AI's custom training service to train multiple TensorFlow models that read the data from BigQuery and predict future sales. You plan to implement a data preprocessing algorithm that performs min-max scaling and bucketing on a large number of features before you start experimenting with the models. You want to minimize preprocessing time, cost, and development effort. How should you configure this workflow?

Options:

A.

Write the transformations in Spark using the spark-bigquery-connector, and use Dataproc to preprocess the data.

B.

Write SQL queries to transform the data in-place in BigQuery.

C.

Add the transformations as a preprocessing layer in the TensorFlow models.

D.

Create a Dataflow pipeline that uses the BigQueryIO connector to ingest the data, process it, and write it back to BigQuery.
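
The two transformations named in this question, min-max scaling and bucketing, can be sketched in a few lines regardless of where they end up running. The following is a minimal plain-Python illustration, not tied to any of the services in the options; the sample values and bucket boundaries are made up.

```python
from bisect import bisect_right

def min_max_scale(values):
    """Rescale values linearly to the [0, 1] range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]   # constant column: map everything to 0
    return [(v - lo) / (hi - lo) for v in values]

def bucketize(values, boundaries):
    """Map each value to the index of the bucket it falls in.
    `boundaries` must be sorted; bucket i covers [boundaries[i-1], boundaries[i])."""
    return [bisect_right(boundaries, v) for v in values]

weekly_sales = [120.0, 80.0, 200.0, 160.0]   # made-up sample values
scaled = min_max_scale(weekly_sales)
buckets = bucketize(weekly_sales, boundaries=[100.0, 150.0])
```

Here the smallest value maps to 0.0 and the largest to 1.0, and the two boundaries produce three buckets with indices 0, 1, and 2.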

Questions 7

You recently deployed a scikit-learn model to a Vertex AI endpoint. You are now testing the model on live production traffic. While monitoring the endpoint, you discover twice as many requests per hour as expected throughout the day. You want the endpoint to efficiently scale when the demand increases in the future to prevent users from experiencing high latency. What should you do?

Options:

A.

Deploy two models to the same endpoint and distribute requests among them evenly.

B.

Configure an appropriate minReplicaCount value based on expected baseline traffic.

C.

Set the target utilization percentage in the autoscalingMetricSpecs configuration to a higher value.

D.

Change the model's machine type to one that utilizes GPUs.

Questions 8

Your team has been tasked with creating an ML solution in Google Cloud to classify support requests for one of your platforms. You analyzed the requirements and decided to use TensorFlow to build the classifier so that you have full control of the model's code, serving, and deployment. You will use Kubeflow pipelines for the ML platform. To save time, you want to build on existing resources and use managed services instead of building a completely new model. How should you build the classifier?

Options:

A.

Use the Natural Language API to classify support requests

B.

Use AutoML Natural Language to build the support requests classifier

C.

Use an established text classification model on AI Platform to perform transfer learning

D.

Use an established text classification model on AI Platform as-is to classify support requests

Questions 9

You need to quickly build and train a model to predict the sentiment of customer reviews with custom categories without writing code. You do not have enough data to train a model from scratch. The resulting model should have high predictive performance. Which service should you use?

Options:

A.

AutoML Natural Language

B.

Cloud Natural Language API

C.

AI Hub pre-made Jupyter Notebooks

D.

AI Platform Training built-in algorithms

Questions 10

You are working on a binary classification ML algorithm that detects whether an image of a classified scanned document contains a company’s logo. In the dataset, 96% of examples don’t have the logo, so the dataset is very skewed. Which metrics would give you the most confidence in your model?

Options:

A.

F-score where recall is weighed more than precision

B.

RMSE

C.

F1 score

D.

F-score where precision is weighed more than recall
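
The options above differ only in how the F-score weighs precision against recall. A small sketch of the general F-beta formula makes the difference concrete; the precision and recall values are made up for illustration.

```python
def f_beta(precision, recall, beta=1.0):
    """F-beta score: beta > 1 weighs recall more, beta < 1 weighs precision more."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

precision, recall = 0.9, 0.5            # hypothetical classifier results
f1 = f_beta(precision, recall)          # balanced F1 score
f2 = f_beta(precision, recall, beta=2.0)     # recall weighed more than precision
f_half = f_beta(precision, recall, beta=0.5) # precision weighed more than recall
```

With recall lower than precision, the recall-weighted score (beta = 2) is the lowest of the three and the precision-weighted score (beta = 0.5) is the highest, so the choice of beta decides which kind of error the metric penalizes more.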

Questions 11

You have been asked to productionize a proof-of-concept ML model built using Keras. The model was trained in a Jupyter notebook on a data scientist’s local machine. The notebook contains a cell that performs data validation and a cell that performs model analysis. You need to orchestrate the steps contained in the notebook and automate the execution of these steps for weekly retraining. You expect much more training data in the future. You want your solution to take advantage of managed services while minimizing cost. What should you do?

Options:

A.

Move the Jupyter notebook to a Notebooks instance on the largest N2 machine type, and schedule the execution of the steps in the Notebooks instance using Cloud Scheduler.

B.

Write the code as a TensorFlow Extended (TFX) pipeline orchestrated with Vertex AI Pipelines. Use standard TFX components for data validation and model analysis, and use Vertex AI Pipelines for model retraining.

C.

Rewrite the steps in the Jupyter notebook as an Apache Spark job, and schedule the execution of the job on ephemeral Dataproc clusters using Cloud Scheduler.

D.

Extract the steps contained in the Jupyter notebook as Python scripts, wrap each script in an Apache Airflow BashOperator, and run the resulting directed acyclic graph (DAG) in Cloud Composer.

Questions 12

You work for a rapidly growing social media company. Your team builds TensorFlow recommender models in an on-premises CPU cluster. The data contains billions of historical user events and 100,000 categorical features. You notice that as the data increases, the model training time increases. You plan to move the models to Google Cloud. You want to use the most scalable approach that also minimizes training time. What should you do?

Options:

A.

Deploy the training jobs by using TPU VMs with TPUv3 Pod slices, and use the TPUEmbedding API.

B.

Deploy the training jobs in an autoscaling Google Kubernetes Engine cluster with CPUs

C.

Deploy a matrix factorization model training job by using BigQuery ML.

D.

Deploy the training jobs by using Compute Engine instances with A100 GPUs, and use the tf.nn.embedding_lookup API.

Questions 13

You developed a Vertex AI ML pipeline that consists of preprocessing and training steps, and each set of steps runs on a separate custom Docker image. Your organization uses GitHub and GitHub Actions as CI/CD to run unit and integration tests. You need to automate the model retraining workflow so that it can be initiated both manually and when a new version of the code is merged in the main branch. You want to minimize the steps required to build the workflow while also allowing for maximum flexibility. How should you configure the CI/CD workflow?

Options:

A.

Trigger a Cloud Build workflow to run tests, build custom Docker images, push the images to Artifact Registry, and launch the pipeline in Vertex AI Pipelines.

B.

Trigger GitHub Actions to run the tests, launch a job on Cloud Run to build custom Docker images, push the images to Artifact Registry, and launch the pipeline in Vertex AI Pipelines.

C.

Trigger GitHub Actions to run the tests, build custom Docker images, push the images to Artifact Registry, and launch the pipeline in Vertex AI Pipelines.

D.

Trigger GitHub Actions to run the tests, launch a Cloud Build workflow to build custom Docker images, push the images to Artifact Registry, and launch the pipeline in Vertex AI Pipelines.

Questions 14

You have been given a dataset with sales predictions based on your company’s marketing activities. The data is structured and stored in BigQuery, and has been carefully managed by a team of data analysts. You need to prepare a report providing insights into the predictive capabilities of the data. You were asked to run several ML models with different levels of sophistication, including simple models and multilayered neural networks. You only have a few hours to gather the results of your experiments. Which Google Cloud tools should you use to complete this task in the most efficient and self-serviced way?

Options:

A.

Use BigQuery ML to run several regression models, and analyze their performance.

B.

Read the data from BigQuery using Dataproc, and run several models using SparkML.

C.

Use Vertex AI Workbench user-managed notebooks with scikit-learn code for a variety of ML algorithms and performance metrics.

D.

Train a custom TensorFlow model with Vertex AI, reading the data from BigQuery featuring a variety of ML algorithms.

Questions 15

You are building a linear regression model on BigQuery ML to predict a customer's likelihood of purchasing your company's products. Your model uses a city name variable as a key predictive component. In order to train and serve the model, your data must be organized in columns. You want to prepare your data using the least amount of coding while maintaining the predictable variables. What should you do?

Options:

A.

Create a new view with BigQuery that does not include a column with city information

B.

Use Dataprep to transform the state column using a one-hot encoding method, and make each city a column with binary values.

C.

Use Cloud Data Fusion to assign each city to a region labeled as 1, 2, 3, 4, or 5, and then use that number to represent the city in the model.

D.

Use TensorFlow to create a categorical variable with a vocabulary list. Create the vocabulary file, and upload it as part of your model to BigQuery ML.
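
Option B describes one-hot encoding, in which each distinct category becomes its own column of binary values. The following plain-Python sketch shows only the transformation itself, not how Dataprep performs it; the city names are made up.

```python
def one_hot_encode(values):
    """Return (categories, rows) where each row has a 1 in the column of its category."""
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1
        rows.append(row)
    return categories, rows

cities = ["Madrid", "Austin", "Madrid", "Osaka"]   # hypothetical city column
columns, encoded = one_hot_encode(cities)
```

Each row contains exactly one 1, so no artificial ordering is imposed on the cities, unlike the numeric region labels in option C.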

Questions 16

You are an ML engineer at a regulated insurance company. You are asked to develop an insurance approval model that accepts or rejects insurance applications from potential customers. What factors should you consider before building the model?

Options:

A.

Redaction, reproducibility, and explainability

B.

Traceability, reproducibility, and explainability

C.

Federated learning, reproducibility, and explainability

D.

Differential privacy, federated learning, and explainability

Questions 17

You work for an online publisher that delivers news articles to over 50 million readers. You have built an AI model that recommends content for the company’s weekly newsletter. A recommendation is considered successful if the article is opened within two days of the newsletter’s published date and the user remains on the page for at least one minute.

All the information needed to compute the success metric is available in BigQuery and is updated hourly. The model is trained on eight weeks of data, on average its performance degrades below the acceptable baseline after five weeks, and training time is 12 hours. You want to ensure that the model’s performance is above the acceptable baseline while minimizing cost. How should you monitor the model to determine when retraining is necessary?

Options:

A.

Use Vertex AI Model Monitoring to detect skew of the input features with a sample rate of 100% and a monitoring frequency of two days.

B.

Schedule a cron job in Cloud Tasks to retrain the model every week before the newsletter is created.

C.

Schedule a weekly query in BigQuery to compute the success metric.

D.

Schedule a daily Dataflow job in Cloud Composer to compute the success metric.
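
The success metric defined in this question (article opened within two days of publication, and at least one minute spent on the page) is simple enough to compute directly. The sketch below shows the logic in plain Python; the record layout `(published_at, opened_at, seconds_on_page)` and the sample rows are assumptions made for illustration, and in practice the same condition could be expressed as a query over the BigQuery data.

```python
from datetime import datetime, timedelta

def is_successful(published_at, opened_at, seconds_on_page):
    """A recommendation succeeds if it is opened within two days of publication
    and the reader stays on the page for at least one minute."""
    if opened_at is None:
        return False
    opened_in_time = timedelta(0) <= opened_at - published_at <= timedelta(days=2)
    return opened_in_time and seconds_on_page >= 60

def success_rate(recommendations):
    """Fraction of recommendations that meet the success definition."""
    if not recommendations:
        return 0.0
    hits = sum(is_successful(*rec) for rec in recommendations)
    return hits / len(recommendations)

published = datetime(2024, 1, 1, 8, 0)
sample = [
    (published, published + timedelta(hours=5), 90),   # success
    (published, published + timedelta(days=3), 120),   # opened too late
    (published, published + timedelta(hours=1), 20),   # left too quickly
    (published, None, 0),                              # never opened
]
rate = success_rate(sample)
```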

Questions 18

You are developing a model to detect fraudulent credit card transactions. You need to prioritize detection because missing even one fraudulent transaction could severely impact the credit card holder. You used AutoML to train a model on users' profile information and credit card transaction data. After training the initial model, you notice that the model is failing to detect many fraudulent transactions. How should you adjust the training parameters in AutoML to improve model performance?

Choose 2 answers

Options:

A.

Increase the score threshold.

B.

Decrease the score threshold.

C.

Add more positive examples to the training set.

D.

Add more negative examples to the training set.

E.

Reduce the maximum number of node hours for training.
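
Options A and B turn on how the score threshold trades precision against recall. The sketch below computes both metrics at two thresholds on a small made-up set of scores and labels (1 = fraudulent), to show the direction of the effect.

```python
def precision_recall(scores, labels, threshold):
    """Precision and recall when scores at or above the threshold count as positive."""
    predicted = [s >= threshold for s in scores]
    tp = sum(p and y == 1 for p, y in zip(predicted, labels))
    fp = sum(p and y == 0 for p, y in zip(predicted, labels))
    fn = sum((not p) and y == 1 for p, y in zip(predicted, labels))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

scores = [0.95, 0.80, 0.65, 0.40, 0.30, 0.10]   # hypothetical model scores
labels = [1, 1, 0, 1, 0, 0]                     # hypothetical ground truth
high = precision_recall(scores, labels, threshold=0.7)
low = precision_recall(scores, labels, threshold=0.35)
```

On this toy data, lowering the threshold from 0.7 to 0.35 catches the third fraudulent example (recall rises from 2/3 to 1.0) at the cost of one false positive (precision falls from 1.0 to 0.75).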

Questions 19

You work with a team of researchers to develop state-of-the-art algorithms for financial analysis. Your team develops and debugs complex models in TensorFlow. You want to maintain the ease of debugging while also reducing the model training time. How should you set up your training environment?

Options:

A.

Configure a v3-8 TPU VM.

B.

Configure a v3-8 TPU node.

C.

Configure a c2-standard-60 VM without GPUs.

D.

Configure an n1-standard-4 VM with 1 NVIDIA P100 GPU.

Questions 20

You work for a multinational organization that has recently begun operations in Spain. Teams within your organization will need to work with various Spanish documents, such as business, legal, and financial documents. You want to use machine learning to help your organization get accurate translations quickly and with the least effort. Your organization does not require domain-specific terms or jargon. What should you do?

Options:

A.

Create a Vertex AI Workbench notebook instance. In the notebook, convert the Spanish documents into plain text, and create a custom TensorFlow seq2seq translation model.

B.

Create a Vertex AI Workbench notebook instance. In the notebook, extract sentences from the documents, and train a custom AutoML text model.

C.

Use Google Translate to translate 1,000 phrases from Spanish to English. Using these translated pairs, train a custom AutoML Translation model.

D.

Use the Document Translation feature of the Cloud Translation API to translate the documents.

Questions 21

You work on the data science team at a manufacturing company. You are reviewing the company's historical sales data, which has hundreds of millions of records. For your exploratory data analysis, you need to calculate descriptive statistics such as mean, median, and mode; conduct complex statistical tests for hypothesis testing; and plot variations of the features over time. You want to use as much of the sales data as possible in your analyses while minimizing computational resources. What should you do?

Options:

A.

Spin up a Vertex AI Workbench user-managed notebooks instance and import the dataset. Use this data to create statistical and visual analyses.

B.

Visualize the time plots in Google Data Studio. Import the dataset into Vertex AI Workbench user-managed notebooks. Use this data to calculate the descriptive statistics and run the statistical analyses.

C.

Use BigQuery to calculate the descriptive statistics. Use Vertex AI Workbench user-managed notebooks to visualize the time plots and run the statistical analyses.

D.

Use BigQuery to calculate the descriptive statistics, and use Google Data Studio to visualize the time plots. Use Vertex AI Workbench user-managed notebooks to run the statistical analyses.

Questions 22

Your team trained and tested a DNN regression model with good results. Six months after deployment, the model is performing poorly due to a change in the distribution of the input data. How should you address the input differences in production?

Options:

A.

Create alerts to monitor for skew, and retrain the model.

B.

Perform feature selection on the model, and retrain the model with fewer features

C.

Retrain the model, and select an L2 regularization parameter with a hyperparameter tuning service

D.

Perform feature selection on the model, and retrain the model on a monthly basis with fewer features

Questions 23

You work for a hospital that wants to optimize how it schedules operations. You need to create a model that uses the relationship between the number of surgeries scheduled and beds used. You want to predict how many beds will be needed for patients each day in advance based on the scheduled surgeries. You have one year of data for the hospital organized in 365 rows.

The data includes the following variables for each day:

• Number of scheduled surgeries

• Number of beds occupied

• Date

You want to maximize the speed of model development and testing. What should you do?

Options:

A.

Create a BigQuery table. Use BigQuery ML to build a regression model, with number of beds as the target variable and number of scheduled surgeries and date features (such as day of week) as the predictors.

B.

Create a BigQuery table. Use BigQuery ML to build an ARIMA model, with number of beds as the target variable and date as the time variable.

C.

Create a Vertex AI tabular dataset. Train an AutoML regression model, with number of beds as the target variable and number of scheduled minor surgeries and date features (such as day of the week) as the predictors.

D.

Create a Vertex AI tabular dataset. Train a Vertex AI AutoML Forecasting model with number of beds as the target variable, number of scheduled surgeries as a covariate, and date as the time variable.

Questions 24

You are going to train a DNN regression model with Keras APIs using this code:

[Image: Keras model code for Question 24; the code is not reproduced in this text.]

How many trainable weights does your model have? (The arithmetic below is correct.)

Options:

A.

501*256+257*128+2 = 161154

B.

500*256+256*128+128*2 = 161024

C.

501*256+257*128+128*2=161408

D.

500*256*0.25+256*128*0.25+128*2 = 40448
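
The options above count weights layer by layer. Because the original code is not available here, the sketch below only shows how such counts are formed and checks the arithmetic of each option. The layer sizes 500, 256, 128, and 2 are an assumption read off the options, not taken from the missing code.

```python
def dense_params(inputs, units, use_bias=True):
    """Trainable weights of one fully connected layer: a kernel plus an optional bias."""
    return inputs * units + (units if use_bias else 0)

def count_params(layer_sizes, use_bias=True):
    """Sum the weights over consecutive layers, e.g. [500, 256, 128, 2]."""
    return sum(
        dense_params(i, o, use_bias) for i, o in zip(layer_sizes, layer_sizes[1:])
    )

# Arithmetic of the four options as written.
option_a = 501 * 256 + 257 * 128 + 2
option_b = 500 * 256 + 256 * 128 + 128 * 2
option_c = 501 * 256 + 257 * 128 + 128 * 2
option_d = int(500 * 256 * 0.25 + 256 * 128 * 0.25 + 128 * 2)

# Assumed layer sizes, counted without bias terms.
without_bias = count_params([500, 256, 128, 2], use_bias=False)
```

A term such as 501*256 is a 500-input, 256-unit layer with its bias included (500*256 + 256), while 500*256 is the same layer without bias. Dropout layers, which the 0.25 factors in option D presumably allude to, hold no trainable weights of their own.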

Questions 25

You have developed a BigQuery ML model that predicts customer churn and deployed the model to Vertex AI Endpoints. You want to automate the retraining of your model by using minimal additional code when model feature values change. You also want to minimize the number of times that your model is retrained to reduce training costs. What should you do?

Options:

A.

1. Enable request-response logging on Vertex AI Endpoints.

2. Schedule a TensorFlow Data Validation job to monitor prediction drift.

3. Execute model retraining if there is significant distance between the distributions.

B.

1. Enable request-response logging on Vertex AI Endpoints.

2. Schedule a TensorFlow Data Validation job to monitor training/serving skew.

3. Execute model retraining if there is significant distance between the distributions.

C.

1. Create a Vertex AI Model Monitoring job configured to monitor prediction drift.

2. Configure alert monitoring to publish a message to a Pub/Sub queue when a monitoring alert is detected.

3. Use a Cloud Function to monitor the Pub/Sub queue, and trigger retraining in BigQuery.

D.

1. Create a Vertex AI Model Monitoring job configured to monitor training/serving skew.

2. Configure alert monitoring to publish a message to a Pub/Sub queue when a monitoring alert is detected.

3. Use a Cloud Function to monitor the Pub/Sub queue, and trigger retraining in BigQuery.
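
Several options above hinge on measuring a "significant distance between the distributions" of feature values. One common choice of distance is the Jensen-Shannon divergence; the question does not say which measure any particular service uses, so the sketch below is only an illustration of the idea. The histograms and the alerting threshold are hypothetical.

```python
import math

def normalize(counts):
    """Turn histogram counts into a probability distribution."""
    total = sum(counts)
    return [c / total for c in counts]

def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i = 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def jensen_shannon(p_counts, q_counts):
    """Jensen-Shannon divergence (base 2): 0 for identical, 1 for disjoint distributions."""
    p, q = normalize(p_counts), normalize(q_counts)
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

training_hist = [40, 30, 20, 10]   # made-up feature histogram at training time
serving_hist = [10, 20, 30, 40]    # made-up histogram of recent serving traffic
DRIFT_THRESHOLD = 0.1              # hypothetical alerting threshold
drift = jensen_shannon(training_hist, serving_hist)
should_retrain = drift > DRIFT_THRESHOLD
```

Retraining only when the distance crosses a threshold, rather than on a fixed schedule, is what keeps the number of retraining runs down.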

Questions 26

You are implementing a batch inference ML pipeline in Google Cloud. The model was developed using TensorFlow and is stored in SavedModel format in Cloud Storage. You need to apply the model to a historical dataset containing 10 TB of data that is stored in a BigQuery table. How should you perform the inference?

Options:

A.

Export the historical data to Cloud Storage in Avro format. Configure a Vertex AI batch prediction job to generate predictions for the exported data.

B.

Import the TensorFlow model by using the CREATE MODEL statement in BigQuery ML. Apply the historical data to the TensorFlow model.

C.

Export the historical data to Cloud Storage in CSV format. Configure a Vertex AI batch prediction job to generate predictions for the exported data.

D.

Configure a Vertex AI batch prediction job to apply the model to the historical data in BigQuery.

Questions 27

You work for a telecommunications company. You're building a model to predict which customers may fail to pay their next phone bill. The purpose of this model is to proactively offer at-risk customers assistance such as service discounts and bill deadline extensions. The data is stored in BigQuery, and the predictive features that are available for model training include:

- Customer_id

- Age

- Salary (measured in local currency)

- Sex

- Average bill value (measured in local currency)

- Number of phone calls in the last month (integer)

- Average duration of phone calls (measured in minutes)

You need to investigate and mitigate potential bias against disadvantaged groups while preserving model accuracy. What should you do?

Options:

A.

Determine whether there is a meaningful correlation between the sensitive features and the other features. Train a BigQuery ML boosted trees classification model, and exclude the sensitive features and any meaningfully correlated features.

B.

Train a BigQuery ML boosted trees classification model with all features. Use the ML.GLOBAL_EXPLAIN method to calculate the global attribution values for each feature of the model. If the feature importance value for any of the sensitive features exceeds a threshold, discard the model and train without this feature.

C.

Train a BigQuery ML boosted trees classification model with all features. Use the ML.EXPLAIN_PREDICT method to calculate the attribution values for each feature for each customer in a test set. If for any individual customer the importance value for any feature exceeds a predefined threshold, discard the model and train the model again without this feature.

D.

Define a fairness metric that is represented by accuracy across the sensitive features. Train a BigQuery ML boosted trees classification model with all features. Use the trained model to make predictions on a test set. Join the data back with the sensitive features, and calculate a fairness metric to investigate whether it meets your requirements.

Questions 28

Your task is to classify whether a company logo is present in an image. You found that 96% of the data does not include a logo. You are dealing with a data imbalance problem. Which metric do you use to evaluate the model?

Options:

A.

F1 Score

B.

RMSE

C.

F Score with higher precision weighting than recall

D.

F Score with higher recall weighting than precision

Questions 29

You have trained a model on a dataset that required computationally expensive preprocessing operations. You need to execute the same preprocessing at prediction time. You deployed the model on AI Platform for high-throughput online prediction. Which architecture should you use?

Options:

A.

• Validate the accuracy of the model that you trained on preprocessed data

• Create a new model that uses the raw data and is available in real time

• Deploy the new model onto AI Platform for online prediction

B.

• Send incoming prediction requests to a Pub/Sub topic

• Transform the incoming data using a Dataflow job

• Submit a prediction request to AI Platform using the transformed data

• Write the predictions to an outbound Pub/Sub queue

C.

• Stream incoming prediction request data into Cloud Spanner

• Create a view to abstract your preprocessing logic.

• Query the view every second for new records

• Submit a prediction request to AI Platform using the transformed data

• Write the predictions to an outbound Pub/Sub queue.

D.

• Send incoming prediction requests to a Pub/Sub topic

• Set up a Cloud Function that is triggered when messages are published to the Pub/Sub topic.

• Implement your preprocessing logic in the Cloud Function

• Submit a prediction request to AI Platform using the transformed data

• Write the predictions to an outbound Pub/Sub queue

Questions 30

You recently developed a deep learning model using Keras, and now you are experimenting with different training strategies. First, you trained the model using a single GPU, but the training process was too slow. Next, you distributed the training across 4 GPUs using tf.distribute.MirroredStrategy (with no other changes), but you did not observe a decrease in training time. What should you do?

Options:

A.

Distribute the dataset with tf.distribute.Strategy.experimental_distribute_dataset

B.

Create a custom training loop.

C.

Use a TPU with tf.distribute.TPUStrategy.

D.

Increase the batch size.
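
Option D relates to how a batch is shared among GPUs. With Keras and tf.distribute.MirroredStrategy, the batch size passed to training is treated as the global batch and split across replicas, so adding GPUs without other changes leaves the number of steps per epoch the same. The sketch below is plain arithmetic under that assumption; the dataset size and batch sizes are made up.

```python
import math

def per_replica_batch(global_batch_size, num_replicas):
    """Examples each replica processes per step when the global batch is split evenly."""
    return global_batch_size // num_replicas

def steps_per_epoch(num_examples, global_batch_size):
    """Number of synchronized training steps needed to see every example once."""
    return math.ceil(num_examples / global_batch_size)

NUM_EXAMPLES = 64_000   # made-up dataset size

single_gpu_steps = steps_per_epoch(NUM_EXAMPLES, 64)
# 4 GPUs, batch size unchanged: each GPU gets a quarter of the work per step,
# but the number of steps stays the same.
unchanged = (per_replica_batch(64, 4), steps_per_epoch(NUM_EXAMPLES, 64))
# 4 GPUs, global batch scaled by 4: each GPU keeps its original load,
# and the epoch needs a quarter of the steps.
scaled = (per_replica_batch(256, 4), steps_per_epoch(NUM_EXAMPLES, 256))
```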

Questions 31

You are working on a classification problem with time series data and achieved an area under the receiver operating characteristic curve (AUC ROC) value of 99% for training data after just a few experiments. You haven’t explored using any sophisticated algorithms or spent any time on hyperparameter tuning. What should your next step be to identify and fix the problem?

Options:

A.

Address the model overfitting by using a less complex algorithm.

B.

Address data leakage by applying nested cross-validation during model training.

C.

Address data leakage by removing features highly correlated with the target value.

D.

Address the model overfitting by tuning the hyperparameters to reduce the AUC ROC value.

Questions 32

You work for a retailer that sells clothes to customers around the world. You have been tasked with ensuring that ML models are built in a secure manner. Specifically, you need to protect sensitive customer data that might be used in the models. You have identified four fields containing sensitive data that are being used by your data science team: AGE, IS_EXISTING_CUSTOMER, LATITUDE_LONGITUDE, and SHIRT_SIZE. What should you do with the data before it is made available to the data science team for training purposes?

Options:

A.

Tokenize all of the fields using hashed dummy values to replace the real values.

B.

Use principal component analysis (PCA) to reduce the four sensitive fields to one PCA vector.

C.

Coarsen the data by putting AGE into quantiles and rounding LATITUDE_LONGITUDE into single precision. The other two fields are already as coarse as possible.

D.

Remove all sensitive data fields, and ask the data science team to build their models using non-sensitive data.
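
Option C describes coarsening: replacing precise values with broader groups so that individual records are harder to single out. The sketch below shows one possible reading, with ages mapped to quartile buckets and coordinates rounded to one decimal place. The rounding precision, the sample ages, and the sample coordinates are assumptions made for illustration.

```python
def quantile_bucket(value, reference_values, num_buckets=4):
    """Return the 0-based quantile bucket of `value` relative to `reference_values`."""
    rank = sum(v <= value for v in reference_values)   # how many values are <= value
    bucket = (rank - 1) * num_buckets // len(reference_values)
    return max(0, min(num_buckets - 1, bucket))

def coarsen_location(latitude, longitude, decimals=1):
    """Round coordinates so they identify an area rather than an exact point."""
    return round(latitude, decimals), round(longitude, decimals)

ages = [18, 25, 31, 40, 52, 60, 67, 75]              # made-up AGE values
age_buckets = [quantile_bucket(a, ages) for a in ages]
coarse_location = coarsen_location(40.4168, -3.7038) # made-up coordinates
```

The model still sees an ordered age group and an approximate location, which preserves much of the signal while dropping the exact values.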

Questions 33

You work at a gaming startup that has several terabytes of structured data in Cloud Storage. This data includes gameplay time data, user metadata, and game metadata. You want to build a model that recommends new games to users and that requires the least amount of coding. What should you do?

Options:

A.

Load the data in BigQuery. Use BigQuery ML to train an Autoencoder model.

B.

Load the data in BigQuery. Use BigQuery ML to train a matrix factorization model.

C.

Read data to a Vertex AI Workbench notebook. Use TensorFlow to train a two-tower model.

D.

Read data to a Vertex AI Workbench notebook. Use TensorFlow to train a matrix factorization model.

Questions 34

You are building an ML model to predict trends in the stock market based on a wide range of factors. While exploring the data, you notice that some features have a large range. You want to ensure that the features with the largest magnitude don’t overfit the model. What should you do?

Options:

A.

Standardize the data by transforming it with a logarithmic function.

B.

Apply a principal component analysis (PCA) to minimize the effect of any particular feature.

C.

Use a binning strategy to replace the magnitude of each feature with the appropriate bin number.

D.

Normalize the data by scaling it to have values between 0 and 1.

Questions 35

You need to design a customized deep neural network in Keras that will predict customer purchases based on their purchase history. You want to explore model performance using multiple model architectures, store training data, and be able to compare the evaluation metrics in the same dashboard. What should you do?

Options:

A.

Create multiple models using AutoML Tables

B.

Automate multiple training runs using Cloud Composer

C.

Run multiple training jobs on AI Platform with similar job names

D.

Create an experiment in Kubeflow Pipelines to organize multiple runs

Questions 36

You work for a social media company. You need to detect whether posted images contain cars. Each training example is a member of exactly one class. You have trained an object detection neural network and deployed the model version to AI Platform Prediction for evaluation. Before deployment, you created an evaluation job and attached it to the AI Platform Prediction model version. You notice that the precision is lower than your business requirements allow. How should you adjust the model's final layer softmax threshold to increase precision?

Options:

A.

Increase the recall

B.

Decrease the recall.

C.

Increase the number of false positives

D.

Decrease the number of false negatives

Questions 37

You work at an organization that maintains a cloud-based communication platform that integrates conventional chat, voice, and video conferencing into one platform. The audio recordings are stored in Cloud Storage. All recordings have an 8 kHz sample rate and are more than one minute long. You need to implement a new feature in the platform that will automatically transcribe voice call recordings into a text for future applications, such as call summarization and sentiment analysis. How should you implement the voice call transcription feature following Google-recommended best practices?

Options:

A.

Use the original audio sampling rate, and transcribe the audio by using the Speech-to-Text API with synchronous recognition.

B.

Use the original audio sampling rate, and transcribe the audio by using the Speech-to-Text API with asynchronous recognition.

C.

Upsample the audio recordings to 16 kHz, and transcribe the audio by using the Speech-to-Text API with synchronous recognition.

D.

Upsample the audio recordings to 16 kHz, and transcribe the audio by using the Speech-to-Text API with asynchronous recognition.

Questions 38

You have built a model that is trained on data stored in Parquet files. You access the data through a Hive table hosted on Google Cloud. You preprocessed this data with PySpark and exported it as a CSV file into Cloud Storage. After preprocessing, you execute additional steps to train and evaluate your model. You want to parametrize this model training in Kubeflow Pipelines. What should you do?

Options:

A.

Remove the data transformation step from your pipeline.

B.

Containerize the PySpark transformation step, and add it to your pipeline.

C.

Add a ContainerOp to your pipeline that spins up a Dataproc cluster, runs a transformation, and then saves the transformed data in Cloud Storage.

D.

Deploy Apache Spark at a separate node pool in a Google Kubernetes Engine cluster. Add a ContainerOp to your pipeline that invokes a corresponding transformation job for this Spark instance.

Questions 39

You work for a credit card company and have been asked to create a custom fraud detection model based on historical data using AutoML Tables. You need to prioritize detection of fraudulent transactions while minimizing false positives. Which optimization objective should you use when training the model?

Options:

A.

An optimization objective that minimizes Log loss

B.

An optimization objective that maximizes the Precision at a Recall value of 0.50

C.

An optimization objective that maximizes the area under the precision-recall curve (AUC PR) value

D.

An optimization objective that maximizes the area under the receiver operating characteristic curve (AUC ROC) value
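
As a refresher on the quantities these options refer to, the following is a minimal sketch of computing precision and recall at a score threshold in plain Python. The toy labels and scores and the helper name `precision_recall` are illustrative assumptions; this is not AutoML Tables output or its implementation.

```python
def precision_recall(labels, scores, threshold):
    """Return (precision, recall) for binary labels at a score threshold."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for y, p in zip(labels, predicted) if y == 1 and p)
    fp = sum(1 for y, p in zip(labels, predicted) if y == 0 and p)
    fn = sum(1 for y, p in zip(labels, predicted) if y == 1 and not p)
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Hypothetical toy data: 1 = fraudulent, 0 = legitimate.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.8, 0.7, 0.4, 0.2, 0.1]

for threshold in (0.75, 0.5, 0.05):
    p, r = precision_recall(labels, scores, threshold)
    print(f"threshold={threshold}: precision={p:.2f}, recall={r:.2f}")
```

Sweeping the threshold traces out the precision-recall curve; the area under that curve summarizes the trade-off across all thresholds rather than at a single operating point.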

Questions 40

You are an ML engineer in the contact center of a large enterprise. You need to build a sentiment analysis tool that predicts customer sentiment from recorded phone conversations. You need to identify the best approach to building a model while ensuring that the gender, age, and cultural differences of the customers who called the contact center do not impact any stage of the model development pipeline and results. What should you do?

Options:

A.

Extract sentiment directly from the voice recordings

B.

Convert the speech to text and build a model based on the words

C.

Convert the speech to text and extract sentiments based on the sentences

D.

Convert the speech to text and extract sentiment using syntactical analysis

Questions 41

You recently deployed a model to a Vertex AI endpoint. Your data drifts frequently, so you have enabled request-response logging and created a Vertex AI Model Monitoring job. You have observed that your model is receiving higher traffic than expected. You need to reduce the model monitoring cost while continuing to quickly detect drift. What should you do?

Options:

A.

Replace the monitoring job with a Dataflow pipeline that uses TensorFlow Data Validation (TFDV).

B.

Replace the monitoring job with a custom SQL script to calculate statistics on the features and predictions in BigQuery.

C.

Decrease the sample_rate parameter in the RandomSampleConfig of the monitoring job.

D.

Increase the monitor_interval parameter in the ScheduleConfig of the monitoring job.
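
To make the effect of a sampling rate concrete, here is a small sketch in plain Python that keeps each logged request with a fixed probability. It only illustrates the idea of random sampling; the function name `sample_requests` and the request list are hypothetical, and this is not how Vertex AI Model Monitoring is implemented.

```python
import random

def sample_requests(requests, sample_rate, seed=0):
    """Keep each request independently with probability sample_rate."""
    rng = random.Random(seed)
    return [r for r in requests if rng.random() < sample_rate]

requests = list(range(100_000))  # stand-in for logged prediction requests
kept = sample_requests(requests, sample_rate=0.1)
print(f"analyzed {len(kept)} of {len(requests)} requests")
```

The volume of data analyzed shrinks roughly in proportion to the sampling rate, while every time window still contributes some requests.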

Questions 42

You work for a gaming company that has millions of customers around the world. All games offer a chat feature that allows players to communicate with each other in real time. Messages can be typed in more than 20 languages and are translated in real time using the Cloud Translation API. You have been asked to build an ML system to moderate the chat in real time while assuring that the performance is uniform across the various languages and without changing the serving infrastructure.

You trained your first model using an in-house word2vec model for embedding the chat messages translated by the Cloud Translation API. However, the model has significant differences in performance across the different languages. How should you improve it?

Options:

A.

Add a regularization term such as the Min-Diff algorithm to the loss function.

B.

Train a classifier using the chat messages in their original language.

C.

Replace the in-house word2vec with GPT-3 or T5.

D.

Remove moderation for languages for which the false positive rate is too high.

Questions 43

You lead a data science team at a large international corporation. Most of the models your team trains are large-scale models using high-level TensorFlow APIs on AI Platform with GPUs. Your team usually takes a few weeks or months to iterate on a new version of a model. You were recently asked to review your team’s spending. How should you reduce your Google Cloud compute costs without impacting the model’s performance?

Options:

A.

Use AI Platform to run distributed training jobs with checkpoints.

B.

Use AI Platform to run distributed training jobs without checkpoints.

C.

Migrate to training with Kubeflow on Google Kubernetes Engine, and use preemptible VMs with checkpoints.

D.

Migrate to training with Kubeflow on Google Kubernetes Engine, and use preemptible VMs without checkpoints.

Questions 44

You are working on a system log anomaly detection model for a cybersecurity organization. You have developed the model using TensorFlow, and you plan to use it for real-time prediction. You need to create a Dataflow pipeline to ingest data via Pub/Sub and write the results to BigQuery. You want to minimize the serving latency as much as possible. What should you do?

Options:

A.

Containerize the model prediction logic in Cloud Run, which is invoked by Dataflow.

B.

Load the model directly into the Dataflow job as a dependency, and use it for prediction.

C.

Deploy the model to a Vertex AI endpoint, and invoke this endpoint in the Dataflow job.

D.

Deploy the model in a TFServing container on Google Kubernetes Engine, and invoke it in the Dataflow job.

Questions 45

Your organization’s marketing team is building a customer recommendation chatbot that uses a generative AI large language model (LLM) to provide personalized product suggestions in real time. The chatbot needs to access data from millions of customers, including purchase history, browsing behavior, and preferences. The data is stored in a Cloud SQL for PostgreSQL database. You need the chatbot response time to be less than 100ms. How should you design the system?

Options:

A.

Use BigQuery ML to fine-tune the LLM with the data in the Cloud SQL for PostgreSQL database, and access the model from BigQuery.

B.

Replicate the Cloud SQL for PostgreSQL database to AlloyDB. Configure the chatbot server to query AlloyDB.

C.

Transform relevant customer data into vector embeddings and store them in Vertex AI Search for retrieval by the LLM.

D.

Create a caching layer between the chatbot and the Cloud SQL for PostgreSQL database to store frequently accessed customer data. Configure the chatbot server to query the cache.

Questions 46

You need to deploy a scikit-learn classification model to production. The model must be able to serve requests 24/7, and you expect millions of requests per second to the production application from 8 am to 7 pm. You need to minimize the cost of deployment. What should you do?

Options:

A.

Deploy an online Vertex AI prediction endpoint. Set the max replica count to 1.

B.

Deploy an online Vertex AI prediction endpoint. Set the max replica count to 100.

C.

Deploy an online Vertex AI prediction endpoint with one GPU per replica. Set the max replica count to 1.

D.

Deploy an online Vertex AI prediction endpoint with one GPU per replica. Set the max replica count to 100.

Questions 47

You are pre-training a large language model on Google Cloud. This model includes custom TensorFlow operations in the training loop. Model training will use a large batch size, and you expect training to take several weeks. You need to configure a training architecture that minimizes both training time and compute costs. What should you do?

Options:

A.

B.

C.

D.

Questions 48

You are using Keras and TensorFlow to develop a fraud detection model. Records of customer transactions are stored in a large table in BigQuery. You need to preprocess these records in a cost-effective and efficient way before you use them to train the model. The trained model will be used to perform batch inference in BigQuery. How should you implement the preprocessing workflow?

Options:

A.

Implement a preprocessing pipeline by using Apache Spark, and run the pipeline on Dataproc. Save the preprocessed data as CSV files in a Cloud Storage bucket.

B.

Load the data into a pandas DataFrame. Implement the preprocessing steps using pandas transformations, and train the model directly on the DataFrame.

C.

Perform preprocessing in BigQuery by using SQL. Use the BigQueryClient in TensorFlow to read the data directly from BigQuery.

D.

Implement a preprocessing pipeline by using Apache Beam, and run the pipeline on Dataflow. Save the preprocessed data as CSV files in a Cloud Storage bucket.

Questions 49

You have recently developed a custom model for image classification by using a neural network. You need to automatically identify the values for learning rate, number of layers, and kernel size. To do this, you plan to run multiple jobs in parallel to identify the parameters that optimize performance. You want to minimize custom code development and infrastructure management. What should you do?

Options:

A.

Create a Vertex AI pipeline that runs different model training jobs in parallel.

B.

Train an AutoML image classification model.

C.

Create a custom training job that uses the Vertex AI Vizier SDK for parameter optimization.

D.

Create a Vertex AI hyperparameter tuning job.

Questions 50

You work on a growing team of more than 50 data scientists who all use AI Platform. You are designing a strategy to organize your jobs, models, and versions in a clean and scalable way. Which strategy should you choose?

Options:

A.

Set up restrictive IAM permissions on the AI Platform notebooks so that only a single user or group can access a given instance.

B.

Separate each data scientist’s work into a different project to ensure that the jobs, models, and versions created by each data scientist are accessible only to that user.

C.

Use labels to organize resources into descriptive categories. Apply a label to each created resource so that users can filter the results by label when viewing or monitoring the resources.

D.

Set up a BigQuery sink for Cloud Logging logs that is appropriately filtered to capture information about AI Platform resource usage. In BigQuery, create a SQL view that maps users to the resources they are using

Questions 51

You work at a bank. You have a custom tabular ML model that was provided by the bank's vendor. The training data is not available due to its sensitivity. The model is packaged as a Vertex AI Model serving container, which accepts a string as input for each prediction instance. In each string, the feature values are separated by commas. You want to deploy this model to production for online predictions and monitor the feature distribution over time with minimal effort. What should you do?

Options:

A.

1. Upload the model to Vertex AI Model Registry and deploy the model to a Vertex AI endpoint.

2. Create a Vertex AI Model Monitoring job with feature drift detection as the monitoring objective, and provide an instance schema.

B.

1. Upload the model to Vertex AI Model Registry and deploy the model to a Vertex AI endpoint.

2. Create a Vertex AI Model Monitoring job with feature skew detection as the monitoring objective, and provide an instance schema.

C.

1. Refactor the serving container to accept key-value pairs as input format.

2. Upload the model to Vertex AI Model Registry and deploy the model to a Vertex AI endpoint.

3. Create a Vertex AI Model Monitoring job with feature drift detection as the monitoring objective.

D.

1. Refactor the serving container to accept key-value pairs as input format.

2. Upload the model to Vertex AI Model Registry and deploy the model to a Vertex AI endpoint.

3. Create a Vertex AI Model Monitoring job with feature skew detection as the monitoring objective.

Questions 52

You have a demand forecasting pipeline in production that uses Dataflow to preprocess raw data prior to model training and prediction. During preprocessing, you employ Z-score normalization on data stored in BigQuery and write it back to BigQuery. New training data is added every week. You want to make the process more efficient by minimizing computation time and manual intervention. What should you do?

Options:

A.

Normalize the data using Google Kubernetes Engine

B.

Translate the normalization algorithm into SQL for use with BigQuery

C.

Use the normalizer_fn argument in TensorFlow's Feature Column API

D.

Normalize the data with Apache Spark using the Dataproc connector for BigQuery
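
For reference, Z-score normalization subtracts the mean and divides by the standard deviation. The sketch below shows the computation in plain Python, independent of any of the platforms named in the options; the sample values and the helper name `z_score` are assumptions made for illustration.

```python
from statistics import mean, pstdev

def z_score(values):
    """Standardize values to zero mean and unit (population) variance."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column carries no spread; map every value to 0.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

weekly_demand = [120.0, 135.0, 150.0, 165.0, 180.0]  # hypothetical values
normalized = z_score(weekly_demand)
print([round(z, 3) for z in normalized])
```

The same two aggregates (a mean and a standard deviation per feature) are all the transformation needs, which is why it ports easily between engines.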

Questions 53

Your data science team has requested a system that supports scheduled model retraining, Docker containers, and a service that supports autoscaling and monitoring for online prediction requests. Which platform components should you choose for this system?

Options:

A.

Vertex AI Pipelines and App Engine

B.

Vertex AI Pipelines, Vertex AI Prediction, and Vertex AI Model Monitoring

C.

Cloud Composer, BigQuery ML, and Vertex AI Prediction

D.

Cloud Composer, Vertex AI Training with custom containers, and App Engine

Questions 54

You are using Kubeflow Pipelines to develop an end-to-end PyTorch-based MLOps pipeline. The pipeline reads data from BigQuery, processes the data, conducts feature engineering, model training, model evaluation, and deploys the model as a binary file to Cloud Storage. You are writing code for several different versions of the feature engineering and model training steps, and running each new version in Vertex AI Pipelines. Each pipeline run is taking over an hour to complete. You want to speed up the pipeline execution to reduce your development time, and you want to avoid additional costs. What should you do?

Options:

A.

Delegate feature engineering to BigQuery and remove it from the pipeline.

B.

Add a GPU to the model training step.

C.

Enable caching in all the steps of the Kubeflow pipeline.

D.

Comment out the part of the pipeline that you are not currently updating.
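
The idea behind step caching can be sketched in a few lines of plain Python: a step is skipped when the same step has already run with the same inputs. This is a toy model only. The names `run_step`, `preprocess`, and `train` are hypothetical, and a real pipeline engine's cache key typically covers more than the inputs (for example, the component definition).

```python
import hashlib
import json

_cache = {}
calls = []  # records which steps actually executed

def run_step(name, func, inputs):
    """Run func(**inputs) unless the same step and inputs were seen before."""
    key = hashlib.sha256(
        json.dumps([name, inputs], sort_keys=True).encode()
    ).hexdigest()
    if key not in _cache:
        calls.append(name)
        _cache[key] = func(**inputs)
    return _cache[key]

def preprocess(table):
    return f"{table}_clean"

def train(data, learning_rate):
    return f"model({data}, lr={learning_rate})"

# Two runs that differ only in the training parameters.
for lr in (0.1, 0.01):
    data = run_step("preprocess", preprocess, {"table": "sales"})
    model = run_step("train", train, {"data": data, "learning_rate": lr})

print(calls)
```

In this sketch the preprocessing step executes once across both runs, while the training step executes each time its parameters change.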

Questions 55

You manage a team of data scientists who use a cloud-based backend system to submit training jobs. This system has become very difficult to administer, and you want to use a managed service instead. The data scientists you work with use many different frameworks, including Keras, PyTorch, Theano, scikit-learn, and custom libraries. What should you do?

Options:

A.

Use Vertex AI Training to submit training jobs using any framework.

B.

Configure Kubeflow to run on Google Kubernetes Engine and submit training jobs through TFJob.

C.

Create a library of VM images on Compute Engine, and publish these images on a centralized repository.

D.

Set up Slurm workload manager to receive jobs that can be scheduled to run on your cloud infrastructure.

Questions 56

You are building a custom image classification model and plan to use Vertex AI Pipelines to implement the end-to-end training. Your dataset consists of images that need to be preprocessed before they can be used to train the model. The preprocessing steps include resizing the images, converting them to grayscale, and extracting features. You have already implemented some Python functions for the preprocessing tasks. Which components should you use in your pipeline?

Options:

A.

B.

C.

D.

Questions 57

You have deployed a scikit-learn model to a Vertex AI endpoint using a custom model server. You enabled autoscaling; however, the deployed model fails to scale beyond one replica, which led to dropped requests. You notice that CPU utilization remains low even during periods of high load. What should you do?

Options:

A.

Attach a GPU to the prediction nodes.

B.

Increase the number of workers in your model server.

C.

Schedule scaling of the nodes to match expected demand.

D.

Increase the minReplicaCount in your DeployedModel configuration.

Questions 58

You are developing an ML model using a dataset with categorical input variables. You have randomly split half of the data into training and test sets. After applying one-hot encoding on the categorical variables in the training set, you discover that one categorical variable is missing from the test set. What should you do?

Options:

A.

Randomly redistribute the data, with 70% for the training set and 30% for the test set

B.

Use sparse representation in the test set

C.

Apply one-hot encoding on the categorical variables in the test data.

D.

Collect more data representing all categories
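
To illustrate the situation described in this question, the sketch below one-hot encodes a test column against a vocabulary collected from the training column, so both sets share the same columns even when a category is absent from the test data. The feature values and the helper names `fit_vocabulary` and `one_hot` are hypothetical.

```python
def fit_vocabulary(values):
    """Collect the sorted set of categories seen in the training data."""
    return sorted(set(values))

def one_hot(values, vocabulary):
    """Encode values against a fixed vocabulary; unseen values map to all zeros."""
    index = {category: i for i, category in enumerate(vocabulary)}
    rows = []
    for value in values:
        row = [0] * len(vocabulary)
        if value in index:
            row[index[value]] = 1
        rows.append(row)
    return rows

train_colors = ["red", "green", "blue", "green"]  # hypothetical feature
test_colors = ["red", "green"]                    # "blue" never appears here

vocabulary = fit_vocabulary(train_colors)
encoded_test = one_hot(test_colors, vocabulary)
print(vocabulary)
print(encoded_test)
```

Each encoded test row has three entries, matching the training encoding, even though only two categories occur in the test column.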

Questions 59

You trained a text classification model. You have the following SignatureDefs:

[Image: SignatureDefs for Question 59]

What is the correct way to write the predict request?

Options:

A.

data = json.dumps({"signature_name": "serving_default", "instances": [['ab', 'bc', 'cd']]})

B.

data = json.dumps({"signature_name": "serving_default", "instances": [['a', 'b', 'c', 'd', 'e', 'f']]})

C.

data = json.dumps({"signature_name": "serving_default", "instances": [['a', 'b', 'c'], ['d', 'e', 'f']]})

D.

data = json.dumps({"signature_name": "serving_default", "instances": [['a', 'b'], ['c', 'd'], ['e', 'f']]})
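
The options above all build a row-format request body with `json.dumps`. The sketch below shows how such a body can be assembled and checked against an input shape before sending. The SignatureDefs image is not available here, so the shape of (-1, 2) used below is purely an assumption for illustration, as is the helper name `build_predict_request`.

```python
import json

def build_predict_request(instances, signature_name="serving_default"):
    """Serialize a row-format request body for a TensorFlow Serving REST call."""
    return json.dumps({"signature_name": signature_name, "instances": instances})

# Assumed input shape (-1, 2): each instance holds exactly two strings.
instances = [["a", "b"], ["c", "d"], ["e", "f"]]
assert all(len(row) == 2 for row in instances)

data = build_predict_request(instances)
print(data)
```

Whatever the actual signature is, the inner dimensions of each instance need to match the input tensor shape it declares, with the outer list supplying the batch dimension.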

Questions 60

You want to train an AutoML model to predict house prices by using a small public dataset stored in BigQuery. You need to prepare the data and want to use the simplest, most efficient approach. What should you do?

Options:

A.

Write a query that preprocesses the data by using BigQuery and creates a new table. Create a Vertex AI managed dataset with the new table as the data source.

B.

Use Dataflow to preprocess the data. Write the output in TFRecord format to a Cloud Storage bucket.

C.

Write a query that preprocesses the data by using BigQuery. Export the query results as CSV files, and use those files to create a Vertex AI managed dataset.

D.

Use a Vertex AI Workbench notebook instance to preprocess the data by using the pandas library. Export the data as CSV files, and use those files to create a Vertex AI managed dataset.

Questions 61

You work for an organization that operates a streaming music service. You have a custom production model that is serving a "next song" recommendation based on a user’s recent listening history. Your model is deployed on a Vertex AI endpoint. You recently retrained the same model by using fresh data. The model received positive test results offline. You now want to test the new model in production while minimizing complexity. What should you do?

Options:

A.

Create a new Vertex AI endpoint for the new model, and deploy the new model to that new endpoint. Build a service to randomly send 5% of production traffic to the new endpoint. Monitor end-user metrics, such as listening time. If end-user metrics improve between models over time, gradually increase the percentage of production traffic sent to the new endpoint.

B.

Capture incoming prediction requests in BigQuery. Create an experiment in Vertex AI Experiments. Run batch predictions for both models using the captured data. Use the user's selected song to compare the models' performance side by side. If the new model's performance metrics are better than the previous model's, deploy the new model to production.

C.

Deploy the new model to the existing Vertex AI endpoint. Use traffic splitting to send 5% of production traffic to the new model. Monitor end-user metrics, such as listening time. If end-user metrics improve between models over time, gradually increase the percentage of production traffic sent to the new model.

D.

Configure a model monitoring job for the existing Vertex AI endpoint. Configure the monitoring job to detect prediction drift, and set a threshold for alerts. Update the model on the endpoint from the previous model to the new model. If you receive an alert of prediction drift, revert to the previous model.
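
Several options mention sending 5% of production traffic to the new model. As a conceptual illustration only, the following plain-Python sketch routes requests according to percentage weights. The model IDs and the `route` helper are hypothetical, and the code does not call or reproduce any Vertex AI API.

```python
import random

def route(traffic_split, rng):
    """Pick a deployed model ID according to percentage weights summing to 100."""
    assert sum(traffic_split.values()) == 100
    ids = list(traffic_split)
    weights = [traffic_split[i] for i in ids]
    return rng.choices(ids, weights=weights, k=1)[0]

# Hypothetical deployed model IDs and a 95/5 split.
traffic_split = {"current-model": 95, "candidate-model": 5}

rng = random.Random(42)
counts = {"current-model": 0, "candidate-model": 0}
for _ in range(20_000):
    counts[route(traffic_split, rng)] += 1

print(counts)
```

Raising the candidate's weight in small steps is the usual way such a split is used for a gradual rollout.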

Questions 62

You have developed an application that uses a chain of multiple scikit-learn models to predict the optimal price for your company's products. The workflow logic is shown in the diagram. Members of your team use the individual models in other solution workflows. You want to deploy this workflow while ensuring version control for each individual model and the overall workflow. Your application needs to be able to scale down to zero. You want to minimize the compute resource utilization and the manual effort required to manage this solution. What should you do?

Options:

A.

Expose each individual model as an endpoint in Vertex AI Endpoints. Create a custom container endpoint to orchestrate the workflow.

B.

Create a custom container endpoint for the workflow that loads each model's individual files. Track the versions of each individual model in BigQuery.

C.

Expose each individual model as an endpoint in Vertex AI Endpoints. Use Cloud Run to orchestrate the workflow.

D.

Load each model's individual files into Cloud Run. Use Cloud Run to orchestrate the workflow. Track the versions of each individual model in BigQuery.

Questions 63

Your team is training a large number of ML models that use different algorithms, parameters, and datasets. Some models are trained in Vertex AI Pipelines, and some are trained on Vertex AI Workbench notebook instances. Your team wants to compare the performance of the models across both services. You want to minimize the effort required to store the parameters and metrics. What should you do?

Options:

A.

Implement an additional step for all the models running in pipelines and notebooks to export parameters and metrics to BigQuery.

B.

Create a Vertex AI experiment. Submit all the pipelines as experiment runs. For models trained on notebooks, log parameters and metrics by using the Vertex AI SDK.

C.

Implement all models in Vertex AI Pipelines. Create a Vertex AI experiment, and associate all pipeline runs with that experiment.

D.

Store all model parameters and metrics as model metadata by using the Vertex AI Metadata API.

Questions 64

You have trained a DNN regressor with TensorFlow to predict housing prices using a set of predictive features. Your default precision is tf.float64, and you use a standard TensorFlow estimator:

estimator = tf.estimator.DNNRegressor(
    feature_columns=[YOUR_LIST_OF_FEATURES],
    hidden_units=[1024, 512, 256],
    dropout=None)

Your model performs well, but just before deploying it to production, you discover that your current serving latency is 10ms @ 90 percentile and you currently serve on CPUs. Your production requirements expect a model latency of 8ms @ 90 percentile. You are willing to accept a small decrease in performance in order to reach the latency requirement. Therefore, your plan is to improve latency while evaluating how much the model's prediction performance decreases. What should you first try to quickly lower the serving latency?

Options:

A.

Increase the dropout rate to 0.8 in _PREDICT mode by adjusting the TensorFlow Serving parameters.

B.

Increase the dropout rate to 0.8 and retrain your model.

C.

Switch from CPU to GPU serving

D.

Apply quantization to your SavedModel by reducing the floating point precision to tf.float16.
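
To give a feel for what reducing floating-point precision means numerically, the sketch below round-trips a value through IEEE 754 half precision using only the standard library. It illustrates storage size and rounding error, not TensorFlow's quantization tooling; the weight value and the helper name `to_float16` are assumptions.

```python
import struct

def to_float16(value):
    """Round-trip a Python float through IEEE 754 half precision."""
    return struct.unpack("<e", struct.pack("<e", value))[0]

weight = 0.1234567891234  # hypothetical model weight
half = to_float16(weight)

print("bytes per value: float64 =", struct.calcsize("<d"),
      ", float16 =", struct.calcsize("<e"))
print("float16 value:", half)
print("absolute error:", abs(weight - half))
```

Each value shrinks from 8 bytes to 2 at the price of a small rounding error, which is the trade-off the question asks you to evaluate.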

Questions 65

You are an AI engineer working for a popular video streaming platform. You built a classification model using PyTorch to predict customer churn. Each week, the customer retention team plans to contact customers identified as at-risk for churning with personalized offers. You want to deploy the model while minimizing maintenance effort. What should you do?

Options:

A.

Use Vertex AI’s prebuilt containers for prediction. Deploy the container on Cloud Run to generate online predictions.

B.

Use Vertex AI’s prebuilt containers for prediction. Deploy the model on Google Kubernetes Engine (GKE), and configure the model for batch prediction.

C.

Deploy the model to a Vertex AI endpoint, and configure the model for batch prediction. Schedule the batch prediction to run weekly.

D.

Deploy the model to a Vertex AI endpoint, and configure the model for online prediction. Schedule a job to query this endpoint weekly.

Questions 66

You have been asked to build a model using a dataset that is stored in a medium-sized (~10 GB) BigQuery table. You need to quickly determine whether this data is suitable for model development. You want to create a one-time report that includes both informative visualizations of data distributions and more sophisticated statistical analyses to share with other ML engineers on your team. You require maximum flexibility to create your report. What should you do?

Options:

A.

Use Vertex AI Workbench user-managed notebooks to generate the report.

B.

Use Google Data Studio to create the report.

C.

Use the output from TensorFlow Data Validation on Dataflow to generate the report.

D.

Use Dataprep to create the report.

Questions 67

You are training models in Vertex AI by using data that spans across multiple Google Cloud projects. You need to find, track, and compare the performance of the different versions of your models. Which Google Cloud services should you include in your ML workflow?

Options:

A.

Dataplex, Vertex AI Feature Store, and Vertex AI TensorBoard

B.

Vertex AI Pipelines, Vertex AI Feature Store, and Vertex AI Experiments

C.

Dataplex, Vertex AI Experiments, and Vertex AI ML Metadata

D.

Vertex AI Pipelines, Vertex AI Experiments, and Vertex AI Metadata

Questions 68

You are the lead ML engineer on a mission-critical project that involves analyzing massive datasets using Apache Spark. You need to establish a robust environment that allows your team to rapidly prototype Spark models using Jupyter notebooks. What is the fastest way to achieve this?

Options:

A.

Configure a Compute Engine instance with Spark and use Jupyter notebooks.

B.

Set up a Dataproc cluster with Spark and use Jupyter notebooks.

C.

Set up a Vertex AI Workbench instance with a Spark kernel.

D.

Use Colab Enterprise with a Spark kernel.

Questions 69

You work for a hotel and have a dataset that contains customers' written comments scanned from paper-based customer feedback forms, which are stored as PDF files. Every form has the same layout. You need to quickly predict an overall satisfaction score from the customer comments on each form. How should you accomplish this task?

Options:

A.

Use the Vision API to parse the text from each PDF file. Use the Natural Language API analyzeSentiment feature to infer overall satisfaction scores.

B.

Use the Vision API to parse the text from each PDF file. Use the Natural Language API analyzeEntitySentiment feature to infer overall satisfaction scores.

C.

Uptrain a Document AI custom extractor to parse the text in the comments section of each PDF file. Use the Natural Language API analyzeSentiment feature to infer overall satisfaction scores.

D.

Uptrain a Document AI custom extractor to parse the text in the comments section of each PDF file. Use the Natural Language API analyzeEntitySentiment feature to infer overall satisfaction scores.

Questions 70

Your team has a model deployed to a Vertex AI endpoint. You have created a Vertex AI pipeline that automates the model training process and is triggered by a Cloud Function. You need to prioritize keeping the model up-to-date, but also minimize retraining costs. How should you configure retraining?

Options:

A.

Configure Pub/Sub to call the Cloud Function when a sufficient amount of new data becomes available.

B.

Configure a Cloud Scheduler job that calls the Cloud Function at a predetermined frequency that fits your team's budget.

C.

Enable model monitoring on the Vertex AI endpoint. Configure Pub/Sub to call the Cloud Function when anomalies are detected.

D.

Enable model monitoring on the Vertex AI endpoint. Configure Pub/Sub to call the Cloud Function when feature drift is detected.

Questions 71

While performing exploratory data analysis on a dataset, you find that an important categorical feature has 5% null values. You want to minimize the bias that could result from the missing values. How should you handle the missing values?

Options:

A.

Remove the rows with missing values, and upsample your dataset by 5%.

B.

Replace the missing values with the feature’s mean.

C.

Replace the missing values with a placeholder category indicating a missing value.

D.

Move the rows with missing values to your validation dataset.
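
One of the options describes replacing missing values with a placeholder category. A minimal sketch of that transformation in plain Python follows; the column values, the placeholder string, and the helper name `fill_missing` are hypothetical.

```python
def fill_missing(values, placeholder="MISSING"):
    """Replace None entries in a categorical column with a placeholder category."""
    return [placeholder if v is None else v for v in values]

plan = ["basic", None, "premium", "basic", None]  # hypothetical column
print(fill_missing(plan))
```

No rows are dropped, and the model can treat "missing" as its own category rather than being told the value was something it was not.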

Questions 72

You developed a Vertex AI pipeline that trains a classification model on data stored in a large BigQuery table. The pipeline has four steps, where each step is created by a Python function that uses the Kubeflow v2 API. The components have the following names:

You launch your Vertex AI pipeline as the following:

You perform many model iterations by adjusting the code and parameters of the training step. You observe high costs associated with the development, particularly the data export and preprocessing steps. You need to reduce model development costs. What should you do?

Options:

A.

B.

C.

D.

Questions 73

You work for a public transportation company and need to build a model to estimate delay times for multiple transportation routes. Predictions are served directly to users in an app in real time. Because different seasons and population increases impact the data relevance, you will retrain the model every month. You want to follow Google-recommended best practices. How should you configure the end-to-end architecture of the predictive model?

Options:

A.

Configure Kubeflow Pipelines to schedule your multi-step workflow from training to deploying your model.

B.

Use a model trained and deployed on BigQuery ML, and trigger retraining with the scheduled query feature in BigQuery.

C.

Write a Cloud Functions script that launches a training and deploying job on AI Platform that is triggered by Cloud Scheduler.

D.

Use Cloud Composer to programmatically schedule a Dataflow job that executes the workflow from training to deploying your model.

Questions 74

You are developing a training pipeline for a new XGBoost classification model based on tabular data. The data is stored in a BigQuery table. You need to complete the following steps:

1. Randomly split the data into training and evaluation datasets in a 65/35 ratio.

2. Conduct feature engineering.

3. Obtain metrics for the evaluation dataset.

4. Compare models trained in different pipeline executions.

How should you execute these steps?

Options:

A.

1. Using Vertex AI Pipelines, add a component to divide the data into training and evaluation sets, and add another component for feature engineering.

2. Enable autologging of metrics in the training component.

3. Compare pipeline runs in Vertex AI Experiments.

B.

1. Using Vertex AI Pipelines, add a component to divide the data into training and evaluation sets, and add another component for feature engineering.

2. Enable autologging of metrics in the training component.

3. Compare models using the artifacts lineage in Vertex ML Metadata.

C.

1. In BigQuery ML, use the CREATE MODEL statement with BOOSTED_TREE_CLASSIFIER as the model type, and use BigQuery to handle the data splits.

2. Use a SQL view to apply feature engineering, and train the model using the data in that view.

3. Compare the evaluation metrics of the models by using a SQL query with the ML.TRAINING_INFO statement.

D.

1. In BigQuery ML, use the CREATE MODEL statement with BOOSTED_TREE_CLASSIFIER as the model type, and use BigQuery to handle the data splits.

2. Use ML TRANSFORM to specify the feature engineering transformations, and train the model using the data in the table.

3. Compare the evaluation metrics of the models by using a SQL query with the ML.TRAINING_INFO statement.
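
The first step in this question is a 65/35 split. One common way to get a repeatable split is to hash a row identifier and bucket the result, which is sketched below in plain Python. Note that this is a deterministic, hash-based stand-in for a random split, chosen here for reproducibility; the row IDs and the helper name `assign_split` are hypothetical.

```python
import hashlib

def assign_split(row_id, train_fraction=0.65):
    """Deterministically assign a row to 'train' or 'eval' from a hash of its ID."""
    digest = hashlib.sha256(str(row_id).encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    return "train" if bucket < train_fraction * 100 else "eval"

row_ids = range(100_000)  # hypothetical primary keys
splits = [assign_split(r) for r in row_ids]
train_share = splits.count("train") / len(splits)
print(f"train share: {train_share:.3f}")
```

Because the assignment depends only on the row ID, the same row lands in the same split on every pipeline execution, which keeps evaluation metrics comparable across runs.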

Questions 75

You work for an online grocery store. You recently developed a custom ML model that recommends a recipe when a user arrives at the website. You chose the machine type on the Vertex AI endpoint to optimize costs by using the queries per second (QPS) that the model can serve, and you deployed it on a single machine with 8 vCPUs and no accelerators.

A holiday season is approaching, and you anticipate four times more traffic during this time than the typical daily traffic. You need to ensure that the model can scale efficiently to the increased demand. What should you do?

Options:

A.

1. Maintain the same machine type on the endpoint.

2. Set up a monitoring job and an alert for CPU usage.

3. If you receive an alert, add a compute node to the endpoint.

B.

1. Change the machine type on the endpoint to have 32 vCPUs.

2. Set up a monitoring job and an alert for CPU usage.

3. If you receive an alert, scale the vCPUs further as needed.

C.

1. Maintain the same machine type on the endpoint. Configure the endpoint to enable autoscaling based on vCPU usage.

2. Set up a monitoring job and an alert for CPU usage.

3. If you receive an alert, investigate the cause.

D.

1. Change the machine type on the endpoint to have a GPU. Configure the endpoint to enable autoscaling based on the GPU usage.

2. Set up a monitoring job and an alert for GPU usage.

3. If you receive an alert, investigate the cause.
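
The sizing arithmetic behind this scenario can be sketched roughly as follows. This is a back-of-the-envelope illustration, not a Vertex AI API: the function name `replicas_needed`, the 200 QPS per-replica capacity, the 100 QPS baseline, and the 60% target utilization are all assumptions chosen for the example.

```python
import math


def replicas_needed(peak_qps: float, qps_per_replica: float,
                    target_utilization: float = 0.6) -> int:
    """Replicas required so each one stays at or below the target utilization.

    All inputs are hypothetical; measure real capacity with a load test.
    """
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    usable_qps = qps_per_replica * target_utilization
    return max(1, math.ceil(peak_qps / usable_qps))


# Hypothetical figures: one 8-vCPU replica saturates at 200 QPS.
baseline_qps = 100.0
holiday_qps = 4 * baseline_qps  # "four times more traffic"

print(replicas_needed(baseline_qps, qps_per_replica=200.0))
print(replicas_needed(holiday_qps, qps_per_replica=200.0))
```

Under these assumed numbers, the same machine type serves the holiday load once the replica count is allowed to grow, which is the behavior utilization-based autoscaling is meant to provide.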

Questions 76

As the lead ML Engineer for your company, you are responsible for building ML models to digitize scanned customer forms. You have developed a TensorFlow model that converts the scanned images into text and stores them in Cloud Storage. You need to use your ML model on the aggregated data collected at the end of each day with minimal manual intervention. What should you do?

Options:

A.

Use the batch prediction functionality of AI Platform.

B.

Create a serving pipeline in Compute Engine for prediction

C.

Use Cloud Functions for prediction each time a new data point is ingested

D.

Deploy the model on AI Platform and create a version of it for online inference.

Questions 77

You work for a manufacturing company. You need to train a custom image classification model to detect product defects at the end of an assembly line. Although your model is performing well, some images in your holdout set are consistently mislabeled with high confidence. You want to use Vertex AI to understand your model's results. What should you do?

Options:

A.

Configure feature-based explanations by using sampled Shapley. Set number of feature permutations to the maximum value of 50.

B.

Create an index by using Vertex AI Matching Engine. Query the index with your mislabeled images.

C.

Configure example-based explanations by using integrated gradients. Set visualization type to pixels, and set clip_percent_upperbound to 95.

D.

Configure example-based explanations. Specify the embedding output layer to be used for the latent space representation.
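
The idea of a latent space representation mentioned in this option can be illustrated with a small nearest-neighbour lookup. This is only a toy sketch in plain Python, not the Vertex AI explanation service: the embedding vectors, the example names, and the helper functions are invented for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_examples(query, training_embeddings, k=2):
    """Return the names of the k training examples closest to the query."""
    ranked = sorted(
        training_embeddings.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]


# Hypothetical embeddings taken from a model's embedding output layer.
training_embeddings = {
    "train_001_ok": [0.9, 0.1, 0.0],
    "train_002_defect": [0.1, 0.9, 0.2],
    "train_003_ok": [0.8, 0.2, 0.1],
}
mislabeled_query = [0.85, 0.15, 0.05]

neighbours = nearest_examples(mislabeled_query, training_embeddings, k=2)
print(neighbours)
```

Inspecting which training examples sit closest to a confidently mislabeled image can hint at whether the error stems from mislabeled or unrepresentative training data.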

Questions 78

You are experimenting with a built-in distributed XGBoost model in Vertex AI Workbench user-managed notebooks. You use BigQuery to split your data into training and validation sets using the following queries:

CREATE OR REPLACE TABLE `myproject.mydataset.training` AS

(SELECT * FROM `myproject.mydataset.mytable` WHERE RAND() <= 0.8);

CREATE OR REPLACE TABLE `myproject.mydataset.validation` AS

(SELECT * FROM `myproject.mydataset.mytable` WHERE RAND() <= 0.2);

After training the model, you achieve an area under the receiver operating characteristic curve (AUC ROC) value of 0.8, but after deploying the model to production, you notice that your model performance has dropped to an AUC ROC value of 0.65. What problem is most likely occurring?

Options:

A.

There is training-serving skew in your production environment.

B.

There is not a sufficient amount of training data.

C.

The tables that you created to hold your training and validation records share some records, and you may not be using all the data in your initial table.

D.

The RAND() function generated a number that is less than 0.2 in both instances, so every record in the validation table will also be in the training table.
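
The effect of the two queries above can be simulated outside BigQuery. The sketch below assumes, as in BigQuery, that the random value is drawn independently for every row in every query; the row count, the seed, and the function name `random_split` are arbitrary choices for illustration.

```python
import random


def random_split(num_rows: int, seed: int = 0):
    """Mimic two independent per-row random filters over the same table."""
    rng = random.Random(seed)
    rows = range(num_rows)
    # First query: keep each row with probability 0.8.
    training = {row for row in rows if rng.random() <= 0.8}
    # Second query: a fresh, independent draw per row, kept with probability 0.2.
    validation = {row for row in rows if rng.random() <= 0.2}
    return training, validation


num_rows = 100_000
training, validation = random_split(num_rows)
overlap = training & validation
unused = set(range(num_rows)) - training - validation

print(f"rows in both tables: {len(overlap) / num_rows:.1%}")
print(f"rows in neither table: {len(unused) / num_rows:.1%}")
```

Because the draws are independent, roughly 0.8 × 0.2 = 16% of rows are expected to land in both tables and roughly 0.2 × 0.8 = 16% in neither. A deterministic split, for example one based on a hash of a row key, avoids both effects.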

Questions 79

You have recently trained a scikit-learn model that you plan to deploy on Vertex AI. This model will support both online and batch prediction. You need to preprocess input data for model inference. You want to package the model for deployment while minimizing additional code. What should you do?

Options:

A.

1. Upload your model to the Vertex AI Model Registry by using a prebuilt scikit-learn prediction container.

2. Deploy your model to Vertex AI Endpoints, and create a Vertex AI batch prediction job that uses the instanceConfig.instanceType setting to transform your input data.

B.

1. Wrap your model in a custom prediction routine (CPR), and build a container image from the CPR local model.

2. Upload your scikit-learn model container to Vertex AI Model Registry.

3. Deploy your model to Vertex AI Endpoints, and create a Vertex AI batch prediction job.

C.

1. Create a custom container for your scikit-learn model.

2. Define a custom serving function for your model.

3. Upload your model and custom container to Vertex AI Model Registry.

4. Deploy your model to Vertex AI Endpoints, and create a Vertex AI batch prediction job.

D.

1. Create a custom container for your scikit-learn model.

2. Upload your model and custom container to Vertex AI Model Registry.

3. Deploy your model to Vertex AI Endpoints, and create a Vertex AI batch prediction job that uses the instanceConfig.instanceType setting to transform your input data.

Questions 80

You have a functioning end-to-end ML pipeline that involves tuning the hyperparameters of your ML model using AI Platform, and then using the best-tuned parameters for training. Hypertuning is taking longer than expected and is delaying the downstream processes. You want to speed up the tuning job without significantly compromising its effectiveness. Which actions should you take?

Choose 2 answers

Options:

A.

Decrease the number of parallel trials

B.

Decrease the range of floating-point values

C.

Set the early stopping parameter to TRUE

D.

Change the search algorithm from Bayesian search to random search.

E.

Decrease the maximum number of trials during subsequent training phases.

Questions 81

You need to develop an image classification model by using a large dataset that contains labeled images in a Cloud Storage Bucket. What should you do?

Options:

A.

Use Vertex AI Pipelines with the Kubeflow Pipelines SDK to create a pipeline that reads the images from Cloud Storage and trains the model.

B.

Use Vertex AI Pipelines with TensorFlow Extended (TFX) to create a pipeline that reads the images from Cloud Storage and trains the model.

C.

Import the labeled images as a managed dataset in Vertex AI, and use AutoML to train the model.

D.

Convert the image dataset to a tabular format using Dataflow. Load the data into BigQuery and use BigQuery ML to train the model.

Questions 82

You are developing an ML model intended to classify whether X-ray images indicate bone fracture risk. You have trained a ResNet architecture on Vertex AI using a TPU as an accelerator; however, you are unsatisfied with the training time and memory usage. You want to quickly iterate on your training code but make minimal changes to the code. You also want to minimize the impact on the model's accuracy. What should you do?

Options:

A.

Configure your model to use bfloat16 instead of float32.

B.

Reduce the global batch size from 1024 to 256

C.

Reduce the number of layers in the model architecture

D.

Reduce the dimensions of the images used in the model.
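
For readers unfamiliar with the two number formats named in option A, the sketch below shows how bfloat16 relates to float32: it keeps the sign bit and the full 8-bit exponent and drops the low 16 mantissa bits. The helper names are invented for this example, and the conversion simply truncates, whereas real hardware typically rounds, so treat it as an illustration of the bit layout only.

```python
import struct


def float32_bits(value: float) -> int:
    """Return the 32-bit IEEE 754 single-precision encoding of value."""
    return struct.unpack(">I", struct.pack(">f", value))[0]


def to_bfloat16_bits(value: float) -> int:
    """Keep the top 16 bits of the float32 encoding (truncation, no rounding)."""
    return float32_bits(value) >> 16


def from_bfloat16_bits(bits: int) -> float:
    """Expand 16 bfloat16 bits back to a float by zero-filling the low bits."""
    return struct.unpack(">f", struct.pack(">I", bits << 16))[0]


# Half the storage per value: 16 bits instead of 32.
pi_as_bfloat16 = from_bfloat16_bits(to_bfloat16_bits(3.14159265))
print(pi_as_bfloat16)

# The exponent range matches float32, so very large values survive
# the conversion with only a small relative error.
large = 3.0e38
large_roundtrip = from_bfloat16_bits(to_bfloat16_bits(large))
print(abs(large_roundtrip - large) / large)
```

The trade-off visible here is reduced precision (about three significant decimal digits) in exchange for half the memory per value and an unchanged dynamic range.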

Questions 83

You have created a Vertex AI pipeline that includes two steps. The first step preprocesses 10 TB of data, completes in about 1 hour, and saves the result in a Cloud Storage bucket. The second step uses the processed data to train a model. You need to update the model's code to allow you to test different algorithms. You want to reduce pipeline execution time and cost, while also minimizing pipeline changes. What should you do?

Options:

A.

Add a pipeline parameter and an additional pipeline step. Depending on the parameter value, the pipeline step conducts or skips data preprocessing and starts model training.

B.

Create another pipeline without the preprocessing step, and hardcode the preprocessed Cloud Storage file location for model training.

C.

Configure a machine with more CPU and RAM from the compute-optimized machine family for the data preprocessing step.

D.

Enable caching for the pipeline job, and disable caching for the model training step.
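
The general mechanism of step-level execution caching referred to in option D can be sketched with a toy pipeline runner. This is not the Vertex AI Pipelines implementation: the `StepCache` class, the step functions, and the `gs://example-bucket/raw` path are hypothetical and exist only to show how a step whose inputs are unchanged can be skipped while another step always reruns.

```python
import hashlib
import json


class StepCache:
    """Toy runner that reuses a step's output when its inputs are unchanged."""

    def __init__(self):
        self._results = {}
        self.executions = []  # names of steps that actually ran

    def run(self, name, func, inputs, enable_caching=True):
        payload = json.dumps({"step": name, "inputs": inputs}, sort_keys=True)
        key = hashlib.sha256(payload.encode()).hexdigest()
        if enable_caching and key in self._results:
            return self._results[key]  # cache hit: skip execution
        self.executions.append(name)
        result = func(**inputs)
        self._results[key] = result
        return result


def preprocess(source):
    # Stand-in for the expensive one-hour preprocessing step.
    return f"{source}/processed"


def train(data, algorithm):
    # Stand-in for model training.
    return f"{algorithm} trained on {data}"


def run_pipeline(cache, algorithm):
    data = cache.run("preprocess", preprocess,
                     {"source": "gs://example-bucket/raw"})
    # Caching is disabled for training so new code always executes.
    return cache.run("train", train,
                     {"data": data, "algorithm": algorithm},
                     enable_caching=False)


cache = StepCache()
run_pipeline(cache, "xgboost")
run_pipeline(cache, "dnn")
print(cache.executions)
```

In this sketch the preprocessing step runs once across both pipeline runs, while training runs each time.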

Questions 84

You work for the AI team of an automobile company, and you are developing a visual defect detection model using TensorFlow and Keras. To improve your model performance, you want to incorporate some image augmentation functions such as translation, cropping, and contrast tweaking. You randomly apply these functions to each training batch. You want to optimize your data processing pipeline for run time and compute resources utilization. What should you do?

Options:

A.

Embed the augmentation functions dynamically in the tf.data pipeline.

B.

Embed the augmentation functions dynamically as part of Keras generators.

C.

Use Dataflow to create all possible augmentations, and store them as TFRecords.

D.

Use Dataflow to create the augmentations dynamically per training run, and stage them as TFRecords.

Questions 85

You work as an analyst at a large banking firm. You are developing a robust, scalable ML pipeline to train several regression and classification models. Your primary focus for the pipeline is model interpretability. You want to productionize the pipeline as quickly as possible. What should you do?

Options:

A.

Use Tabular Workflow for Wide & Deep through Vertex AI Pipelines to jointly train wide linear models and deep neural networks.

B.

Use Google Kubernetes Engine to build a custom training pipeline for XGBoost-based models.

C.

Use Tabular Workflow for TabNet through Vertex AI Pipelines to train attention-based models.

D.

Use Cloud Composer to build the training pipelines for custom deep learning-based models.

Exam Name: Google Professional Machine Learning Engineer
Last Update: Dec 28, 2024
Questions: 285
