Google Cloud Professional ML Engineer

Practice Exam Questions & Study Guide

1

Architecting Low-Code AI Solutions

Build AI solutions using BigQuery ML, ML APIs, foundational models, and AutoML

Section Overview

1.1: Developing ML Models by Using BigQuery ML

3 Question Sets Available

This section explores the practical application of BigQuery ML for solving real-world business problems. You learn to identify the right BigQuery ML model for different tasks, including linear and binary classification, regression, time series analysis, and more. We delve into feature engineering techniques within BigQuery ML to optimize model accuracy. Finally, you learn how to evaluate model performance by analyzing key metrics like R-squared, precision, recall, and F1-score, and generate both batch and online predictions using your trained models.

Focus Areas:
  • Building the appropriate BigQuery ML model (e.g., linear and binary classification, regression, time-series, matrix factorization, boosted trees, autoencoders) based on the business problem
  • Feature engineering or selection using BigQuery ML
  • Generating predictions by using BigQuery ML
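
The train, evaluate, and predict steps described above map onto three BigQuery ML SQL statements. The sketch below only builds those statements as strings so the shape of each one is visible; it does not run them. The model name, table names, and label column are hypothetical placeholders, and the task-to-model mapping covers only a small subset of the model types listed in the focus areas.

```python
# Hypothetical names: `my_dataset.churn_model`, `my_dataset.customers`, and the
# `churned` label column are placeholders for illustration only.

# Business problem -> BigQuery ML model_type (small supervised subset).
MODEL_TYPES = {
    "binary_classification": "LOGISTIC_REG",
    "regression": "LINEAR_REG",
    "boosted_tree_classification": "BOOSTED_TREE_CLASSIFIER",
}


def create_model_sql(model: str, task: str, label: str, source_table: str) -> str:
    """Build a CREATE MODEL statement for a supervised task."""
    model_type = MODEL_TYPES[task]
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        f"OPTIONS (model_type = '{model_type}', input_label_cols = ['{label}']) AS\n"
        f"SELECT * FROM `{source_table}`"
    )


def evaluate_sql(model: str) -> str:
    """Build a query that returns evaluation metrics for a trained model."""
    return f"SELECT * FROM ML.EVALUATE(MODEL `{model}`)"


def predict_sql(model: str, input_table: str) -> str:
    """Build a batch prediction query over every row of input_table."""
    return f"SELECT * FROM ML.PREDICT(MODEL `{model}`, TABLE `{input_table}`)"


if __name__ == "__main__":
    print(create_model_sql("my_dataset.churn_model", "binary_classification",
                           "churned", "my_dataset.customers"))
    print(evaluate_sql("my_dataset.churn_model"))
    print(predict_sql("my_dataset.churn_model", "my_dataset.new_customers"))
```

In practice these strings would be submitted through the BigQuery console or a client library; that step is omitted here because it depends on project and credential setup.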

1.2: Building AI Solutions by Using ML APIs or Foundational Models

2 Question Sets Available

This section provides a practical exploration of building AI-powered applications using pre-trained models and APIs available on Google Cloud. It describes how to select the appropriate Model Garden API for tasks like image classification and language translation, and then integrate it into your application. It also covers using industry-specific APIs for specialized tasks like document processing and retail recommendations. Finally, in this section you gain experience building Retrieval Augmented Generation (RAG) applications with Vertex AI Agent Builder, to leverage external knowledge sources for more comprehensive and informed AI solutions.

Focus Areas:
  • Building applications by using ML APIs (e.g., Cloud Vision API, Natural Language API, Cloud Speech API, Translation) from Model Garden
  • Building applications by using industry-specific APIs (e.g., Document AI API, Retail API)
  • Implementing retrieval augmented generation (RAG) applications with Vertex AI Agent Builder by leveraging pre-built components and minimal coding for faster development, or utilizing visual, no-code tools without writing any code
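
To make the RAG idea concrete, here is a toy sketch of the two steps every RAG application performs: retrieve the most relevant passage, then place it in the prompt sent to the model. It deliberately does not use Vertex AI Agent Builder or any Google Cloud API. The word-overlap cosine similarity stands in for a real embedding-based retriever, and the sample documents and function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> Counter:
    """Lowercase bag-of-words counts; a stand-in for real embeddings."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query: str, documents: list[str], k: int = 1) -> list[str]:
    """Return the k documents most similar to the query."""
    q = tokenize(query)
    ranked = sorted(documents, key=lambda d: cosine(q, tokenize(d)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, documents: list[str], k: int = 1) -> str:
    """Ground the question in retrieved context before calling a model."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Hypothetical knowledge base.
DOCS = [
    "Refunds are processed within five business days.",
    "Standard shipping takes three to seven days.",
    "Support is available by chat on weekdays.",
]

if __name__ == "__main__":
    print(build_prompt("How long does shipping take?", DOCS))
```

A managed RAG service replaces `retrieve` with an indexed, embedding-based search over your data stores; the prompt-assembly step stays conceptually the same.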

1.3: Training Models by Using AutoML

Sets Available

This section focuses on preparing your data for use with AutoML in Vertex AI. It describes how to organize various data types, including tabular, text, images, and videos, for optimal model training. The section also covers data management techniques within Vertex AI, preprocessing steps using tools like Dataflow and BigQuery, and the creation of feature stores. Additionally, it explains the crucial role of feature selection and data labeling in AutoML and explores responsible AI practices by examining privacy implications and how to handle sensitive data.

Focus Areas:
  • Preparing data for AutoML (e.g., feature selection, data labeling, Tabular Workflows on AutoML)
  • Using available data (e.g., tabular, text, speech, images, videos) to train custom models
  • Using AutoML for tabular data
  • Creating forecasting models using AutoML
  • Configuring and debugging trained models

2

Collaborating to Manage Data and Models

Explore and preprocess data, prototype with Jupyter notebooks, track ML experiments

Section Overview

2.1: Exploring and Preprocessing Organization-Wide Data

Sets Available

This section covers the crucial steps in preparing and managing your data for machine learning tasks on Google Cloud. It describes how to choose the most suitable storage service for different data types and volumes, considering factors like cost and access patterns. This section explores data preprocessing techniques using tools like Dataflow, TFX, and BigQuery, covering essential steps such as data cleaning, transformation, and feature engineering. Finally, the section emphasizes responsible AI practices by highlighting the importance of data privacy and security, particularly when dealing with sensitive information. It also explains anonymization techniques and Google Cloud tools that help ensure compliance with privacy regulations.

Focus Areas:
  • Organizing different types of data (e.g., tabular, text, speech, images, videos) for efficient training
  • Managing datasets in Vertex AI
  • Data preprocessing (e.g., Dataflow, TensorFlow Extended [TFX], BigQuery)
  • Creating and consolidating features in Vertex AI Feature Store
  • Privacy implications of data usage and/or collection (e.g., handling sensitive data such as personally identifiable information [PII] and protected health information [PHI])
  • Ingesting different data sources (e.g., text documents) into Vertex AI for inference
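
Two common ways of handling PII before training are redaction (removing the value) and pseudonymization (replacing it with a stable token). The sketch below illustrates both with the Python standard library only. It is not the Google Cloud tooling the section refers to; the regular expression is a simplified email pattern and the key is a hypothetical placeholder that would normally come from a secret manager.

```python
import hashlib
import hmac
import re

# Simplified email pattern; real PII detection needs far broader coverage.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_emails(text: str) -> str:
    """Replace every email address in free text with a fixed placeholder."""
    return EMAIL_RE.sub("[EMAIL]", text)


def pseudonymize(value: str, key: bytes) -> str:
    """Keyed hash: the same value and key always yield the same token,
    so records can still be joined without exposing the raw value."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


if __name__ == "__main__":
    key = b"hypothetical-secret-key"  # placeholder; never hard-code real keys
    print(redact_emails("Contact jane.doe@example.com today"))
    print(pseudonymize("jane.doe@example.com", key))
```

Redaction suits free-text fields that are not needed as features, while pseudonymization keeps an identifier usable as a join key across tables.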

2.2: Model Prototyping Using Jupyter Notebooks

Coming Soon

This section explores setting up and managing your machine learning development environment in Google Cloud. It explains the different Jupyter backend options, such as Vertex AI Workbench and Dataproc, and describes how to choose the best one for your needs. It covers essential security best practices in Vertex AI Workbench and Colab Enterprise to ensure your data and code remain protected. It also describes the advantages of using Spark kernels for large-scale data processing and how to integrate your notebooks with code repositories like Git for efficient version control and collaboration.

Focus Areas:
  • Choosing the appropriate Jupyter backend on Google Cloud (e.g., Vertex AI Workbench, notebooks on Dataproc)
  • Applying security best practices in Vertex AI Workbench and Colab Enterprise
  • Using Spark kernels
  • Integration with code source repositories
  • Developing models in Vertex AI Workbench by using common frameworks (e.g., TensorFlow, PyTorch, Scikit-learn, Spark, JAX)
  • Leveraging a variety of foundational and open-source models in Model Garden

2.3: Tracking and Running ML Experiments

Coming Soon

This section focuses on building and evaluating machine learning models with a particular emphasis on generative AI. It explains how to select the optimal Google Cloud environment for your development and experimentation needs, choosing from options like Vertex AI Experiments, Kubeflow Pipelines, and Vertex AI TensorBoard. It delves into the nuances of evaluating generative AI solutions, considering factors like accuracy, creativity, bias, and ethical implications. It also gives practical experience integrating Vertex AI TensorBoard with popular frameworks like TensorFlow and PyTorch, enabling you to effectively visualize and analyze model performance, identify potential bottlenecks, and optimize your models for better results.

Focus Areas:
  • Choosing the appropriate Google Cloud environment for development and experimentation (e.g., Vertex AI Experiments, Kubeflow Pipelines, Vertex AI TensorBoard with TensorFlow and PyTorch) given the framework
  • Evaluating generative AI solutions

3

Scaling Prototypes into ML Models

Build models, train at scale, and choose appropriate hardware for production

Section Overview

3.1: Building Models

Coming Soon

This section delves into the critical considerations for selecting the right tools and techniques for building interpretable machine learning models. It explains how to choose the most suitable ML framework for your project, considering factors like model development, training, and deployment. It also explores various modeling techniques and discusses how interpretability requirements can influence your choices, highlighting the trade-offs between model complexity and explainability.

Focus Areas:
  • Choosing ML framework and model architecture
  • Modeling techniques given interpretability requirements

3.2: Training Models

Coming Soon

This section provides a comprehensive guide to training machine learning models on Google Cloud. It explains how to organize and ingest various data types for training, utilize different SDKs like Vertex AI and Kubeflow, and implement distributed training for reliable pipelines. It covers crucial aspects of the training process, including hyperparameter tuning and troubleshooting common training failures. Finally, it explores techniques for fine-tuning foundational models from Model Garden using Vertex AI, enabling you to leverage pre-trained models for your specific needs.

Focus Areas:
  • Organizing training data (e.g., tabular, text, speech, images, videos) on Google Cloud (e.g., Cloud Storage, BigQuery)
  • Ingestion of various file types (e.g., CSV, JSON, images, Hadoop, databases) into training
  • Training using different SDKs (e.g., Vertex AI custom training, Kubeflow on Google Kubernetes Engine, AutoML, tabular workflows)
  • Using distributed training to organize reliable pipelines
  • Hyperparameter tuning
  • Troubleshooting ML model training failures
  • Fine-tuning foundational models (e.g., Vertex AI, Model Garden)
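
Hyperparameter tuning boils down to running many trials over a search space and keeping the best one. The sketch below uses exhaustive grid search because it is the simplest strategy to read; managed tuning services generally offer smarter search strategies. The `validation_loss` function is a hypothetical stand-in for a real train-and-validate run, and the search space values are arbitrary.

```python
from itertools import product


def validation_loss(learning_rate: float, batch_size: int) -> float:
    """Hypothetical objective standing in for a real training trial."""
    return (learning_rate - 0.01) ** 2 * 1e4 + abs(batch_size - 64) / 64


def grid_search(space: dict) -> tuple[dict, float]:
    """Evaluate every combination in the space and return the best trial."""
    names = list(space)
    best_params, best_loss = None, float("inf")
    for values in product(*(space[n] for n in names)):
        params = dict(zip(names, values))
        loss = validation_loss(**params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss


# Arbitrary example search space.
SPACE = {"learning_rate": [0.001, 0.01, 0.1], "batch_size": [32, 64, 128]}

if __name__ == "__main__":
    print(grid_search(SPACE))
```

Grid search cost grows multiplicatively with each added hyperparameter, which is why larger spaces usually call for random or Bayesian search instead.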

3.3: Choosing Appropriate Hardware for Training

Coming Soon

This section focuses on optimizing your model training process through strategic hardware and infrastructure choices. It describes the diverse compute and accelerator options available on Google Cloud, including CPUs, GPUs, TPUs, and edge devices, and how to select the best fit for your model's needs. It delves into distributed training techniques using TPUs and GPUs, exploring tools like Reduction Server on Vertex AI and Horovod. The section also provides a comparative analysis of GPUs and TPUs, helping you understand their trade-offs and make informed decisions based on your model architecture, computational demands, and budget constraints.
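
In data-parallel distributed training, each worker computes gradients on its own shard of the batch, and the workers then average those gradients before updating the weights. Tools such as Reduction Server and Horovod exist to make that averaging (all-reduce) step fast. The sketch below shows only what the step computes, using plain Python lists in a single process; it says nothing about how those tools implement it.

```python
def all_reduce_mean(worker_grads: list[list[float]]) -> list[float]:
    """Average gradients element-wise across workers."""
    n = len(worker_grads)
    return [sum(column) / n for column in zip(*worker_grads)]


def sgd_step(weights: list[float], grad: list[float], lr: float) -> list[float]:
    """Apply one gradient descent update with the averaged gradient."""
    return [w - lr * g for w, g in zip(weights, grad)]


if __name__ == "__main__":
    # Two hypothetical workers, each holding a gradient for two parameters.
    averaged = all_reduce_mean([[1.0, 2.0], [3.0, 4.0]])
    print(sgd_step([1.0, 1.0], averaged, lr=0.5))
```

Because every worker applies the same averaged gradient, all replicas hold identical weights after each step.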

Focus Areas:
  • Evaluation of compute and accelerator options (e.g., CPU, GPU, TPU, edge devices)
  • Distributed training with TPUs and GPUs (e.g., Reduction Server on Vertex AI, Horovod)

4

Serving and Scaling Models

Deploy models for batch and online inference, and scale model serving in production

Section Overview

4.1: Serving Models

Coming Soon

This section explores the process of deploying and managing machine learning models for inference. It explains batch and online inference methods, comparing their strengths and weaknesses, and how to choose the right Google Cloud service for your needs, including Vertex AI, Dataflow, BigQuery ML, and Dataproc. It examines factors to consider when selecting hardware for low-latency predictions and explores options for serving models built with different frameworks, like PyTorch and XGBoost. Finally, the section covers the importance of organizing models within a model registry for version control and streamlined deployment management.

Focus Areas:
  • Batch and online inference (e.g., Vertex AI, Dataflow, BigQuery ML, Dataproc)
  • Using different frameworks (e.g., PyTorch, XGBoost) to serve models
  • Organizing a model registry
  • A/B testing different versions of a model
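
A/B testing model versions requires splitting traffic between them in fixed proportions. The sketch below is one way to do that in application code: it hashes a request or user ID into one of 100 buckets so that the same ID always reaches the same version. The version names and percentages are hypothetical, and sticky hash-based assignment is an assumption of this sketch, not a description of how any managed endpoint routes traffic.

```python
import hashlib


def assign_version(request_id: str, split: dict[str, int]) -> str:
    """Map an ID to a model version; split holds percentages summing to 100."""
    if sum(split.values()) != 100:
        raise ValueError("traffic percentages must sum to 100")
    digest = hashlib.sha256(request_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in [0, 100)
    cumulative = 0
    for version, percent in split.items():
        cumulative += percent
        if bucket < cumulative:
            return version
    raise AssertionError("unreachable when percentages sum to 100")


if __name__ == "__main__":
    split = {"v1": 90, "v2": 10}  # hypothetical version names
    print(assign_version("user-42", split))
```

Sticky assignment keeps each user's experience consistent during the experiment, which makes per-user outcome metrics easier to compare between versions.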

4.2: Scaling Online Model Serving

Coming Soon

This section delves into optimizing the performance and scalability of your deployed machine learning models. It describes how to leverage Vertex AI Feature Store for efficient feature access during online prediction requests and how to choose between public and private endpoints for secure model serving. It explores strategies for scaling your serving backend to handle increased traffic, including Vertex AI Prediction and containerized serving. This section also covers selecting appropriate hardware for serving, considering factors like model complexity and latency requirements. Finally, it covers techniques for tuning your models to optimize performance in a production environment, focusing on aspects like simplification, latency reduction, and memory optimization.
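
Scaling a serving backend for throughput starts with a capacity estimate: how many replicas are needed for the expected peak load. The back-of-the-envelope sketch below shows that arithmetic. All parameter names, defaults, and the 20% headroom are hypothetical, and this is not the autoscaling algorithm of any particular service.

```python
import math


def replicas_needed(peak_qps: float, qps_per_replica: float,
                    min_replicas: int = 1, max_replicas: int = 10,
                    headroom: float = 0.2) -> int:
    """Estimate replica count for a peak load, clamped to configured bounds."""
    required = math.ceil(peak_qps * (1 + headroom) / qps_per_replica)
    return max(min_replicas, min(max_replicas, required))


if __name__ == "__main__":
    # Hypothetical load figures.
    print(replicas_needed(peak_qps=450, qps_per_replica=100))
```

The per-replica throughput figure should come from load testing the actual model on the chosen hardware, since it depends heavily on model size and latency targets.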

Focus Areas:
  • Vertex AI Feature Store
  • Vertex AI public and private endpoints
  • Choosing appropriate hardware (e.g., CPU, GPU, TPU, edge)
  • Scaling the serving backend based on the throughput (e.g., Vertex AI Prediction, containerized serving)
  • Tuning ML models for training and serving in production (e.g., simplification techniques, optimizing the ML solution for increased performance, latency, memory, throughput)

5

Automating and Orchestrating ML Pipelines

Develop end-to-end pipelines, automate retraining, and track metadata

Section Overview

5.1: Developing End-to-End ML Pipelines

Coming Soon

This section explores building and managing robust machine learning pipelines on Google Cloud. It explains the crucial role of data and model validation in ensuring reliable ML solutions and how to maintain consistent data preprocessing between training and serving stages. It delves into different orchestration frameworks like Kubeflow Pipelines, Vertex AI Pipelines, and Cloud Composer, comparing their strengths and weaknesses. This section also examines the advantages and challenges of hybrid and multi-cloud strategies for ML pipelines, providing a comprehensive view of building and deploying ML solutions in diverse environments.

Focus Areas:
  • Data and model validation
  • Ensuring consistent data preprocessing between training and serving
  • Hosting third-party pipelines on Google Cloud (e.g., MLflow)
  • Identifying components, parameters, triggers, and compute needs (e.g., Cloud Build, Cloud Run)
  • Orchestration framework (e.g., Kubeflow Pipelines, Vertex AI Pipelines, Cloud Composer)
  • Hybrid or multi-cloud strategies
  • System design with TFX components or Kubeflow DSL (e.g., Dataflow)
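
Consistent preprocessing between training and serving usually means fitting transformation parameters once on the training data, saving them as an artifact, and applying the same transformation code at serving time. The sketch below shows that pattern for a single standardized feature. It uses only the standard library; the function names are hypothetical, and JSON stands in for whatever artifact format a real pipeline would use.

```python
import json
import statistics


def fit_scaler(values: list[float]) -> dict:
    """Compute scaling parameters from TRAINING data only."""
    return {"mean": statistics.fmean(values), "stdev": statistics.pstdev(values)}


def transform(value: float, params: dict) -> float:
    """Standardize one value; shared by the training and serving paths."""
    if not params["stdev"]:
        return 0.0
    return (value - params["mean"]) / params["stdev"]


# Training time: fit once and persist the parameters as an artifact.
TRAIN_VALUES = [10.0, 20.0, 30.0]
PARAMS = fit_scaler(TRAIN_VALUES)
SAVED_ARTIFACT = json.dumps(PARAMS)

# Serving time: load the saved parameters instead of recomputing them.
SERVING_PARAMS = json.loads(SAVED_ARTIFACT)

if __name__ == "__main__":
    print(transform(30.0, SERVING_PARAMS))
```

Recomputing the mean and standard deviation from serving traffic, or reimplementing `transform` separately in the serving stack, is a typical source of training-serving skew.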

5.2: Automating Model Retraining

Coming Soon

This section focuses on automating the crucial process of retraining your machine learning models to maintain their accuracy and effectiveness over time. It explains how to establish a robust retraining policy, considering factors that influence retraining frequency and triggers. It also explores the advantages of implementing continuous integration and continuous delivery (CI/CD) pipelines for automated model deployment, ensuring a streamlined workflow for building, testing, and deploying updated models. This section will equip you with the knowledge to keep your ML models performing optimally in a dynamic environment through automation.

Focus Areas:
  • Determining an appropriate retraining policy
  • Continuous integration and continuous delivery (CI/CD) model deployment (e.g., Cloud Build, Jenkins)
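
A retraining policy can be expressed as a small set of triggers that are checked on a schedule. The sketch below combines two common ones, model age and a drop in an evaluation metric relative to a baseline. The thresholds and names are hypothetical; a real policy would be tuned to how quickly the underlying data changes and to the cost of retraining.

```python
from dataclasses import dataclass


@dataclass
class RetrainingPolicy:
    max_model_age_days: int = 30   # hypothetical schedule trigger
    max_metric_drop: float = 0.05  # hypothetical performance trigger

    def should_retrain(self, model_age_days: int,
                       baseline_metric: float,
                       current_metric: float) -> list[str]:
        """Return the list of triggers that fired (empty means no retraining)."""
        reasons = []
        if model_age_days >= self.max_model_age_days:
            reasons.append("schedule")
        if baseline_metric - current_metric > self.max_metric_drop:
            reasons.append("performance")
        return reasons


if __name__ == "__main__":
    policy = RetrainingPolicy()
    print(policy.should_retrain(model_age_days=10,
                                baseline_metric=0.90, current_metric=0.80))
```

In a CI/CD setup, a non-empty result would kick off the training pipeline, and the new model would be deployed only after it passes the same validation checks as the original.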

5.3: Tracking and Auditing Metadata

Coming Soon

This section focuses on tracking and auditing metadata within your machine learning pipelines for improved transparency and reproducibility. It describes how to track and compare model artifacts and versions using tools like Vertex AI Experiments and Vertex ML Metadata. It explores techniques for implementing model and dataset versioning, ensuring reproducibility and change tracking. Finally, this section covers the concept of model and data lineage, highlighting its importance in understanding and auditing ML pipelines, and how to effectively track lineage using Google Cloud tools.
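
Lineage can be pictured as a directed graph in which each artifact points back to the artifacts it was derived from. The in-memory sketch below records those edges and walks them to answer the question "what went into this model?". The class and the artifact names are hypothetical and are not the API of Vertex ML Metadata.

```python
from collections import defaultdict


class LineageGraph:
    """Minimal lineage store: artifact -> artifacts it was derived from."""

    def __init__(self) -> None:
        self._inputs: dict[str, set[str]] = defaultdict(set)

    def record(self, output: str, inputs: list[str]) -> None:
        """Record that `output` was produced from `inputs`."""
        self._inputs[output].update(inputs)

    def upstream(self, artifact: str) -> set[str]:
        """Return every direct and indirect ancestor of an artifact."""
        seen: set[str] = set()
        stack = [artifact]
        while stack:
            for parent in self._inputs.get(stack.pop(), ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


if __name__ == "__main__":
    graph = LineageGraph()
    graph.record("dataset:v2", ["raw:2024"])          # hypothetical artifacts
    graph.record("model:v3", ["dataset:v2", "config:a"])
    print(sorted(graph.upstream("model:v3")))
```

Versioned artifact names (such as `dataset:v2`) are what make the graph useful for audits, since they tie a deployed model to the exact data and configuration that produced it.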

Focus Areas:
  • Tracking and comparing model artifacts and versions (e.g., Vertex AI Experiments, Vertex ML Metadata)
  • Hooking into model and dataset versioning
  • Model and data lineage

6

Monitoring ML Solutions

Identify risks, monitor performance, and test and troubleshoot ML solutions

Section Overview

6.1: Identifying Risks to ML Solutions

Coming Soon

This section focuses on identifying and mitigating potential risks associated with machine learning solutions. It covers security risks, including unintentional data exploitation and hacking, and how to address them through access control, encryption, and model hardening. It explores Google's Responsible AI practices, emphasizing fairness, privacy, transparency, and accountability in ML development. This section also covers identifying and mitigating biases in data, algorithms, and evaluations, and how to assess the overall readiness of your ML solution for production. Finally, it describes model explainability and how to leverage Vertex AI Explainable AI for gaining insights into model predictions and identifying potential biases or errors.

Focus Areas:
  • Building secure ML systems (e.g., protecting against unintentional exploitation of data or models, hacking)
  • Aligning with Google's Responsible AI practices (e.g., biases)
  • Assessing ML solution readiness (e.g., data bias, fairness)
  • Model explainability on Vertex AI (e.g., Vertex AI Prediction)
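
One simple way to quantify the bias checks mentioned above is the demographic parity difference: the gap between the highest and lowest positive-prediction rates across groups. The sketch below computes it for binary predictions. It is one of several fairness metrics and is offered here as an illustration, not as the metric prescribed by any Google Cloud tool; the group names and predictions are made up.

```python
def positive_rate(predictions: list[int]) -> float:
    """Fraction of predictions that are positive (1)."""
    return sum(predictions) / len(predictions)


def demographic_parity_difference(preds_by_group: dict[str, list[int]]) -> float:
    """Gap between the highest and lowest positive rates across groups.
    0.0 means every group receives positive predictions at the same rate."""
    rates = [positive_rate(p) for p in preds_by_group.values()]
    return max(rates) - min(rates)


if __name__ == "__main__":
    # Hypothetical binary predictions for two groups.
    preds = {"group_a": [1, 1, 0, 0], "group_b": [1, 0, 0, 0]}
    print(demographic_parity_difference(preds))
```

A large gap does not by itself prove unfair treatment, but it flags where to look more closely at the training data and features before declaring a solution production-ready.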

6.2: Monitoring, Testing, and Troubleshooting ML Solutions

Coming Soon

This section focuses on monitoring, testing, and troubleshooting your deployed machine learning solutions to ensure ongoing performance and reliability. It explains how to establish continuous evaluation metrics using tools like Vertex AI Model Monitoring and Explainable AI to track model performance and identify potential issues. It explores concepts like training-serving skew and feature attribution drift, understanding their causes and mitigation strategies. This section also covers monitoring model performance against baselines and simpler models, and across time, to detect performance degradation or overfitting. Finally, it explores common training and serving errors and how to troubleshoot them effectively using various techniques, like log analysis and debugging tools.

Focus Areas:
  • Establishing continuous evaluation metrics (e.g., Vertex AI Model Monitoring, Explainable AI)
  • Monitoring for training-serving skew
  • Monitoring for feature attribution drift
  • Monitoring model performance against baselines, simpler models, and across the time dimension
  • Common training and serving errors
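
Training-serving skew detection compares the distribution of a feature in the training data with its distribution in serving traffic and raises an alert when the two diverge past a threshold. The sketch below does this for a categorical feature using the L-infinity distance (the largest absolute difference in any category's frequency), which is one common choice for categorical data. The threshold value and sample data are hypothetical.

```python
from collections import Counter


def distribution(values: list[str]) -> dict[str, float]:
    """Relative frequency of each category."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


def l_infinity_distance(train_values: list[str], serving_values: list[str]) -> float:
    """Largest absolute frequency difference over all categories."""
    p, q = distribution(train_values), distribution(serving_values)
    return max(abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in set(p) | set(q))


def has_skew(train_values: list[str], serving_values: list[str],
             threshold: float = 0.3) -> bool:
    """Flag skew when the distance exceeds a (hypothetical) threshold."""
    return l_infinity_distance(train_values, serving_values) > threshold


if __name__ == "__main__":
    train = ["a"] * 5 + ["b"] * 5      # 50% / 50% in training
    serving = ["a"] * 9 + ["b"] * 1    # 90% / 10% in serving
    print(l_infinity_distance(train, serving), has_skew(train, serving))
```

The same comparison applied to two serving windows, rather than training versus serving, detects drift over time instead of skew.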