Course Summary

Professional Cloud DevOps Engineer
Professional Cloud DevOps Engineers implement processes throughout the systems development lifecycle using Google-recommended methodologies and tools. They build and deploy software and infrastructure delivery pipelines, optimize and maintain production systems and services, and balance service reliability with delivery speed.

The Professional Cloud DevOps Engineer exam assesses your ability to:

Bootstrap a Google Cloud organization for DevOps
Build and implement CI/CD pipelines for a service
Apply site reliability engineering practices to a service
Implement service monitoring strategies
Optimize service performance

Section 1: Bootstrapping a Google Cloud organization for DevOps (~17% of the exam)

1.1 Designing the overall resource hierarchy for an organization. Considerations include:

● Projects and folders

● Shared networking

● Identity and Access Management (IAM) roles and organization-level policies

● Creating and managing service accounts

1.2 Managing infrastructure as code. Considerations include:

● Infrastructure as code tooling (e.g., Cloud Foundation Toolkit, Config Connector, Terraform, Helm)

● Making infrastructure changes using Google-recommended practices and infrastructure as code blueprints

● Immutable architecture

1.3 Designing a CI/CD architecture stack in Google Cloud, hybrid, and multi-cloud environments. Considerations include:

● CI with Cloud Build

● CD with Google Cloud Deploy

● Widely used third-party tooling (e.g., Jenkins, Git, ArgoCD, Packer)

● Security of CI/CD tooling

1.4 Managing multiple environments (e.g., staging, production). Considerations include:

● Determining the number of environments and their purpose

● Creating environments dynamically for each feature branch with Google Kubernetes Engine (GKE) and Terraform

● Config Management

Section 2: Building and implementing CI/CD pipelines for a service (~23% of the exam)

2.1 Designing and managing CI/CD pipelines. Considerations include:

● Artifact management with Artifact Registry

● Deployment to hybrid and multi-cloud environments (e.g., Anthos, GKE)

● CI/CD pipeline triggers

● Testing a new application version in the pipeline

● Configuring deployment processes (e.g., approval flows)

● CI/CD of serverless applications

2.2 Implementing CI/CD pipelines. Considerations include:

● Auditing and tracking deployments (e.g., Artifact Registry, Cloud Build, Google Cloud Deploy, Cloud Audit Logs)

● Deployment strategies (e.g., canary, blue/green, rolling, traffic splitting)

● Rollback strategies

● Troubleshooting deployment issues

2.3 Managing CI/CD configuration and secrets. Considerations include:

● Secure storage methods and key rotation services (e.g., Cloud Key Management Service, Secret Manager)

● Secret management

● Build versus runtime secret injection

2.4 Securing the CI/CD deployment pipeline. Considerations include:

● Vulnerability analysis with Artifact Registry

● Binary Authorization

● IAM policies per environment

Section 3: Applying site reliability engineering practices to a service (~23% of the exam)

3.1 Balancing change, velocity, and reliability of the service. Considerations include:

● Discovering SLIs (e.g., availability, latency)

● Defining SLOs and understanding SLAs

● Error budgets

● Toil automation

● Opportunity cost of risk and reliability (e.g., number of “nines”)
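
As a worked illustration of error budgets and the cost of each additional "nine", the sketch below converts an availability SLO into the downtime it allows over a period. The function name `error_budget_minutes` and the 30-day window are assumptions chosen for illustration; they are not part of the exam guide.

```python
def error_budget_minutes(slo_percent: float, period_days: int = 30) -> float:
    """Return the downtime (in minutes) an availability SLO allows over a period."""
    if not 0 < slo_percent <= 100:
        raise ValueError("slo_percent must be in (0, 100]")
    total_minutes = period_days * 24 * 60
    # The error budget is the fraction of the period that may be unavailable.
    return total_minutes * (100 - slo_percent) / 100


if __name__ == "__main__":
    # Hypothetical SLO targets, one per additional "nine".
    for slo in (99.0, 99.9, 99.99):
        print(f"{slo}% over 30 days -> {error_budget_minutes(slo):.2f} minutes")
```

Each added nine cuts the allowed downtime by a factor of ten (roughly 43 minutes per 30 days at 99.9%), which is one way to frame the opportunity cost of a higher reliability target.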

3.2 Managing service lifecycle. Considerations include:

● Service management (e.g., introducing a new service by using a pre-service onboarding checklist, launch plan, or deployment plan; deployment; maintenance; and retirement)

● Capacity planning (e.g., quotas and limits management)

● Autoscaling using managed instance groups, Cloud Run, Cloud Functions, or GKE

● Implementing feedback loops to improve a service

3.3 Ensuring healthy communication and collaboration for operations. Considerations include:

● Preventing burnout (e.g., setting up automation processes)

● Fostering a culture of learning and blamelessness

● Establishing joint ownership of services to eliminate team silos

3.4 Mitigating incident impact on users. Considerations include:

● Communicating during an incident

● Draining/redirecting traffic

● Adding capacity

3.5 Conducting a postmortem. Considerations include:

● Documenting root causes

● Creating and prioritizing action items

● Communicating the postmortem to stakeholders

Section 4: Implementing service monitoring strategies (~21% of the exam)

4.1 Managing logs. Considerations include:

● Collecting structured and unstructured logs from Compute Engine, GKE, and serverless platforms using Cloud Logging

● Configuring the Cloud Logging agent

● Collecting logs from outside Google Cloud

● Sending application logs directly to the Cloud Logging API

● Log levels (e.g., info, error, debug, fatal)

● Optimizing logs (e.g., multiline logging, exceptions, size, cost)
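
To make the distinction between structured and unstructured logs concrete, here is a minimal sketch that writes one JSON object per line to standard output. It assumes the platform's logging agent parses JSON lines and treats the `severity` and `message` fields specially, which you should confirm for your runtime; the helpers `format_log_entry` and `log` and the `order_id` field are hypothetical.

```python
import json
import sys


def format_log_entry(severity: str, message: str, **fields) -> str:
    """Serialize one log entry as a single-line JSON string."""
    entry = {"severity": severity.upper(), "message": message}
    entry.update(fields)  # extra key/value pairs become structured fields
    # json.dumps escapes embedded newlines, so a multiline message
    # (such as a stack trace) still occupies exactly one output line.
    return json.dumps(entry)


def log(severity: str, message: str, **fields) -> None:
    """Write one structured entry to standard output."""
    print(format_log_entry(severity, message, **fields), file=sys.stdout, flush=True)


if __name__ == "__main__":
    log("info", "order accepted", order_id="A-123")
    log("error", "payment failed\nsecond line of detail", order_id="A-123")
```

Keeping each entry on one line is a simple way to sidestep the multiline problem, where an exception spread over several lines would otherwise be ingested as several unrelated entries.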

4.2 Managing metrics with Cloud Monitoring. Considerations include:

● Collecting and analyzing application and platform metrics

● Collecting networking and service mesh metrics

● Using Metrics Explorer for ad hoc metric analysis

● Creating custom metrics from logs

4.3 Managing dashboards and alerts in Cloud Monitoring. Considerations include:

● Creating a monitoring dashboard

● Filtering and sharing dashboards

● Configuring alerting

● Defining alerting policies based on SLOs and SLIs

● Automating alerting policy definition using Terraform

● Using Google Cloud Managed Service for Prometheus to collect metrics and set up monitoring and alerting
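
One common way to base an alerting policy on an SLO is to alert on burn rate, that is, how fast the error budget is being consumed relative to plan. The sketch below shows only the arithmetic; the function names and the default threshold of 10 are illustrative assumptions, not values taken from the exam guide or from Cloud Monitoring.

```python
def burn_rate(bad_events: int, total_events: int, slo_target: float) -> float:
    """Ratio of the observed error rate to the error rate the SLO allows.

    slo_target is a fraction, e.g. 0.999 for a 99.9% SLO.
    A burn rate of 1.0 spends the budget exactly over the SLO period.
    """
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_events <= 0:
        return 0.0  # no traffic in the window, so nothing was burned
    observed_error_rate = bad_events / total_events
    allowed_error_rate = 1 - slo_target
    return observed_error_rate / allowed_error_rate


def should_alert(bad_events: int, total_events: int, slo_target: float,
                 threshold: float = 10.0) -> bool:
    """Fire when the budget is burning at least `threshold` times too fast."""
    return burn_rate(bad_events, total_events, slo_target) >= threshold
```

For example, with a 99% SLO, 50 failures out of 1,000 requests is a burn rate of about 5, which would not trip the assumed threshold of 10, while 200 failures out of 1,000 (about 20) would.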

4.4 Managing Cloud Logging platform. Considerations include:

● Enabling data access logs (e.g., Cloud Audit Logs)

● Enabling VPC Flow Logs

● Viewing logs in the Google Cloud console

● Using basic versus advanced log filters

● Logs exclusion versus logs export

● Project-level versus organization-level export

● Managing and viewing log exports

● Sending logs to an external logging platform

● Filtering and redacting sensitive data (e.g., personally identifiable information [PII], protected health information [PHI])

4.5 Implementing logging and monitoring access controls. Considerations include:

● Restricting access to audit logs and VPC Flow Logs with Cloud Logging

● Restricting export configuration with Cloud Logging

● Allowing metric and log writing with Cloud Monitoring

Section 5: Optimizing the service performance (~16% of the exam)

5.1 Identifying service performance issues. Considerations include:

● Using Google Cloud’s operations suite to identify cloud resource utilization

● Interpreting service mesh telemetry

● Troubleshooting issues with compute resources

● Troubleshooting deploy time and runtime issues with applications

● Troubleshooting network issues (e.g., VPC Flow Logs, firewall logs, latency, network details)

5.2 Implementing debugging tools in Google Cloud. Considerations include:

● Application instrumentation

● Cloud Logging

● Cloud Trace

● Error Reporting

● Cloud Profiler

● Cloud Monitoring

5.3 Optimizing resource utilization and costs. Considerations include:

● Preemptible/Spot virtual machines (VMs)

● Committed-use discounts (e.g., flexible, resource-based)

● Sustained-use discounts

● Network tiers

● Sizing recommendations

Prerequisites: None

Recommended experience: 3+ years of industry experience, including 1+ years designing and managing production systems using Google Cloud

About this certification exam

Length: Two hours

Registration fee: $200 (plus tax where applicable)

Languages: English, Japanese

Exam format: 50-60 multiple choice and multiple select questions

Certification Renewal / Recertification: Candidates must recertify in order to maintain their certification status. Unless explicitly stated in the detailed exam descriptions, all Google Cloud certifications are valid for two years from the date of certification. Recertification is accomplished by retaking the exam during the recertification eligibility time period and achieving a passing score. You may attempt recertification starting 60 days prior to your certification expiration date.

After you book, a confirmation message is sent to all participants, along with calendar placeholders to help you schedule around the course. All course materials and access to any required labs or platforms are provided no later than one week before the course begins, giving you time to prepare.

The training package includes the course content and supporting materials, and participants receive a certificate of completion. The course fee covers all training materials; the certification examination fee is not included and can be purchased separately.
