Logging and Monitoring in Google Cloud

Overview

This course teaches participants techniques for monitoring and improving infrastructure and application performance in Google Cloud. Using a combination of presentations, demos, hands-on labs, and real-world case studies, attendees gain experience with full-stack monitoring, real-time log management and analysis, debugging code in production, tracing application performance bottlenecks, and profiling CPU and memory usage.

Syllabus

Introduction

Welcome to Logging and Monitoring in Google Cloud! We will cover the prerequisites, the intended audience, and the course objectives.

Introduction to Google Cloud's Operations Suite

In this module, we will give a high-level overview of the products that make up Google Cloud’s logging, monitoring, and observability suite.

Monitoring Critical Systems

Monitoring is all about keeping track of exactly what's happening with the resources we've spun up inside Google Cloud. In this module, we'll look at options and best practices for monitoring project architectures. We'll differentiate the core Cloud IAM roles that determine who can do what in monitoring. Like architecture, this is a crucial early step. We will examine some of the Google-created default dashboards and see how to use them appropriately. We will create charts and use them to build custom dashboards that show resource consumption and application load. Finally, we will define uptime checks to track liveness and latency.

Alerting Policies

Alerting gives you timely awareness of problems in your cloud applications so you can resolve them quickly. In this module, you will learn how to develop alerting strategies, define alerting policies, add notification channels, identify types of alerts and common uses for each, construct and alert on resource groups, and manage alerting policies programmatically.
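
As a rough illustration of what managing an alerting policy programmatically can involve, the sketch below builds a policy definition as a plain Python dictionary and serializes it to JSON. The field names follow the Cloud Monitoring AlertPolicy resource, but the function name, the 80% CPU threshold, the five-minute duration, and the display names are assumptions chosen for illustration, not values taken from the course.

```python
import json


def build_cpu_alert_policy(threshold=0.8, duration_seconds=300,
                           notification_channels=None):
    """Return a dict shaped like a Cloud Monitoring AlertPolicy.

    Hypothetical helper: the threshold, duration, and names are
    illustrative defaults, not recommendations from the course.
    """
    return {
        "displayName": "High CPU utilization (example)",
        "combiner": "OR",  # fire if any condition is met
        "conditions": [
            {
                "displayName": "CPU utilization above threshold",
                "conditionThreshold": {
                    # Monitoring filter selecting Compute Engine CPU usage
                    "filter": (
                        'metric.type="compute.googleapis.com/instance/cpu/utilization" '
                        'AND resource.type="gce_instance"'
                    ),
                    "comparison": "COMPARISON_GT",
                    "thresholdValue": threshold,
                    # Condition must hold this long before an alert fires
                    "duration": f"{duration_seconds}s",
                    "aggregations": [
                        {
                            "alignmentPeriod": "60s",
                            "perSeriesAligner": "ALIGN_MEAN",
                        }
                    ],
                },
            }
        ],
        # Resource names of existing notification channels, if any
        "notificationChannels": list(notification_channels or []),
    }


if __name__ == "__main__":
    print(json.dumps(build_cpu_alert_policy(), indent=2))
```

Keeping the policy as data makes it easy to store in version control and review before it is submitted through whichever client or tool you use.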

Advanced Logging and Analysis

In this module, we will examine some of Google Cloud's advanced logging and analysis capabilities. Specifically, you will learn to identify and choose among resource tagging approaches, define log sinks, create monitoring metrics based on log entries, link application errors to Logging and other operations tools using Error Reporting, and export logs to BigQuery for long-term storage and SQL-based analysis.

Working with Audit Logs

In this module, we will examine how to use Cloud Audit Logs. You will learn how to use them to answer the question, “Who did what, and when?” We will also cover best practices for audit logging.
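
To make the "who did what, and when" question concrete, here is a minimal sketch that pulls those answers out of a single audit log entry represented as JSON. It assumes the entry carries the usual `protoPayload` fields (`authenticationInfo.principalEmail`, `methodName`, `resourceName`) and a top-level `timestamp`. The sample entry, the project and bucket names, and the helper name `summarize_audit_entry` are made up for illustration.

```python
import json


def summarize_audit_entry(entry: dict) -> dict:
    """Extract who / what / when from an audit log entry dict.

    Hypothetical helper; missing fields fall back to "unknown".
    """
    payload = entry.get("protoPayload", {})
    auth = payload.get("authenticationInfo", {})
    return {
        "who": auth.get("principalEmail", "unknown"),
        "what": payload.get("methodName", "unknown"),
        "resource": payload.get("resourceName", "unknown"),
        "when": entry.get("timestamp", "unknown"),
    }


# Fabricated sample entry, trimmed to the fields used above.
SAMPLE_ENTRY = json.loads("""
{
  "timestamp": "2024-01-15T10:30:00Z",
  "protoPayload": {
    "authenticationInfo": {"principalEmail": "alice@example.com"},
    "methodName": "storage.buckets.delete",
    "resourceName": "projects/_/buckets/example-bucket"
  }
}
""")

if __name__ == "__main__":
    print(summarize_audit_entry(SAMPLE_ENTRY))
```

The same extraction applies whether entries are read from an export or fetched through a client library; only the way the entries are obtained changes.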