How to Monitor Google Kubernetes Engine (GKE)

CaitlinHalla
Splunk Employee

We’ve looked at how to integrate Kubernetes environments with Splunk Observability Cloud, but what about integrating cloud-managed Kubernetes platforms? In this post, we’ll dig into how to monitor Google Kubernetes Engine (GKE) by integrating with Splunk Observability Cloud. 

Wait, but why?

GKE dashboards provide observability metrics around infrastructure and application health for clusters and workloads from within the Google Cloud Platform (GCP) itself. So why should we integrate GKE with an observability backend? Typically, not every piece of application infrastructure lives in the same cloud platform. In the middle of an incident when seconds matter, no one wants to navigate between platforms and remember where each discrete observability tool lives. Instead, having everything in one unified observability platform reduces toil and time to incident resolution. With end-to-end observability all in one place, root-cause analysis becomes a lot easier thanks to the ability to correlate issues impacting multiple parts of the stack.

Third-party observability platforms provide advanced, flexible, and configurable dashboards, charts, and detectors along with support for application, service, and incident management integrations. Plus, depending on which platform you choose, you can avoid vendor lock-in by using an OpenTelemetry-native observability platform.

Observability Metrics in GKE

We can monitor our applications running in Google Kubernetes Engine from GCP itself by setting the Kubernetes Engine Monitoring option on our cluster to System and workload logging and monitoring. We can then click into our deployed workload and see the CPU, memory, and disk utilization:

Workload overview.png

And dig into the observability details around the health of our application – like container errors and restarts: 

container errors.png

container restarts.png

We can even go into Google Observability Monitoring and further explore GKE cluster performance:

observability monitoring cluster overview.png

And build custom dashboards and charts: 

dashboard.png

These things are all great and useful, but again, to unify our observability tools and improve the time to incident resolution, how can we get all of this information into a single backend observability platform?

Integrate GKE and Splunk Observability Cloud

There are a couple of ways you can integrate GKE into Splunk Observability Cloud. We’ll first follow along with the Connect to Google Cloud Platform documentation.

Integrate Google Cloud Platform

From within the GCP UI, we’ll go into the project we want to monitor and add a new Splunk Service Account under IAM & admin:

service account.png

Once the new service account is created, we can edit it to create a new service account key: 

create service account key.png

And then download the key as JSON: 

download key as json.png

Note: The GCP Cloud Resource Manager API must be activated so Splunk Infrastructure Monitoring can use it to validate permissions on the service account key. Also, if you want to monitor multiple GCP projects, repeat the steps above for each one.
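Before importing a key, it can help to sanity-check the downloaded JSON, especially when we repeat these steps for several projects. Below is a minimal Python sketch, not part of the official setup. It assumes the key file carries the usual service account key fields (type, project_id, private_key, client_email); the helper names check_service_account_key and keys_by_project are hypothetical, and the check does not verify permissions or whether the Cloud Resource Manager API is active.

```python
import json

# Assumption: fields a Google Cloud service account key file normally carries.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_key(key: dict) -> list[str]:
    """Return a list of problems found in a parsed key file (empty if none)."""
    problems = [
        f"missing field: {name}" for name in REQUIRED_FIELDS if not key.get(name)
    ]
    if key.get("type") and key["type"] != "service_account":
        problems.append(f"unexpected type: {key['type']}")
    return problems


def keys_by_project(paths: list[str]) -> dict[str, str]:
    """Map each project_id to the key file that covers it."""
    projects: dict[str, str] = {}
    for path in paths:
        with open(path, encoding="utf-8") as handle:
            key = json.load(handle)
        problems = check_service_account_key(key)
        if problems:
            raise ValueError(f"{path}: {'; '.join(problems)}")
        # One key per project; a later file for the same project replaces the earlier one.
        projects[key["project_id"]] = path
    return projects
```

Calling keys_by_project with the list of downloaded key files gives a quick view of which projects are covered before we move on to the Splunk Observability Cloud side.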

Next, we’ll navigate to Splunk Observability Cloud and add our Google Cloud Platform integration: 

GCP available integration.png

We can then import our newly created service account key:

import service account key.png

A benefit of integrating GCP in this way is that we can sync all supported GCP services automatically. However, for our purposes, we’ll optionally refine the GCP synced services to only include GKE: 

enable GKE service.png

Once saved, we’ll see Google Cloud Platform under our actively deployed integrations: 

active GCP integration.png

Depending on the polling interval, it might take a few minutes for our GCP metric data to populate within Splunk Observability Cloud.

Integrate Google Kubernetes Engine 

If you’re only interested in integrating GKE without additional GCP service integration (or if your security team won’t give you a service account), you can integrate with Splunk Observability Cloud via a Helm chart. We can search for Google Kubernetes Engine in the list of available integrations and follow the guided Kubernetes Monitoring instructions:

GKE direct integration.png

We’ll specify Google Cloud Platform as the provider and Google GKE as the distribution: 

kubernetes integration with gke values.png

And then we’ll follow along with the installation instructions:

installation instructions.png
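For readers who can't see the screenshot, here is a rough Python sketch of how the guided commands are typically assembled. Treat everything in it as an assumption to check against the instructions Splunk Observability Cloud generates for you: the chart repository alias and URL, and the value keys (splunkObservability.realm, splunkObservability.accessToken, clusterName, cloudProvider, distribution) are recalled from the public splunk-otel-collector Helm chart, and the realm, token, and cluster name in the example are placeholders.

```python
import shlex

# Assumption: repo alias, URL, and value keys as recalled from the public
# splunk-otel-collector Helm chart; confirm against the guided instructions.
CHART_REPO_NAME = "splunk-otel-collector-chart"
CHART_REPO_URL = "https://signalfx.github.io/splunk-otel-collector-chart"


def helm_commands(realm: str, access_token: str, cluster_name: str) -> list[list[str]]:
    """Build the helm command lines for a GKE cluster as argument lists."""
    values = {
        "splunkObservability.realm": realm,
        "splunkObservability.accessToken": access_token,
        "clusterName": cluster_name,
        "cloudProvider": "gcp",   # Google Cloud Platform as the provider
        "distribution": "gke",    # Google GKE as the distribution
    }
    set_arg = ",".join(f"{key}={value}" for key, value in values.items())
    return [
        ["helm", "repo", "add", CHART_REPO_NAME, CHART_REPO_URL],
        ["helm", "repo", "update"],
        ["helm", "install", "splunk-otel-collector", "--set", set_arg,
         f"{CHART_REPO_NAME}/splunk-otel-collector"],
    ]


if __name__ == "__main__":
    # Placeholder realm, token, and cluster name; print the commands instead of running them.
    for command in helm_commands("us0", "<ACCESS_TOKEN>", "my-gke-cluster"):
        print(shlex.join(command))
```

Printing rather than executing keeps the access token out of any subprocess call and lets us compare the output with the commands shown in the UI before pasting them into a shell.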

In the GKE UI, activate Cloud Shell, and then enter the above commands within the instance: 

console install fail.png

You’ll notice our helm install splunk-otel-collector command failed because of a missing instrumentation.endpoint value. This happens because the version of Helm pre-installed in Google Cloud Shell is out of date. To resolve the error, we need to upgrade Helm to a version supported for the current Kubernetes version and then rerun the commands from the Splunk Observability Cloud installation instructions: 

configure otel via GCP console.png
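To catch an outdated Helm before the install fails, we could compare the installed version against a minimum. The Python sketch below parses a version string of the kind that helm version --short prints (for example v3.9.3+g414ff28) and compares it numerically. The minimum version is a hypothetical placeholder, since the right value depends on your Kubernetes version and the chart release; the function names are illustrative as well.

```python
import re


def parse_helm_version(text: str) -> tuple[int, int, int]:
    """Extract (major, minor, patch) from a string such as 'v3.9.3+g414ff28'."""
    match = re.search(r"v(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version found in {text!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch)


def needs_upgrade(installed: str, minimum: str) -> bool:
    """Return True when the installed version is older than the minimum."""
    # Tuples compare element by element, so (3, 9, 3) < (3, 12, 0).
    return parse_helm_version(installed) < parse_helm_version(minimum)
```

Comparing integer tuples rather than raw strings matters here: as text, "3.9" sorts after "3.12", which would hide exactly the kind of outdated version we are trying to detect.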

After this completes successfully, we can go back over to Splunk Observability Cloud and see our integrated telemetry data:

integration complete!.png

From here, we can explore data like CPU and memory usage/utilization, resource capacity, container restarts, node state, and others right alongside the rest of our application and infrastructure data within Splunk Observability Cloud, even if other parts of the app run on-premises or on another public cloud provider.  

Observability Metrics in Splunk Observability Cloud

Now that our GKE integration is complete, we can use all of the available Splunk Observability Cloud products and features to monitor our GKE environment.

From Infrastructure Monitoring we can view our Google Cloud Platform navigators: 

GCP navigator.png

And dive into the health of our clusters: 

clusterS navigator.png

Dig into a specific cluster: 

cluster navigator.png

Monitor pod health and performance: 

specific pod.png

And view critical usage metrics: 

pods critical usage.png

We can create detectors and alerting rules from within our Navigators and manage them alongside the detectors for the rest of our applications and infrastructure:

add a detector.png

In a previous post, we looked in detail at how to use the Kubernetes Navigators to Detect and Resolve Issues in a Kubernetes Environment. With GKE now integrated with Splunk Observability Cloud, we can use these same tools to proactively monitor, detect, and alert on anomalies in our GKE environment.  

Wrap up 

We can monitor our GKE environment from within GCP itself, but integrating with a backend observability platform like Splunk Observability Cloud unifies our monitoring solution. With one unified observability platform, we can more easily detect incidents and resolve them faster without having to navigate between different observability tooling.

Want to try integrating GKE with Splunk Observability Cloud for yourself? Try Splunk Observability Cloud free for 14 days!
