By now we know the three key players in observability: metrics, traces, and logs. Metrics help you detect problems within your system. Traces help you troubleshoot where the problems are occurring. Logs help you pinpoint root causes. These observability components (along with others) work together to help you remediate issues quickly.
In our previous post, we discussed how Splunk Observability Cloud can help us detect and troubleshoot problems specifically in our Kubernetes environment. But how can we use our telemetry data to identify exactly what’s causing the problems in the first place? In this post, let’s dig into Splunk Log Observer Connect and see how we can diagnose and resolve issues fast.
Splunk Log Observer Connect is an integration that makes it possible to query log data from your existing Splunk Platform products (Enterprise or Cloud) and use that data alongside metrics and traces, all from within Splunk Observability Cloud. If you’re a Splunk Enterprise or Splunk Cloud Platform customer, you can use Log Observer Connect to view in-context logs, run queries without writing SPL, and jump to Related Content with a single click to quickly detect and resolve system problems.
You can get started with Log Observer Connect by following the setup steps or working with your Support team to add a new connection for Log Observer Connect in Splunk Observability Cloud.
The recommended way to get logs from Kubernetes environments into Splunk is to use the native OpenTelemetry logging capabilities deployed as part of the Helm chart for the Splunk Distribution of the OpenTelemetry Collector. You can also configure logging during the initial integration of the OTel Collector by selecting Log collection and providing your Splunk HEC endpoint and access token:
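If you go the Helm route, log collection comes down to a handful of chart values. Here’s a minimal sketch of a values.yaml, assuming the splunkPlatform settings of the Splunk OpenTelemetry Collector Helm chart; the endpoint, token, index, and cluster name below are placeholders, and exact keys can vary by chart version, so check the chart’s documentation:

```yaml
# values.yaml - minimal sketch for sending Kubernetes logs to the Splunk platform
# over HEC. Key names assume the Splunk OTel Collector Helm chart; verify them
# against your chart version's documentation.
clusterName: my-k8s-cluster                # placeholder cluster name

splunkPlatform:
  endpoint: "https://hec.example.splunk.com:8088/services/collector"  # placeholder HEC endpoint
  token: "00000000-0000-0000-0000-000000000000"                       # placeholder HEC access token
  index: k8s_logs                                                     # placeholder target index

# Use the chart's native OpenTelemetry log collection (rather than Fluentd).
logsEngine: otel

# Applied with something like:
#   helm install splunk-otel-collector -f values.yaml splunk-otel-collector-chart/splunk-otel-collector
```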
Interacting with logs in Splunk Observability Cloud often begins with an alert triggered by some error event like a problem with a Kubernetes cluster.
In Splunk Infrastructure Monitoring’s Kubernetes Navigator (which we toured in a previous post), we can see that we have two such active alerts firing:
Opening them up, we can see critical alerts for memory usage. With a single click on the Troubleshoot link, we can explore further in Splunk Application Performance Monitoring (APM):
This will take us to a Service Map view of our application, where we can see something isn’t right:
Our paymentservice node is highlighted in red, meaning it’s the root cause of our errors. If we select the red circle, we’ll see more info in the panel on the right, along with Infrastructure and Logs Related Content at the bottom of the screen. All of this information is scoped specifically to the selected paymentservice.
We can expand the Logs Related Content:
And then jump directly from there to Log Observer to view logs related to this service error:
With help from the logs, we can get to the bottom of what’s causing these errors. Let’s use the Content Control Bar to filter our logs by keywords or field values. Since we arrived via Related Content, our logs are already filtered to service.name = paymentservice. If we only want to see logs related to paymentservice errors, we can add another filter for severity = error:
If at any time we want to save a query, whether to validate a fix later or to share it with the rest of our team, we can add it to our Saved Queries. Select Save at the top right of the screen, then Save query, and give the query a name and description for later use:
Other users (or your future self) can later apply it from the Saved Query dropdown:
Moving over to the Fields panel on the right of the screen, we can view all the metadata available on entries in the Logs table. This is a great place to filter logs if you’re not sure which fields you’re looking for. Here, we can see a k8s.cluster.name field with its top values listed. In this case, we know which Kubernetes cluster we want to isolate, so we can include all logs for that specific cluster:
We can then click on an individual log entry to see its details:
From the log details, we can see that the error message is “Failed payment processing through ButtercupPayments: Invalid API Token(test-20e26e90-356b-432e-a2c6-956fc03f5609).” Selecting the error message, we can click Add to filter to scope the Logs table to entries containing this exact message:
We’ve also added the version field as a column by selecting the kebab menu next to the field and choosing Add field as column:
Now we can easily scan the Logs table and identify which errors are associated with which version. At a glance, it appears that all the error logs share the same version number. Sure enough, if we look at the version field in the Fields list, we can see that only one version is scoped to the current error logs:
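To make this field-driven workflow concrete, here’s roughly what one of these log records might look like once it lands in Log Observer. The field names (service.name, severity, k8s.cluster.name, version, and the trace_id we’ll use in a moment) come straight from this walkthrough, but the shape and values below are illustrative rather than an exact Splunk event:

```yaml
# Illustrative log record (shape and values are examples only)
body: "Failed payment processing through ButtercupPayments: Invalid API Token(test-20e26e90-356b-432e-a2c6-956fc03f5609)"
severity: error
service.name: paymentservice
k8s.cluster.name: my-k8s-cluster                 # placeholder cluster name
k8s.pod.name: paymentservice-6c9f8d7b4-xk2lp     # hypothetical pod name
version: v350.10                                 # hypothetical release version
trace_id: 1a2b3c4d5e6f7a8b9c0d1e2f3a4b5c6d       # links this log to its trace in Splunk APM
```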
Before we jump to resolutions, we can continue to interact with the log details and move through our system. We can explore the traces related to this error to see where in our code it’s being thrown, which could help us track down any recent code changes that may have caused it. Simply click the trace_id in the log details, then View trace_id, to jump back to the trace in Splunk APM:
If we open up one of the span errors and go into Tag Spotlight for the version trace property, we can confirm our suspicion that only our latest release is experiencing this “Invalid API Token” error:
If we had first discovered the error while investigating this trace, we could instead have reached Log Observer via the Related Content from either the trace:
Or the Tag Spotlight:
We used Log Observer Connect to easily locate the cause of our errors. Thanks to the ability to move between Splunk Infrastructure Monitoring, APM, and Log Observer, we were able to confidently move forward with a fix 🎉.
If you want to connect your Splunk Enterprise or Splunk Cloud Platform logs to Splunk Observability Cloud using Splunk Log Observer Connect, again, check out the Introduction to Splunk Log Observer Connect.
New to Splunk and want to get started with Splunk Observability Cloud? Start a 14-day free trial!