This is the fourth post in the Splunk Observability Cloud’s AI Assistant in Action series, which digs into how to use the Splunk AI Assistant by exploring practical, real-world, real-time examples. Each post in the series covers a specific Splunk AI Assistant use case.
If you’d like to start with how to access the Splunk AI Assistant and identify unknown unknowns within a service environment, check out the first post in this series: Identifying Unknown Unknowns. In our second post, Analyzing and Troubleshooting in Real-Time, you can see how to use the AI Assistant to analyze trace errors and database query performance and gain insights into detectors and alerts. Our third post digs into how to leverage the Splunk AI Assistant to ensure compliance and optimize cost requirements.
In this fourth post, we’ll explore how the Splunk AI Assistant can help explain unfamiliar or custom metrics and how it can provide real-time, contextual feedback to help analyze performance, identify optimizations, and troubleshoot faster for a reduced Mean Time to Resolution (MTTR).
Within a company’s engineering organization, different teams may implement their own custom metrics specific to the services they own and manage. When troubleshooting an incident that spans services and teams, navigating these custom metrics and their meanings can slow down the investigation and increase MTTR. Engineers also often use third-party tools instrumented with custom metrics that don’t conform to internal standards or conventions, and interpreting those metrics can slow troubleshooting in the same way. Let’s look at an example.
Our engineering team uses Redis, a data structure server, but I haven’t spent much time working with it, so I’m not familiar with all of the metrics associated with our Redis instance. I’m on-call and dealing with an active incident, and I’m seeing a lot of Redis metrics pop up around cache hit rate percentage. I’m not sure what this means, but I can use the Splunk AI Assistant to gain more context and troubleshoot the issue.
From within Splunk Observability Cloud, I’ll navigate to Infrastructure and then select Datastores. I’ll scroll down to Redis and select Redis instances to get a detailed overview of my Redis instances.
From within the Redis instance view, I can open the AI Assistant and ask detailed questions about specific instance metrics, like this cache hit rate percentage. I can also ask it for contextual data to help me determine if the metric data is good or bad.
In my prompt, I’m going to specify the instance I’m investigating because, as you’ll remember from our previous post, Analyzing and Troubleshooting in Real-Time, the more context you provide to the AI Assistant, the more complete its responses will be.
I’ve asked the AI Assistant to explain what the cache hit rate percentage metric is and how I can interpret it. The response returns analysis for this specific metric, showing that over the last four hours, the hit rate was 67%.
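For readers who want to see what a number like this represents, here is a minimal sketch of how a cache hit rate is commonly derived: hits divided by total lookups (hits plus misses). It assumes the rate comes from the cumulative `keyspace_hits` and `keyspace_misses` counters that Redis reports in its `INFO stats` output. The function names and the counter snapshots are hypothetical, chosen only to illustrate the arithmetic, and this is not necessarily how Splunk Observability Cloud computes the metric.

```python
def cache_hit_rate_percent(hits: int, misses: int) -> float:
    """Return the share of key lookups served from the cache, as a percentage."""
    total = hits + misses
    if total == 0:
        # No lookups in the window; report 0.0 instead of dividing by zero.
        return 0.0
    return 100.0 * hits / total


def windowed_hit_rate_percent(start: dict, end: dict) -> float:
    """Hit rate over a time window, from two snapshots of cumulative counters."""
    hits = end["keyspace_hits"] - start["keyspace_hits"]
    misses = end["keyspace_misses"] - start["keyspace_misses"]
    return cache_hit_rate_percent(hits, misses)


# Hypothetical snapshots taken at the start and end of a four-hour window.
start = {"keyspace_hits": 1000, "keyspace_misses": 500}
end = {"keyspace_hits": 7700, "keyspace_misses": 3800}

rate = windowed_hit_rate_percent(start, end)
print(f"{rate:.0f}%")  # 6,700 hits out of 10,000 lookups -> 67%
```

Read this way, a 67% hit rate means roughly one in three lookups missed the cache and had to be served from a slower backing source.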
The interpretation in the response suggests that this is a moderate hit rate, indicating there is room for optimization. The response also suggests next steps for optimization: reviewing the cache configuration, analyzing miss patterns, and reviewing application logic.
The response also lists relevant Infrastructure attributes, along with related dashboards and charts, complete with hyperlinks for further insight.
To summarize the use case explored in this post, we’ve utilized the Splunk AI Assistant to explain unfamiliar metrics and provide real-time, contextual feedback, enabling us to troubleshoot quickly and reduce MTTR.
In our next post, we’ll use the AI Assistant to streamline the onboarding process for new hires or new users of Splunk Observability Cloud to get everyone up to speed fast.
Want to try out the Splunk AI Assistant for yourself? Start with a 14-day free trial! Already a Splunk Observability Cloud customer? Reach out to your account representative to enable the Splunk AI Assistant!