Finding Anomalies or Outliers

irkey
Observer

I'm trying to find anomalies in events. I have multiple services and multiple customers, and an error "bucket" that captures events for failures, exceeded, notified, etc.
I'm looking for a way to identify anomalies or outliers for each service/customer. I have combined (eval) the service, customer, and error type into one field, and I am counting the number of error events generated by each combination.
So for example:
svcA
svcB
svcC
custA
custB
custC

would give
svcA-custA-failures 10
svcA-custA-exceeded 5
svcA-custA-notified 25
svcB-custA-failures 11
svcB-custA-exceeded 9
svcB-custA-notified 33
svcB-custB-failures 3
svcA-custB-exceeded 7
svcA-custB-notified 22
svcA-custC-exceeded 8
svcA-custC-failures 3
svcA-custC-notified 267
svcC-custC-exceeded 1
svcC-custC-failures 4
svcC-custB-notified 145
svcC-custA-notified 17

 

Something along the lines of this:

| eval Svc_Cust_Evnt=Svc."-".Cust."-".Evnt
| stats sum(error) by Svc_Cust_Evnt
| rename sum(error) as count
| sort -count


ITWhisperer
SplunkTrust

It is not clear what your criteria are for determining what an anomaly is.

Also, from your example, you don't need to combine the fields; you could just do something like this:

| stats sum(error) as count by Svc Cust Evnt
| sort -count

irkey
Observer

Each service/customer has different usage patterns; some are "normally" less busy than others. So I thought there might be a way to identify when something does not follow the "normal" pattern without putting in static thresholds. For a service/customer that is normally quieter, the threshold would be lower than for a busier one. If svcA-custA normally has 2000 events a day and svcA-custB has only 100 events a day, the thresholds will be different.


ITWhisperer
SplunkTrust

Sounds like you need some sort of baseline for each service/customer. I would suggest you store this in a summary index. You can then retrieve the relevant stat from the summary index and compare it to your current actual count to determine whether it is anomalous.

You could also look at the Machine Learning Toolkit (MLTK). However, for this sort of analysis you may end up with a separate model for each service/customer combination, which becomes quite unwieldy.
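
As a rough sketch of the summary-index idea (not necessarily the exact setup meant above), a scheduled daily search could write one count per service/customer/event type into a summary index. The index names app_errors and svc_cust_baseline are hypothetical placeholders, and the field names Svc, Cust, Evnt, and error are taken from the question:

```spl
index=app_errors earliest=-1d@d latest=@d
| bin _time span=1d
| stats sum(error) as count by _time Svc Cust Evnt
| collect index=svc_cust_baseline
```

A second search could then read the stored daily counts back and derive a per-combination baseline, for example a mean and standard deviation over the last 30 days:

```spl
index=svc_cust_baseline earliest=-30d@d latest=@d
| stats avg(count) as avg_count stdev(count) as stdev_count by Svc Cust Evnt
```

The 30-day window and the choice of mean and standard deviation are assumptions; a median or percentile baseline may suit spiky data better.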


irkey
Observer

Interesting. Can you provide a run anywhere query example of how I would do the comparison?
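
One possible run-anywhere sketch of such a comparison follows. It generates 30 days of made-up daily counts with makeresults rather than reading a real summary index. The traffic levels (about 2000 a day for svcA-custA, about 100 for svcA-custB), the injected spike of 400 on the last day, and the "mean plus 3 standard deviations" rule are all assumptions chosen for illustration:

```spl
| makeresults count=30
| streamstats count as day
| eval _time=now() - (30 - day) * 86400
| eval Svc="svcA", Cust=split("custA,custB", ",")
| mvexpand Cust
| eval count=if(Cust="custA", 2000 + (random() % 200), 100 + (random() % 20))
| eval count=if(day=30 AND Cust="custB", 400, count)
| eval baseline_count=if(day<30, count, null())
| eventstats avg(baseline_count) as avg_count stdev(baseline_count) as stdev_count by Svc Cust
| eval upper=avg_count + 3 * stdev_count
| where day=30 AND count > upper
| table _time Svc Cust count avg_count stdev_count upper
```

The first 29 days form the baseline for each Svc/Cust pair, and day 30 stands in for the "current" value, so the latest count is not included in its own baseline. Because eventstats computes the statistics per pair, the busy and quiet combinations each get their own threshold without any static values. With this generated data the search should flag only the svcA-custB spike. Against real data, the generated rows would be replaced by the daily counts read from the summary index.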
