Splunk Search

Finding Anomalies or Outliers

irkey
Observer

I'm trying to find anomalies in events. I have multiple services and multiple customers. I have an error "bucket" that is capturing events for failures, exceeded, notified, etc.
I'm looking for a way to identify anomalies or outliers for each service/customer combination. I have combined (eval) the service, customer, and error type, and I am just counting the number of error events generated by each service/customer.
So for example:
svcA
svcB
svcC
custA
custB
custC

would give
svcA-custA-failures 10
svcA-custA-exceeded 5
svcA-custA-notified 25
svcB-custA-failures 11
svcB-custA-exceeded 9
svcB-custA-notified 33
svcB-custB-failures 3
svcA-custB-exceeded 7
svcA-custB-notified 22
svcA-custC-exceeded 8
svcA-custC-failures 3
svcA-custC-notified 267
svcC-custC-exceeded 1
svcC-custC-failures 4
svcC-custB-notified 145
svcC-custA-notified 17

 

Something along the lines of this:

| eval Svc_Cust_Evnt=Svc."-".Cust."-".Evnt
| stats sum(error) as count by Svc_Cust_Evnt
| sort -count


ITWhisperer
SplunkTrust

It is not clear what your criteria are for determining what an anomaly is.

Also, from your example, you don't need to combine the fields; you could just do something like this:

| stats sum(error) as count by Svc Cust Evnt
| sort -count

irkey
Observer

Each service/customer has different usage patterns; some are "normally" less busy than others. So I thought there might be a way to identify when something does not follow the "normal" pattern without putting in static thresholds. For a service/customer that is normally slower, the threshold would be lower than for a busier service/customer. If svcA-custA normally has 2000 events a day and svcA-custB has only 100 events a day, the thresholds will be different.


ITWhisperer
SplunkTrust

Sounds like you need some sort of baseline for each service/customer. I would suggest you store this in a summary index. You can then retrieve the relevant stat from the summary index and compare it to your current actual count to determine whether it is anomalous.

You could also look at the MLTK, however, for this sort of analysis, you may end up with multiple models for each service/customer combination, which becomes quite unwieldy.
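
The summary-index idea above could be sketched roughly as follows. This is only an illustration under stated assumptions: `my_index` and `summary_errors` are hypothetical index names, the field names (Svc, Cust, Evnt, error) are taken from the question, and the 30-day window and 3-standard-deviation limit are arbitrary choices. The first search would be scheduled once a day to store the previous day's counts; the second reads the stored counts and flags combinations whose most recent daily count is well above their own average.

```spl
index=my_index earliest=-1d@d latest=@d
| bin _time span=1d
| stats sum(error) as count by _time Svc Cust Evnt
| collect index=summary_errors
```

```spl
index=summary_errors earliest=-30d@d latest=@d
| eventstats avg(count) as avg_count stdev(count) as stdev_count by Svc Cust Evnt
| where _time >= relative_time(now(), "-1d@d")
| eval upper = avg_count + 3 * stdev_count
| where count > upper
```

Because the statistics are computed by Svc, Cust and Evnt, a combination that normally sees 100 events a day gets a lower limit than one that normally sees 2000. Note that in this sketch the baseline includes the day being tested, which slightly raises the limit.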


irkey
Observer

Interesting. Can you provide a run anywhere query example of how I would do the comparison?
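
One possible run-anywhere sketch of such a comparison, using generated data rather than a real index. Everything here is made up for illustration: the two keys, the base rates of 2000 and 100, the random noise, the spike of 400 injected for svcA-custB on the most recent day, and the z-score cutoff of 3. It computes a per-key average and standard deviation with eventstats and keeps only rows that deviate strongly from their own key's average.

```spl
| makeresults count=30
| streamstats count as day
| eval _time = relative_time(now(), "@d") - day * 86400
| eval key = split("svcA-custA,svcA-custB", ",")
| mvexpand key
| eval base = if(key == "svcA-custA", 2000, 100)
| eval count = base + (random() % 21) - 10
| eval count = if(day == 1 AND key == "svcA-custB", 400, count)
| eventstats avg(count) as avg_count stdev(count) as stdev_count by key
| eval zscore = round((count - avg_count) / stdev_count, 2)
| where abs(zscore) > 3
| table _time key count avg_count stdev_count zscore
```

With this generated data, the search is expected to return only the injected svcA-custB spike, since the normal noise on either key stays well within 3 standard deviations of that key's own average.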
