Splunk Search

Finding Anomalies or Outliers


I'm trying to find anomalies in events. I have multiple services and multiple customers, and an error "bucket" that captures events for failures, exceeded, notified, etc.
I'm looking for a way to identify anomalies or outliers for each service/customer. I have combined the service, customer, and error type into one field (with eval) and am just counting the number of error events generated by each service/customer.
So, for example, the search would give:
svcA-custA-failures 10
svcA-custA-exceeded 5
svcA-custA-notified 25
svcB-custA-failures 11
svcB-custA-exceeded 9
svcB-custA-notified 33
svcB-custB-failures 3
svcA-custB-exceeded 7
svcA-custB-notified 22
svcA-custC-exceeded 8
svcA-custC-failures 3
svcA-custC-notified 267
svcC-custC-exceeded 1
svcC-custC-failures 4
svcC-custB-notified 145
svcC-custA-notified 17


Something along the lines of this:

| eval Svc-Cust-Evnt=Svc."-".Cust."-".Evnt
| stats sum(error) by Svc-Cust-Evnt
| rename sum(error) as count
| sort -count



It is not clear what your criteria are for determining what an anomaly is.

Also, from your example, you don't need to combine the fields; you could just do something like this:

| stats sum(error) as count by Svc Cust Evnt
| sort -count


Each service/customer has different usage patterns; some are "normally" less busy than others. So I thought there might be a way to identify when something does not follow the "normal" pattern without putting in static thresholds. For a service/customer that is normally quieter, the threshold would be lower than for a busier service/customer. If svcA-custA normally has 2000 events a day and svcA-custB has only 100 events a day, the thresholds will be different.



Sounds like you need some sort of baseline for each service/customer. I would suggest you store this in a summary index. You can then retrieve the relevant stat from the summary index and compare it to your current actual value to determine whether it is anomalous.

You could also look at the MLTK (Machine Learning Toolkit); however, for this sort of analysis you may end up with a separate model for each service/customer combination, which becomes quite unwieldy.



Interesting. Can you provide a run anywhere query example of how I would do the comparison?
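
The baseline comparison suggested above could be sketched roughly like this. It is only an illustration under several assumptions: the data is synthetic (generated with makeresults), the baseline is computed inline with eventstats rather than read from a summary index, the field names hist, baseline_avg, baseline_stdev, and upper are made up for the example, and the "mean plus 3 standard deviations" cutoff is an arbitrary choice, not something from this thread.

```spl
| makeresults count=30
| streamstats count as day
``` day=1 is "today"; days 2..30 are the history used for the baseline ```
| eval _time=relative_time(now(), "@d") - (day-1)*86400
``` two hypothetical service/customer/event combinations ```
| eval combo=split("svcA|custA|failures,svcA|custB|failures", ",")
| mvexpand combo
| eval Svc=mvindex(split(combo,"|"),0), Cust=mvindex(split(combo,"|"),1), Evnt=mvindex(split(combo,"|"),2)
``` synthetic daily counts: custA is busy (~2000/day), custB is quiet (~100/day) ```
| eval count=if(Cust="custA", 2000 + (random() % 200), 100 + (random() % 20))
``` inject a spike for custB today so there is something to detect ```
| eval count=if(day=1 AND Cust="custB", 400, count)
``` baseline per combination, computed from history only (today excluded) ```
| eval hist=if(day>1, count, null())
| eventstats avg(hist) as baseline_avg stdev(hist) as baseline_stdev by Svc Cust Evnt
| where day=1
| eval upper=baseline_avg + 3*baseline_stdev
| eval is_anomaly=if(count > upper, "yes", "no")
| table _time Svc Cust Evnt count baseline_avg baseline_stdev upper is_anomaly
```

Because the baseline is calculated separately for each Svc/Cust/Evnt combination, the quiet customer gets a much lower threshold than the busy one, so the injected count of 400 for custB should be flagged while a similar count would be far below normal for custA. With real data, the makeresults section would be replaced by a search that reads the stored daily counts from the summary index, and the multiplier can be tuned to how noisy each series is.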
