Finding Anomalies or Outliers


I'm trying to find anomalies in events. I have multiple services and multiple customers, and an error "bucket" that is capturing events for failures, exceeded, notified, etc.
I'm looking for a way to identify anomalies or outliers for each service/customer. I have combined (eval) the service, customer, and error, and am just counting the number of error events generated by each service/customer.
For example, the search would give:
svcA-custA-failures 10
svcA-custA-exceeded 5
svcA-custA-notified 25
svcB-custA-failures 11
svcB-custA-exceeded 9
svcB-custA-notified 33
svcB-custB-failures 3
svcA-custB-exceeded 7
svcA-custB-notified 22
svcA-custC-exceeded 8
svcA-custC-failures 3
svcA-custC-notified 267
svcC-custC-exceeded 1
svcC-custC-failures 4
svcC-custB-notified 145
svcC-custA-notified 17


Something along the lines of this:

| eval Svc-Cust-Evnt=Svc."-".Cust."-".Evnt
| stats sum(error) by Svc-Cust-Evnt
| rename sum(error) as count
| sort -count


It is not clear what your criteria are for determining what an anomaly is.

Also, from your example, you don't need to combine the fields; you could just do something like this:

| stats sum(error) as count by Svc Cust Evnt
| sort -count


Each service/customer has different usage patterns; some are "normally" less busy than others. So I thought there might be a way to identify when something does not follow the "normal" pattern without putting in static thresholds. For a service/customer that is normally slower, the threshold would be lower than for a busier service/customer. If svcA-custA normally has 2000 events a day and svcA-custB has only 100 events a day, the thresholds will be different.



Sounds like you need some sort of baseline for each service/customer. I would suggest you store this in a summary index. You can then retrieve the relevant stat from the summary index and compare it to your current actual count to determine whether it is anomalous.

You could also look at the MLTK, however, for this sort of analysis, you may end up with multiple models for each service/customer combination, which becomes quite unwieldy.
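
As a rough sketch of the summary index idea, a scheduled search along these lines could store one daily count per service/customer/event. The index names `your_index` and `svc_error_baseline` are placeholders made up for illustration, and the summary index has to exist before `collect` can write to it. The triple-backtick comments need a reasonably recent Splunk version; delete them if your version rejects them.

```spl
index=your_index earliest=-1d@d latest=@d
| bin _time span=1d ```one bucket per day```
| stats sum(error) as count by _time Svc Cust Evnt
| collect index=svc_error_baseline ```hypothetical summary index name```
```

Scheduled once a day, this would build up a history of daily counts that a later search can turn into a per-service/customer baseline.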



Interesting. Can you provide a run anywhere query example of how I would do the comparison?
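
Here is one possible run-anywhere sketch of the comparison, not a definitive implementation. It fakes 30 days of daily counts with `makeresults` for two made-up combinations (svcA/custA at roughly 2000 a day, svcA/custB at roughly 100 a day), forces a spike of 400 for custB on the most recent day, and then flags any count that falls more than three standard deviations from that combination's own baseline. The field names `day`, `combo`, `baseline_count`, `upper`, and `lower`, the generated numbers, and the three-sigma band are all assumptions chosen for illustration.

```spl
| makeresults count=30
| streamstats count as day ```day=1 is the most recent day```
| eval _time=relative_time(now(), "-".day."d@d")
| eval combo=split("svcA:custA:failures,svcA:custB:failures", ",")
| mvexpand combo
| eval Svc=mvindex(split(combo, ":"), 0), Cust=mvindex(split(combo, ":"), 1), Evnt=mvindex(split(combo, ":"), 2)
| eval count=if(Cust="custA", 2000 + (random() % 200), 100 + (random() % 20))
| eval count=if(day=1 AND Cust="custB", 400, count) ```inject a fake spike```
| eval baseline_count=if(day>1, count, null()) ```keep the day under test out of its own baseline```
| eventstats avg(baseline_count) as avg_count stdev(baseline_count) as stdev_count by Svc Cust Evnt
| eval upper=avg_count + 3*stdev_count, lower=avg_count - 3*stdev_count
| where day=1 AND (count > upper OR count < lower)
| table _time Svc Cust Evnt count avg_count stdev_count lower upper
```

With this generated data the svcA/custB spike should be the row that survives the `where` clause, because each combination is judged against its own average and spread rather than a static threshold. Against real data, everything up to and including the spike injection would be replaced by a search over your stored daily counts.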
