Splunk Search

Finding Anomalies or Outliers

irkey
Observer

I am trying to find anomalies in events. I have multiple services and multiple customers, and an error "bucket" that captures events such as failures, exceeded, notified, etc.
I'm looking for a way to identify anomalies or outliers for each service/customer combination. I have combined (via eval) the service, customer, and error type into one field and am simply counting the number of error events generated by each combination.
So for example:
svcA
svcB
svcC
custA
custB
custC

would give
svcA-custA-failures 10
svcA-custA-exceeded 5
svcA-custA-notified 25
svcB-custA-failures 11
svcB-custA-exceeded 9
svcB-custA-notified 33
svcB-custB-failures 3
svcA-custB-exceeded 7
svcA-custB-notified 22
svcA-custC-exceeded 8
svcA-custC-failures 3
svcA-custC-notified 267
svcC-custC-exceeded 1
svcC-custC-failures 4
svcC-custB-notified 145
svcC-custA-notified 17

 

Something along the lines of this:

| eval Svc_Cust_Evnt=Svc."-".Cust."-".Evnt
| stats sum(error) as count by Svc_Cust_Evnt
| sort -count


ITWhisperer
SplunkTrust

It is not clear what your criteria are for determining what an anomaly is.

Also, from your example, you don't need to combine the fields; you could just do something like this:

| stats sum(error) as count by Svc Cust Evnt
| sort -count

irkey
Observer

Each service/customer has different usage patterns; some are "normally" less busy than others. So I was hoping there is a way to identify when something does not follow its "normal" pattern without putting in static thresholds. For a service/customer that is normally quieter, the threshold would be lower than for a busier one. If svcA-custA normally has 2000 events a day and svcA-custB has only 100 events a day, their thresholds should be different.


ITWhisperer
SplunkTrust

Sounds like you need some sort of baseline for each service/customer - I would suggest you store this in a summary index. You can then retrieve the relevant stat from the summary index and compare it to your current actual count to determine whether it is anomalous.
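
As a rough sketch of that idea (not a tested implementation), a scheduled daily search could write one count per service/customer/event into a summary index, and a second search could compare today's count against the stored history. The index names your_index and error_baseline_summary are placeholders, and the 30-day window and the 3-standard-deviation limit are arbitrary assumptions to tune. Only Svc, Cust, Evnt and error come from the original search.

```spl
index=your_index earliest=-1d@d latest=@d
| bin _time span=1d
| stats sum(error) as count by _time Svc Cust Evnt
| collect index=error_baseline_summary
```

```spl
index=error_baseline_summary earliest=-30d@d latest=@d
| stats avg(count) as avg_count stdev(count) as stdev_count by Svc Cust Evnt
| join Svc Cust Evnt
    [ search index=your_index earliest=@d latest=now
      | stats sum(error) as today_count by Svc Cust Evnt ]
| eval upper=avg_count + 3 * stdev_count
| where today_count > upper
```

Note that days with no errors produce no summary row, so the average only reflects days that had at least one event, and today's partial count is compared against full-day baselines, so only unusually high counts are flagged.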

You could also look at the MLTK; however, for this sort of analysis you may end up with a separate model for each service/customer combination, which becomes quite unwieldy.


irkey
Observer

Interesting. Can you provide a run-anywhere query example of how I would do the comparison?
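
One possible run-anywhere sketch of such a comparison is below. It needs no indexed data: makeresults fabricates 30 days of synthetic counts for three made-up service/customer/event combinations with different normal volumes (about 2000, 100 and 500 a day), then triples the most recent day for svcA/custB to simulate a spike. The generated data, the helper fields (day, key, base, hist_count) and the 3-standard-deviation limits are all illustrative assumptions, not part of the original search.

```spl
| makeresults count=30
| streamstats count as day
| eval _time=relative_time(now(), "@d") - (day * 86400)
| eval key=split("svcA:custA:failures,svcA:custB:failures,svcB:custA:notified", ",")
| mvexpand key
| eval Svc=mvindex(split(key, ":"), 0), Cust=mvindex(split(key, ":"), 1), Evnt=mvindex(split(key, ":"), 2)
| eval base=case(key=="svcA:custA:failures", 2000, key=="svcA:custB:failures", 100, true(), 500)
| eval count=round(base * (0.9 + (random() % 200) / 1000))
| eval count=if(day==1 AND key=="svcA:custB:failures", count * 3, count)
| eval hist_count=if(day>1, count, null())
| eventstats avg(hist_count) as avg_count stdev(hist_count) as stdev_count by Svc Cust Evnt
| where day==1
| eval upper=avg_count + 3 * stdev_count, lower=avg_count - 3 * stdev_count
| eval is_outlier=if(count > upper OR count < lower, 1, 0)
| table _time Svc Cust Evnt count avg_count stdev_count lower upper is_outlier
```

Because each combination is measured against its own average and standard deviation, the quiet svcA/custB series gets a much lower threshold than the busy svcA/custA series, which is the behaviour asked for above. With real data, the block of lines up to the spike injection would be replaced by the search that produces daily counts per Svc, Cust and Evnt.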
