Anomaly detection using historical event count with help of hour-based time slices

furkan_caliskan · 04-04-2016

Hi,

I'm looking for a method that will help me alert on anomalies in historical event logs.

Let's say I have a log source that generates events. On day 1, between 09:30 - 10:00, it generates a total of 105 events; on day 2, between 09:30 - 10:00, it generates 110 events, and so on.

I want to write a search that runs every 30 minutes and finds a deviation (50%, for example) from the average of the same 30-minute period over the last 2 days.

Example:

At 10:00, the search looks at the events of the past 2 days (the value 2 should be configurable) and calculates the average:

Yesterday, between 09:30 - 10:00: 105 events generated

The day before, between 09:30 - 10:00: 110 events generated

Average: (105 + 110) / 2 = 107.5

If the current event count for today's 09:30 - 10:00 period is greater than about 160 (107.5 × 1.5 = 161.25), an alert is generated because the 50% deviation threshold is exceeded.
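The rule described above can be sketched in a few lines of Python, purely to illustrate the arithmetic. The function name `exceeds_threshold` and its parameters are hypothetical and not part of any Splunk API; the current count of 165 is an assumed sample value.

```python
def exceeds_threshold(history, current, deviation=0.5):
    """Return (average, threshold, alert) for one 30-minute slot.

    history   -- event counts for the same slot on previous days
    current   -- event count for today's slot
    deviation -- allowed fractional increase over the average (0.5 = 50%)
    """
    if not history:
        raise ValueError("need at least one previous day")
    average = sum(history) / len(history)
    threshold = average * (1 + deviation)
    return average, threshold, current > threshold


# Counts from the example: 105 yesterday, 110 the day before.
# 165 is an assumed count for today's slot.
average, threshold, alert = exceeds_threshold([105, 110], current=165)
print(average, threshold, alert)  # prints: 107.5 161.25 True
```

Passing a longer `history` list covers the case where the number of days is variable.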

(Using version 6.2.1)
Any ideas?

Thanks,

jwalzerpitt · 04-04-2016

This may help. I cobbled it together (we use it against our email index) from bits and pieces of posts in the Splunk Answers forum, and I believe some of lguinn's magic may be in here:

(It shows where the number of events has increased by a minimum of 80+% over the previous hour and the event count is > 10.)

index=email _index_earliest=-h@h _index_latest=@h Subject="xxx" SenderAddress="xxx@xxx.edu" 
| timechart span=1h partial=false count 
| delta count as difference 
| eval difference=coalesce(difference,0) 
| eval percentDifference =round(abs(difference/(count - difference))*100) 
| where (difference > 1 AND percentDifference > 100)
| where count > 10

lguinn2 · 04-04-2016

This should work

earliest=-49h latest=-30m yoursearchcriteria
| eval endday2=relative_time(now(),"-49h+30m")
| eval startday1=relative_time(now(),"-25h")
| eval endday1=relative_time(now(),"-25h+30m")
| eval starttoday=relative_time(now(),"-1h")
| eval Day = case(_time <= endday2,"day2",
                _time <= endday1 AND _time >= startday1,"day1",
                _time >= starttoday,"today",
                1==1,"eliminate")
| where Day!="eliminate"
| stats count(eval(Day="day1")) as day1 count(eval(Day="day2")) as day2 count(eval(Day="today")) as today
| eval avg=(day1+day2)/2
| where today > avg * 1.5

This seems like an ugly way to do what you want, but it was the first thing that crossed my mind. It isn't very flexible either.

Note that I have included a 30-minute lag in the calculations - without a lag, your numbers may not be accurate, as it often takes a minute or so for events to be detected on the remote server, forwarded, parsed and indexed.
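To make the lagged windows concrete, here is a minimal Python sketch of the time ranges the search compares, assuming a 30-minute slot and a 30-minute lag. The helper `slot_windows` and its parameters are hypothetical names used only for illustration, and the naive datetimes ignore daylight-saving changes.

```python
from datetime import datetime, timedelta


def slot_windows(now, days=2, slot=timedelta(minutes=30), lag=timedelta(minutes=30)):
    """Return [(start, end), ...] for today's slot and the same slot
    on each of the previous `days` days, today first."""
    end_today = now - lag            # skip the most recent `lag` of data
    start_today = end_today - slot   # the slot being evaluated
    return [
        (start_today - timedelta(days=d), end_today - timedelta(days=d))
        for d in range(days + 1)
    ]


# A search running at 10:30 evaluates the 09:30 - 10:00 slot.
for start, end in slot_windows(datetime(2016, 4, 4, 10, 30)):
    print(start, "to", end)
```

With `now` at 10:30, the three windows correspond to the offsets -1h to -30m, -25h to -24h30m, and -49h to -48h30m.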

You might want to look at the timewrap app - it's free and designed to help with these sorts of comparisons.

furkan_caliskan · 04-04-2016

When I run it, it gives "Error in 'eval' command: The arguments to the 'if' function are invalid."

lguinn2 · 04-04-2016

It should be case, not if! Sorry, I have edited the answer.

jcvytla · 04-10-2018

@Iguinn

I'm trying to use the Machine Learning Toolkit. In which assistant should I use the code you posted above?
