Multi-user thresholding with compared time ranges

joshualarkins · 10-26-2016

I have a group of users to monitor. They create actions on a fairly regular basis, but they do not all follow the same pattern. Some perform this particular action 4x/hour, some 2x/hour, and some only 2x/day. What I would like to do is create a search that compares the previous hour of activity to the last 30 days and determines, for each user, whether the past hour is within their "normal" range or has dropped below a threshold.

So my thought is to calculate the avg and stdev for each hour of the day (0-23), per user. Then I could compare that data in the same search to the previous hour and see whether historical_avg_for_given_user_for_given_hour - historical_stdev_for_given_user_for_given_hour > activity_count_for_given_user_for_last_hour and if so, raise an alert for that user.

So that's the goal. I have been looking at "Time After Time – Comparing Time Ranges in Splunk" from @Anonymous, but I'm not sure how to apply that to a multi-user scenario.

Here's what I have so far:

[base search] earliest=-14d@d latest=-1d@d
| eval time_hour = strftime(_time, "%H")
| eval time_day = strftime(_time, "%D")
| stats count AS count_perhour_perday_peruser BY time_hour, time_day, userName
| chart limit=0 avg(count_perhour_perday_peruser) BY time_hour, userName

sundareshr · 10-26-2016

Try this

[base search] earliest=-7d@d latest=-1d@d
| bin span=1h _time
| stats count by _time userName
| eval time_hour = strftime(_time, "%H")
| eval time_day = strftime(_time, "%d")
| eventstats latest(count) as current by userName
| stats latest(count) as current avg(count) as hr_avg stdev(count) as hr_stdev by userName
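
If the baseline needs to be per hour of day, as described in the question, a variation along these lines might work. This is only a sketch and has not been run against real data. The 30-day window, the helper field is_current, and the output names hist_avg, hist_stdev, and current are assumptions chosen for illustration.

[base search] earliest=-30d@h latest=@h
| bin span=1h _time
| stats count BY _time, userName
| eval time_hour = strftime(_time, "%H")
| where time_hour = strftime(relative_time(now(), "-1h@h"), "%H")
| eval is_current = if(_time >= relative_time(now(), "-1h@h"), 1, 0)
| stats avg(eval(if(is_current=0, count, null()))) AS hist_avg stdev(eval(if(is_current=0, count, null()))) AS hist_stdev max(eval(if(is_current=1, count, null()))) AS current BY userName
| fillnull value=0 current
| where current < hist_avg - hist_stdev

The first where keeps only the hour of day that matches the last complete hour, so each user is compared against their own history for that hour. One caveat: hours in which a user produced no events at all have no row after the first stats, so hist_avg only reflects hours that had some activity and will run high for infrequent users.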

joshualarkins · 10-26-2016

Whoa. I need to look at this more, but this might be what I was looking for. I'm going to dig deeper on it tomorrow. THANK YOU.

rjthibod · 10-26-2016

This sounds like a use case for the Machine Learning Toolkit. It has the added benefit that you can try different algorithms to see whether one works better than the others for your data.

One suggestion is to use something like the Median Absolute Deviation algorithm under the Outlier Detection category. It is more robust to variation than comparing averages and standard deviations. Your data is likely not normally distributed, so the standard deviation would not be a good fit.

https://splunkbase.splunk.com/app/2890/
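
For a rough idea of what a median absolute deviation check looks like without the app, here is a plain SPL sketch. It is not the toolkit's implementation and has not been run against real data. The field names med, abs_dev, and mad, the 30-day window, and the multiplier of 3 are assumptions to tune.

[base search] earliest=-30d@h latest=@h
| bin span=1h _time
| stats count BY _time, userName
| eventstats median(count) AS med BY userName
| eval abs_dev = abs(count - med)
| eventstats median(abs_dev) AS mad BY userName
| where _time >= relative_time(now(), "-1h@h")
| where count < med - 3 * mad

Note that a user with no events at all in the last hour produces no row here, so a complete drop to zero would need separate handling.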

joshualarkins · 10-26-2016

I need to upgrade to 6.5 and take a look at this. Thanks for pushing me to look at this. I agree, stdev probably isn't the right 'algorithm' to use long term.
