Multi-user thresholding with compared time ranges

joshualarkins
Explorer

I have a group of users to monitor. They create actions on a fairly regular basis, but they do not all follow the same pattern. Some perform this particular action 4x/hour, some 2x/hour, and some only 2x/day. What I would like to do is create a search that allows me to compare the previous hour of activity to the last 30 days and determine whether the past hour of activity is within "normal" or if their activity has dropped below a threshold.

So my thought is to calculate the avg and stdev for each hour of the day (0-23), per user. Then I could compare that data in the same search to the previous hour and see whether historical_avg_for_given_user_for_given_hour - historical_stdev_for_given_user_for_given_hour > activity_count_for_given_user_for_last_hour and if so, raise an alert for that user.

So that's the goal. I have been looking at "Time After Time – Comparing Time Ranges in Splunk" from @lguinn, but I'm not sure how to apply that to a multi-user scenario.

Here's what I have so far:

[base search] earliest=-14d@d latest=-1d@d
| eval time_hour = strftime(_time, "%H")
| eval time_day = strftime(_time, "%D")
| stats count AS count_perhour_perday_peruser BY time_hour, time_day, userName
| chart limit=0 avg(count_perhour_perday_peruser) BY time_hour, userName

sundareshr
Legend

Try this

[base search] earliest=-7d@d latest=-1d@d
| bin span=1h _time
| stats count by _time userName
| eval time_hour = strftime(_time, "%H")
| stats latest(count) as current avg(count) as hr_avg stdev(count) as hr_stdev by userName time_hour
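
To alert only on the hour that just ended, the search above could be extended along these lines. This is a sketch, not a tested alert: it assumes a 30-day window that ends at the top of the current hour, and the field names cur_start, hist_count and cur_count are made up for illustration. It keeps only the rows for the same hour of day as the last complete hour, builds the baseline from the earlier days, and keeps users whose latest count falls below avg minus stdev, as described in the question.

```spl
[base search] earliest=-30d@h latest=@h
| bin span=1h _time
| stats count BY _time, userName
| eval cur_start = relative_time(now(), "-1h@h")
| where strftime(_time, "%H") = strftime(cur_start, "%H")
| eval hist_count = if(_time < cur_start, count, null())
| eval cur_count = if(_time >= cur_start, count, null())
| stats avg(hist_count) AS hr_avg stdev(hist_count) AS hr_stdev max(cur_count) AS current BY userName
| fillnull value=0 current
| where current < hr_avg - hr_stdev
```

One caveat: hours with no events produce no row in the first stats, so hr_avg only reflects hours in which the user was active. For users who act only a couple of times a day, that will overstate the baseline. Users with a single historical sample have no stdev and drop out of the final where.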

joshualarkins
Explorer

Whoa. I need to look at this more, but this might be what I was looking for. I'm going to dig deeper on it tomorrow. THANK YOU.

rjthibod
Champion

This sounds like a use case for the Machine Learning Toolkit. It has the added benefit that you can try different algorithms to see whether one fits your data better than the others.

One suggestion is to use something like the Median Absolute Deviation algorithm under the Outlier Detection category. It is more robust to outliers than a threshold built from the average and standard deviation. Your data is likely not normally distributed, so the standard deviation would not be a good fit.

https://splunkbase.splunk.com/app/2890/
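
For a rough idea of what a median-absolute-deviation threshold looks like without the toolkit, here is a plain-SPL sketch. It is not the toolkit's implementation. The multiplier of 3 is an arbitrary starting point, and cur_start, hist_count, cur_count and abs_dev are hypothetical field names. It uses the same window as the question (30 days, compared against the last complete hour, same hour of day).

```spl
[base search] earliest=-30d@h latest=@h
| bin span=1h _time
| stats count BY _time, userName
| eval cur_start = relative_time(now(), "-1h@h")
| where strftime(_time, "%H") = strftime(cur_start, "%H")
| eval hist_count = if(_time < cur_start, count, null())
| eval cur_count = if(_time >= cur_start, count, null())
| eventstats median(hist_count) AS hr_median BY userName
| eval abs_dev = abs(hist_count - hr_median)
| stats max(hr_median) AS hr_median median(abs_dev) AS hr_mad max(cur_count) AS current BY userName
| fillnull value=0 current
| where current < hr_median - 3 * hr_mad
```

For a very regular user hr_mad can be 0, in which case any count below the median is flagged, so the multiplier (or a minimum margin) probably needs tuning per data set.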

joshualarkins
Explorer

I need to upgrade to 6.5 and take a look at this. Thanks for pushing me to look at this. I agree, stdev probably isn't the right 'algorithm' to use long term.
