We're looking to identify the users that connect the most within a 60 second window. Currently our search looks like this:
*search* | bucket _time span=60s | stats count by user, _time | sort -count
Unfortunately, that only counts within fixed one-minute buckets, each starting at zero seconds. We need to be able to check any 60-second span, whatever second it starts on. I've been attempting to use streamstats with no luck. Any ideas would be greatly appreciated!
I am facing a challenge similar to the one above, i.e. finding firewall denies to the same destination that exceed a threshold (e.g. 100) within a rolling 1-minute window. The example below should generate an alert
since the deny count for the time span 00:00:31 - 00:01:30 is 120.
Using the traditional |bin span=1m, this does not generate an alert since each minute bucket (00:00, 00:01, 00:02) does not exceed the threshold.
There is an "aligntime" option in |bin, but that requires a relative time/epoch time, so that does not work either (may be possible if we can set it to the _time value for when an event is first observed.)
I also explored using |streamstats with a time_window option, however that does not work either since the counter resets when the dest changes, so if the total failures per dest is interleaved with other denies, then it will not fire an alert either.
Hopefully someone has found a solution out there.
A dirty solution I found is to do this:
<base_search> |bin span=1s _time |stats count by _time dest|eval startime=_time |eval endtime=startime+60 |eval threshold=100 |map search="search earliest=$startime$ latest=$endtime$ <base_search> dest=$dest$ |stats count by dest |where count > $threshold$"
But this is really bad, as it triggers one subsearch per row returned by the base search, which can mean hundreds of searches. It improves a bit if I increase the span, but if the span is too high then it's the same fixed-bucket problem all over again.
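A possibly cheaper alternative to the map approach above, offered as a sketch rather than a tested solution: sort the events into ascending time order and let streamstats keep a trailing 60-second count per dest. This assumes streamstats keeps a separate running count for each value of the by field when reset_on_change is left at its default of false, and that the events are in time order, which time_window requires. The field names denies_60s and peak_denies_60s are made up for illustration, and 100 is the threshold from the example.

```
<base_search>
| sort 0 _time
| streamstats time_window=60s count AS denies_60s by dest
| where denies_60s > 100
| stats max(denies_60s) AS peak_denies_60s by dest
```

If that assumption holds, each event carries the number of denies to its dest in roughly the 60 seconds ending at that event, so interleaved denies to other destinations should not affect the count, and the whole thing runs as a single search.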
Not quite - sorry if I wasn't clear. The bucket _time span=60s is forcing Splunk to look at every fixed minute. For example: 12:00:00 - 12:01:00, 12:01:00 - 12:02:00, etc.
I need that bucket to move, or to capture the highest connection totals for any 60 second window. For example 12:00:34 - 12:01:34. I hope that clarifies things!
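One way this moving window might be expressed, as a rough sketch and not a verified answer: replace the fixed bucket with a streamstats time_window, then keep each user's highest 60-second total. It assumes the events are sorted into ascending time order first (time_window needs time-ordered events). The field names connections_60s and peak_connections_60s are hypothetical.

```
*search*
| sort 0 _time
| streamstats time_window=60s count AS connections_60s by user
| stats max(connections_60s) AS peak_connections_60s by user
| sort -peak_connections_60s
```

Here the window ends at each event instead of at a minute boundary, so a burst spanning 12:00:34 - 12:01:34 would be counted together rather than split across two buckets.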