Hello Splunkers,
I would like to set up an alert that fires when a sudden, unusually high number of events is received.
I have this base search:
index=_internal source="*metrics.log" eps "group=per_source_thruput" NOT filetracker | eval events=eps*kb/kbps
| timechart fixedrange=t span=1m limit=5 sum(events) by series
This gives me the number of events per source per minute. I would like to trigger an alert if one source sends more than X events per minute for 5 consecutive minutes.
Thanks in advance for your hints.
Try something like this
index=_internal source="*metrics.log" eps "group=per_source_thruput" NOT filetracker | eval events=eps*kb/kbps
| timechart fixedrange=t span=1m limit=5 sum(events) by series
| untable _time series count
| sort 0 series _time
| streamstats current=t window=5 count(eval(count>X)) as rollingHighCount by series
| where rollingHighCount=5
Replace the X in the streamstats command with your threshold (events per minute).
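To make the rolling-window idea concrete, here is a minimal Python sketch of what the streamstats step is doing, not Splunk's implementation. The function name, the sample counts and the 250000 threshold are all made up for illustration. For each series it remembers whether each of the last 5 per-minute counts was over the threshold, and reports a minute only when all 5 were.

```python
from collections import defaultdict, deque

def sustained_breaches(rows, threshold, window=5):
    """rows: (minute, series, count) tuples, already sorted by series and minute.

    Returns the (minute, series) pairs where the last `window` per-minute
    counts of that series were all above `threshold`.
    """
    # One fixed-length history of True/False flags per series (hypothetical helper state).
    recent = defaultdict(lambda: deque(maxlen=window))
    hits = []
    for minute, series, count in rows:
        recent[series].append(count > threshold)
        # Same idea as rollingHighCount=5: every slot in the window is a breach.
        if sum(recent[series]) == window:
            hits.append((minute, series))
    return hits

# Illustrative data: one short spike at minute 1, then a sustained run from minute 3 to 7.
counts = [100000, 300000, 100000, 300000, 300000, 300000, 300000, 300000, 100000]
rows = [(minute, "fw1", c) for minute, c in enumerate(counts)]
result = sustained_breaches(rows, threshold=250000)
print(result)
```

The single spike at minute 1 never fills the window, so only the end of the sustained run is reported.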
I tuned it a bit, but BIG thanks for the concept!
index=_internal source="*metrics.log" eps "group=per_source_thruput" NOT filetracker | eval events=eps*kb/kbps
| timechart fixedrange=t span=1m limit=5 sum(events) by series
| untable _time series count
| sort 0 _time series
| streamstats current=t time_window=5m count(eval(count>X)) as rollingHighCount by series
| where rollingHighCount=5
Hi,
This returns a strange result. (If I do not remove the | timechart... line from the original search, there is no result at all.) Even then, the result somehow does not include my firewalls, where the events per minute are over 100000...
Running this every 5 minutes shows the "top list" of the series for those five minutes, but I am really looking for the peaks. Running my original search every 3 hours shows the peaks pretty well.
But I want an email alert when the events per minute go over a limit. For example, if "normal" is 100000/min, and it goes up to 250000/min and then back to 100000/min, that's OK and I do not want an alert. But if it stays at or above the 250000/min level (which is set as X) for more than 5 minutes continuously, I would like to get the alert (and I will check the behavior later).
"Strange", while colourful, is not particularly descriptive. I have no knowledge of your events, but based on what you appear to have been using (events holds some sort of count), summing those counts per minute over the past 5 minutes, by series (whatever that is in your events), gives you a total for each of the 5 minutes. Counting the number of stats rows with totals above your threshold then gives you the number of minutes in which each series breached the threshold in the last 5 minutes. Is this not what you were trying to find out? If not, please provide example events and/or a clearer explanation of what you have tried, what you got as a result, and why it is not what you were after.
By "strange" I mean that your search returns totally different results than my search 🙂
My goal is to monitor the number of events per series per minute, i.e. the flow itself. On top of that, if there is a peak, such as 3-4x more events per minute than usual for a longer period (5-10 minutes), I want to raise an alert. Such a sudden increase in traffic on network devices/firewalls could be a good indicator of an attack or some other issue.
Apart from counting per minute and then counting how many minutes are over the threshold, you could look at the Machine Learning Toolkit (MLTK) from Splunkbase, which is quite good for building models of normal patterns and detecting anomalies, which is essentially what you are trying to do.
Take your base search (without the timechart line), schedule the report to run every minute using earliest=-5m@m and latest=@m, and append the following:
| bin _time span=1m
| stats sum(events) as events by _time series
| where events > X
| stats count by series
| where count == 5
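For anyone who wants to check the logic of those four steps outside Splunk, here is a rough Python sketch of the same idea. It is only an illustration: the function name, the sample events and the 250000 threshold are assumptions, and it expects the input to already be limited to the last 5 whole minutes, as the earliest/latest settings would do.

```python
from collections import Counter, defaultdict

def series_over_threshold(events, threshold, minutes=5):
    """events: (epoch_seconds, series, event_count) tuples from the search window.

    Returns the series whose per-minute total exceeded `threshold`
    in exactly `minutes` one-minute buckets.
    """
    per_minute = defaultdict(float)
    for ts, series, n in events:
        bucket = int(ts) // 60 * 60          # like: bin _time span=1m
        per_minute[(bucket, series)] += n    # like: stats sum(events) by _time series
    # like: where events > X | stats count by series
    breaches = Counter(
        series for (_, series), total in per_minute.items() if total > threshold
    )
    # like: where count == 5
    return sorted(s for s, c in breaches.items() if c == minutes)

# Illustrative data: two samples per minute for 5 minutes (hypothetical sources fw1, fw2).
events = []
for minute in range(5):
    for offset in (0, 30):
        events.append((minute * 60 + offset, "fw1", 150000))  # 300000/min, every minute
        fw2 = 150000 if minute < 4 else 50000                 # drops back in the last minute
        events.append((minute * 60 + offset, "fw2", fw2))
alerting = series_over_threshold(events, threshold=250000)
print(alerting)
```

Here fw2 is over the threshold in only 4 of the 5 minutes, so only fw1 would trigger the alert.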