Splunk Search

How to set an alert for high amount of events?

norbertt911
Communicator

Hello Splunkers,

I would like to set an alert for when a sudden high volume of events is received.

I have this base search:

index=_internal source="*metrics.log" eps "group=per_source_thruput" NOT filetracker | eval events=eps*kb/kbps
| timechart fixedrange=t span=1m limit=5 sum(events) by series

So I have the number of events per source per minute. I would like to trigger an alert if one source produces more than X events per minute for 5 consecutive minutes.

Thanks for your hints in advance

0 Karma
1 Solution

somesoni2
Revered Legend

Try something like this

index=_internal source="*metrics.log" eps "group=per_source_thruput" NOT filetracker | eval events=eps*kb/kbps
| timechart fixedrange=t span=1m limit=5 sum(events) by series 
| untable _time series count 
| sort 0 series _time
| streamstats current=t window=5 count(eval(count>X)) as rollingHighCount by series 
| where rollingHighCount=5

Replace the X in the streamstats command with your number.

norbertt911
Communicator

I tuned it a bit, but BIG thanks for the concept!

index=_internal source="*metrics.log" eps "group=per_source_thruput" NOT filetracker | eval events=eps*kb/kbps
| timechart fixedrange=t span=1m limit=5 sum(events) by series
| untable _time series count
| sort 0 _time series
| streamstats current=t time_window=5m count(eval(count>X)) as rollingHighCount by series
| where rollingHighCount=5

0 Karma

norbertt911
Communicator

Hi,

This returns a strange result. (Unless I remove the | timechart... line from the original search, there is no result.) But then the result somehow does not include my firewalls, where the events per minute are over 100000...

Running this every 5 minutes shows the "top list" of the series in those five minutes, but I am really looking for the peaks. Running my original search every 3 hours shows the peaks pretty well:

[Image: norbertt911_0-1679326501773.png]

But I want an email alert when the events per minute go over a limit. For example, if the "normal" is 100000/min and it goes up to 250000/min and then back to 100000/min, that's OK and I do not want an alert. But if it stays at the 250000/min level (which is set as X) for more than 5 minutes continuously, I would like to get the alert (and I will check the behavior later).
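
One way to check this spike-versus-sustained behaviour without touching real data is to feed synthetic per-minute counts through the same streamstats logic as the accepted solution. The sketch below is only an illustration: the series names fw_spike and fw_sustained, the values 100000 and 300000, and the threshold of 250000 are made-up assumptions, not figures taken from metrics.log.

```spl
| makeresults count=24
| streamstats count as n
| eval series = if(n <= 12, "fw_spike", "fw_sustained")
| eval minute = if(n <= 12, n, n - 12)
| eval _time = relative_time(now(), "@m") - (12 - minute) * 60
| eval count = case(
    series == "fw_spike" AND minute >= 5 AND minute <= 6, 300000,
    series == "fw_sustained" AND minute >= 5 AND minute <= 10, 300000,
    true(), 100000)
| sort 0 series _time
| streamstats current=t window=5 count(eval(count>250000)) as rollingHighCount by series
| where rollingHighCount=5
| table _time series count rollingHighCount
```

With these made-up values, fw_spike is above the threshold for only two minutes and should never reach rollingHighCount=5, while fw_sustained stays high for six minutes and should be returned for the fifth and sixth of those minutes.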

0 Karma

ITWhisperer
SplunkTrust

"Strange", while colourful, is not particularly descriptive. Without knowledge of your events, and going by what you appear to be using (events holds some sort of count), summing those counts per minute over the past 5 minutes, by series (whatever that is in your events), gives you a total for each of the 5 minutes. Counting the stats rows whose total is above your threshold then gives you the number of minutes in which each series breached the threshold during the last 5 minutes. Is this not what you were trying to find out? If not, please provide example events and/or a clearer explanation of what you have tried, what you got as a result, and why it is not what you were after.

0 Karma

norbertt911
Communicator

By "strange" I mean that your search returns totally different results than my search 🙂

My goal is to monitor the number of events per series per minute: the flow itself. On top of that, if there is a peak, such as 3-4x more events per minute than usual over a longer period (5-10 minutes), an alert should be raised. Such a sudden increase in traffic on network devices/firewalls could be a good indicator of an attack or some other issue.
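
A threshold relative to "usual" traffic, rather than a fixed X, could be sketched roughly as follows. This is an untested illustration built on the base search from this thread. The assumptions are: the per-series median over the last 60 minutes stands in for "usual", the factor is 3, and the field names baseline, isHigh and highMinutes are made up for this example.

```spl
index=_internal source="*metrics.log" eps "group=per_source_thruput" NOT filetracker earliest=-60m@m latest=@m
| eval events=eps*kb/kbps
| bin _time span=1m
| stats sum(events) as events by _time series
| eventstats median(events) as baseline by series
| eval isHigh=if(events > 3 * baseline, 1, 0)
| sort 0 series _time
| streamstats current=t window=5 sum(isHigh) as highMinutes by series
| where highMinutes=5
```

Two caveats: stats emits no row for a minute in which a series logged nothing, so window=5 spans the last five rows rather than strictly five consecutive clock minutes, and as written the search returns every minute in the hour that ends such a run, not only the most recent one.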

0 Karma

ITWhisperer
SplunkTrust

Apart from counting per minute and then counting how many minutes are over the threshold, you could look at the Machine Learning Toolkit (MLTK) from Splunkbase, which is quite good at building models of normal patterns and detecting anomalies, which is essentially what you are trying to do.

ITWhisperer
SplunkTrust

Based on your search, schedule your report to run every minute using earliest=-5m@m and latest=@m, and append the following to your base search in place of the timechart line:

| bin _time span=1m
| stats sum(events) as events by _time series
| where events > X
| stats count by series
| where count == 5
0 Karma
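
For reference, the fragment above combined with the base search from the question might look like the sketch below. It assumes the time range is set inline with earliest/latest rather than in the alert's schedule settings, uses 250000 (the example level from the question) in place of X, and renames the count to minutesOver, which is a made-up field name.

```spl
index=_internal source="*metrics.log" eps "group=per_source_thruput" NOT filetracker earliest=-5m@m latest=@m
| eval events=eps*kb/kbps
| bin _time span=1m
| stats sum(events) as events by _time series
| where events > 250000
| stats count as minutesOver by series
| where minutesOver == 5
```

Because the range is snapped to whole minutes, the search covers exactly five one-minute buckets, so minutesOver == 5 means the series was above the threshold in every one of them. The alert can then be set to trigger when the number of results is greater than zero.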