Alerting

Any way to alert when source sends too many events?

mikeely
Path Finder

We've got some Java code running that was written by what appears to be a sailboat manufacturer posing as a huge software company. Occasionally (as in frequently), this Java code will spin out and generate huge (many gigabytes per hour) log files. This can show up in any number of locations and rarely hits the same file twice. As I said, they're a sailboat manufacturer - they only pretend to write software that actually works.

The problem is, when the code isn't generating gigabytes of the same error over and over, the log files are of use and we like to Splunk them. Furthermore, we'd like some kind of early-warning alert that some badly-written Java component or another is freaking out so that we can address it before the people who have to depend on this horrid mess notice that something else has gone wrong again.

So what I'm trying to figure out is a way to alert when ANY file generates more than N number of alerts per T time period. I'm pretty sure this is doable, but I've been much too busy putting out fires to even approach the method.


yannK
Splunk Employee

Are you talking about the number of events per source, the number of alert events per source, or the number of alerts raised in Splunk?

Link to documentation: http://docs.splunk.com/Documentation/Splunk/latest/User/SchedulingSavedSearches

A simple approach: create a search over the appropriate rolling time period that counts the events matching your pattern, then schedule it as an alert.

Example:
index=myindex ERROR | stats count by source
This returns one row per source that logged errors during the time period you specified:
source  count
A       10
B       1
E       5

Then schedule it to run every hour over the last hour (or every 10 minutes over the last 20 minutes, etc.),
with an email alert condition of: number of results > 0.

If you want to set a threshold (for example, alert only when a source has more than 10 errors), tune the search:
index=myindex ERROR | stats count by source | where count > 10

mikeely
Path Finder

Clever. I like it. Thanks!


yannK
Splunk Employee

In that case, change the search.
For example, calculate the number of events per second per source (based on the first and last timestamps, over a 10-minute window at most, with a 2-minute delay):

earliest=-12m latest=-2m | stats count min(_time) AS oldestTime max(_time) AS recentTime by source | eval ratio=count/(recentTime-oldestTime) | where ratio>50 | table source count ratio

Alert on the number of results (that is, the number of sources whose ratio exceeds whatever threshold you choose).
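
A variant of the same idea, as a minimal sketch: it scopes the search to a single host and counts events in one-second buckets, so a short burst is not averaged away over the whole window. The names index=myindex and host=abc123 are placeholders, the limit of 50 per second is only an example, and it assumes tstats is usable here because index, host, and source are all index-time fields.

```
| tstats count where index=myindex host=abc123 earliest=-12m latest=-2m by source _time span=1s
| where count > 50
| stats max(count) AS peak_per_second count AS seconds_over_limit by source
```

Each result row is a source that exceeded the limit in at least one second of the window, so the alert condition can stay at number of results > 0. Because tstats reads the index metadata rather than the raw events, it should stay reasonably cheap even while a log is growing by gigabytes.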


mikeely
Path Finder

That's close, and I do have some saved searches/alerts that behave like this (router flaps is a favorite of mine) but what I need here is more general. In English: "Alert whenever events from ANY single source (as Splunk defines "source") from host abc123 occur at a rate faster than 50 per second."
