Alerting on volume changes of data streams

danielbb
Motivator

With our cyber data, there are cases where data streams stop, for example because of a down forwarder or a bad DB connection, and cases where a stream's volume suddenly increases, such as Bluecoat cases, a DNS attack, and more.

We would like to alert on these cases without hardcoding the various indexes or sourcetypes. We also wonder whether there is a good way to do this in ITSI.


jschogel_splunk
Splunk Employee

Hello,

Just an idea, but if you are using data models and the CIM, consider scheduling tstats searches against the data models to count, for example, the number of authentication failures by host over the last 5 minutes. Then you won't care about the source, sourcetype, or host, so long as the data is in the data model (likely via the CIM). Something like this, which looks for the number of failed authentications per host over the last 5 minutes:

| tstats count from datamodel=Authentication where Authentication.action="failure" earliest=-5m@m by _time Authentication.src
| stats sum(count) AS failures by Authentication.src

Schedule this every 5 minutes, since it looks back 5 minutes within the data model (adjust the duration depending on your acceleration times, etc.). You'll then get a decent number you can alert on, and it will give you the src (host) with the increase in authentication failures to go investigate. It won't matter what the authentication source is (DNS, remote login, SSH, FTP, etc.); so long as it's in the data model, you'll get an alert. Just an idea.
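To turn that count into an alert, one option is to filter on a threshold inside the search and trigger when any rows come back. This is only a sketch: the threshold of 50 failures is a made-up placeholder, not a recommended value, and the renamed fields src and failures are just for readability.

```spl
| tstats count from datamodel=Authentication where Authentication.action="failure" earliest=-5m@m latest=@m by Authentication.src
| rename Authentication.src AS src, count AS failures
| where failures > 50
| sort - failures
```

Saved as a scheduled search with the trigger condition "number of results is greater than 0", this would fire only for hosts over the placeholder threshold.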

 


PickleRick
Influencer

Of course you can operate on aggregated metadata (you can do a tstats count across all indexes as well), but it will not reliably tell you about changes in a single data stream unless that stream is a large part of your overall event volume.
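As a rough illustration of a tstats count across all indexes, the search below groups by index and sourcetype so that nothing is hardcoded. The 15-minute window is an arbitrary choice, and index=* only covers the indexes the searching role is allowed to see.

```spl
| tstats count where index=* earliest=-15m@m latest=@m by index sourcetype
| sort - count
```

Dropping the by clause gives the single aggregated number, which is where a change in one small stream gets lost in the total.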


PickleRick
Influencer

Well, you have to have reasonably identifiable sources. What I mean by that is that you must be able to distinguish between data streams by some set of fields (or transformation/aggregation of some of them). Otherwise you're stuck with - for example - events coming from source=/var/log/messages on a host=localhost.

So it's actually up to you to come up with a proper grouping for the tstats count (I wouldn't use plain stats for counting the volume of all data).

Then it's just a matter of a proper timechart with timewrap and probably some foreach logic.
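A minimal sketch of the baseline comparison, under several assumptions: streams are identified by index and sourcetype, the baseline is the previous 7 days of hourly counts, and "unusual" means more than 3 standard deviations from the hourly average. Note that it swaps in stats with eval expressions rather than the timechart/timewrap/foreach route mentioned above, because that keeps one row per stream and avoids depending on generated column names.

```spl
| tstats count where index=* earliest=-7d@h latest=@h by _time span=1h index sourcetype
| eval is_last_hour=if(_time >= relative_time(now(), "-1h@h"), 1, 0)
| stats sum(eval(if(is_last_hour=1, count, 0))) AS last_hour_count
        avg(eval(if(is_last_hour=0, count, null()))) AS baseline_avg
        stdev(eval(if(is_last_hour=0, count, null()))) AS baseline_stdev
        by index sourcetype
| eval lower=baseline_avg - 3*baseline_stdev, upper=baseline_avg + 3*baseline_stdev
| where last_hour_count=0 OR last_hour_count > upper OR last_hour_count < lower
```

Two caveats: hours with no events produce no tstats row, so the baseline average is taken over non-empty hours only, and a stream that was silent for the whole 7 days will not appear at all. The window, span, and multiplier would all need tuning per environment.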
