With our cyber data, we have cases where streams of data stop (due to a down forwarder, a bad DB connection, etc.) and cases where streams suddenly increase in volume, such as Bluecoat cases, DNS attacks, and more.
We would like to alert on these cases without hardcoding the various indexes or sourcetypes. We also wonder whether there is a good way to do this in ITSI.
Hello,
Just an idea, but if you are using data models and the CIM, consider scheduling tstats searches against the data models to count, for example, the number of Authentication failures by host over the last 5 minutes. Then you won't care about the source, sourcetype, or host; as long as the data is in the data model (likely via the CIM), it gets counted. Something like this, which looks for the number of failed authentications per source host over the last 5 minutes...
| tstats count from datamodel=Authentication where Authentication.action=failure earliest=-5m@m by _time, Authentication.src
| stats sum(count) by Authentication.src
Schedule this every 5 minutes, since it looks back 5 minutes within the data model (adjust the duration depending on your acceleration times, etc.). Then you'll get a decent number you can alert off of, and it will give you the src (host) with the increase in Authentication failures for you to investigate. It won't matter what the authentication source is (DNS, remote login, SSH, FTP, etc.); as long as it's in the data model, you'll get an alert. Just an idea.
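To turn that into an alert, one option is to append a simple threshold; the cutoff of 10 failures below is an arbitrary placeholder you'd tune for your environment:

| tstats count from datamodel=Authentication where Authentication.action=failure earliest=-5m@m by _time, Authentication.src
| stats sum(count) as failures by Authentication.src
| where failures > 10

Then configure the saved search to trigger when the number of results is greater than zero.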
Of course you can operate on aggregated metadata (you can do a tstats count across all indexes as well), but it will not reliably reveal changes in a single data stream unless that stream is a big part of your overall event volume.
Well, you have to have reasonably identifiable sources. What I mean by that is that you must be able to distinguish between data streams by some set of fields (or a transformation/aggregation of some of them). Otherwise you're stuck with, for example, indistinguishable events coming from source=/var/log/messages on host=localhost.
So it's actually up to you to come up with a proper grouping for tstats count (I wouldn't use plain stats to count the volume of all data).
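A grouped count could look something like this (a sketch; adjust the split-by fields to whatever actually distinguishes your streams, and the time range to your needs):

| tstats count where index=* earliest=-24h@h by index, sourcetype, host, _time span=5m

Note that a stream that has stopped simply disappears from the results, so for "stream went quiet" detection you'd typically compare the groups returned against a lookup of expected streams rather than rely on the counts alone.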
Then it's just a matter of a proper timechart with timewrap, and probably some foreach logic to compare the periods.
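A rough sketch of that comparison, wrapping week over week; the wrapped column names (events_latest_week, etc.) come from timewrap's default relative series naming, so verify them against your Splunk version before alerting on them:

| tstats count where index=* by _time span=1h
| timechart span=1h sum(count) as events
| timewrap 1week
| foreach events_* [ eval pct_of_latest_<<MATCHSTR>> = round(100 * '<<FIELD>>' / events_latest_week, 1) ]

The foreach computes each prior week's bucket as a percentage of the current week's, which gives you a normalized number to threshold on for both drops and spikes.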