Alerting

How to get counts of different periods, do an avg, and define bounds

zZeb
Explorer

Hello,

I'm struggling to do the following:
Count the volume for the last 5 min before the current time at -7d, -14d, -21d, -28d (basically keeping the same day of the week)

Compute the avg and stdev of those counts,
Define a range based on this,
Get the count for the last 5 min before the current time and flag when it is out of the range
All this in a table so I can use it in Alerts

I've read a lot, but I couldn’t come up with anything close enough so far. I’m still new to Splunk 😊
Thank you!


zZeb
Explorer

Oh sorry,

Basically, an alert will run every minute or so.
The search will count the number of events on the same day of the week for the 4 previous weeks, but only within the same 5-minute window ending at the current time of day.

So if it’s now 13h00, it’d count events in 12h55-13h00 for D-7, D-14, D-21, D-28.
That gives you 4 values from which you can calculate an avg and stdev.

Based on this you can define a lowerBound and an upperBound (something like avg-stdev and avg+stdev).
You then count events in 12h55-13h00 of today and set isOutlier depending on whether that count falls inside your defined range or not.
Table-wise, that would be something like this, I guess:

time period | D-7 | D-14 | D-21 | D-28 | avg | stdev | upperBound | lowerBound | D | isOutlier
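
The arithmetic behind those columns can be sketched outside Splunk to pin down what each one would hold. This is a minimal Python sketch, not SPL. The function name `check_window`, the example counts, the use of the sample standard deviation, and the band width of one standard deviation are all assumptions made for illustration.

```python
from statistics import mean, stdev

def check_window(history, current, k=1.0):
    """history: counts for the same 5-minute window on D-7, D-14, D-21, D-28.
    current: count for today's window (column D).
    k: band width in standard deviations (assumed to be 1 here)."""
    avg = mean(history)
    sd = stdev(history)  # sample standard deviation; needs at least 2 values
    lower = avg - k * sd
    upper = avg + k * sd
    return {
        "avg": avg,
        "stdev": sd,
        "lowerBound": lower,
        "upperBound": upper,
        "D": current,
        "isOutlier": current < lower or current > upper,
    }

# Hypothetical counts for the four previous weeks, and no events today.
row = check_window([120, 100, 110, 90], current=0)
print(row)
```

With only four historical values the stdev is a rough estimate, so the band may need to be wider than one standard deviation in practice. If the real concern is missing events, checking only against lowerBound may be enough.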

If possible, it also needs to be CPU friendly; there is an automated check because they don’t like expensive searches 😋


PickleRick
SplunkTrust

Ok. I would probably go for summary indexing because tstats doesn't support multiple time ranges. Launch a count search, store the result and only process the pre-summarized counts later.

But the question is why you would want to spawn a search every minute. That seems like overkill, and you might run into a whole host of problems with scheduling, delays, event lag and so on. Not to mention that you're going to be spawning a great many searches throughout the day.


zZeb
Explorer

Do you have any code example based on your explanation? 
That would really help me


zZeb
Explorer

No matter how much I challenge my management, they strongly insist on knowing within 5 minutes when there are no events, basically before one of the 20k users tells us. Depending on how long the job takes, I'll change the schedule from every minute to every 5 or 10 minutes, or whatever looks acceptable.


isoutamo
SplunkTrust

So basically your issue is knowing whether some data integrations haven't sent events even though they should have?

There are several apps and examples in the community that show how this can be solved.

isoutamo
SplunkTrust

Here is an old post that discusses this issue: https://community.splunk.com/t5/Splunk-Search/How-to-find-computers-which-stopped-sending-logs/m-p/6.... It contains one example and several links to other resources and apps that handle this.


PickleRick
SplunkTrust

A few more words, please, because this gets unclear quickly.

I assume that you want to search for events from -5m until now, from -7d-5m until -7d, and so on for the last 4 weeks.

That's pretty clear.
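
Written out explicitly, those windows look like the following. This is a small Python sketch rather than a Splunk search; the function name `comparison_windows` and the example timestamp are hypothetical, and naive datetimes are assumed, so daylight-saving shifts are ignored.

```python
from datetime import datetime, timedelta

def comparison_windows(now, width=timedelta(minutes=5), weeks=4):
    """Return (label, start, end) for today's window and for the same
    window 7, 14, ... days earlier."""
    windows = []
    for w in range(weeks + 1):
        end = now - timedelta(days=7 * w)
        label = "D" if w == 0 else f"D-{7 * w}"
        windows.append((label, end - width, end))
    return windows

# Arbitrary example time of 13:00, matching the 12h55-13h00 window above.
windows = comparison_windows(datetime(2024, 5, 15, 13, 0))
for label, start, end in windows:
    print(label, start, end)
```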

But after that...

What is "volume"? A count of events? Sum of their size? Something else?

What do you mean by "define a range based on this"?

 
