Splunk Search

Find out if count at a specific time is below the average

Path Finder

UPDATE:

I have created a search/alert that should notify me if:

  1. Index data is 0 for a particular hour
  2. Index data count is below the low tail of the count (Mean - 2*Standard Deviation, roughly the bottom 2-3% for normally distributed data)

    | tstats count WHERE earliest=-30d@h latest=now index=* by index, _time span=1h 
    | makecontinuous span=1h _time 
    | eval count=if(isnull(count),0,count) 
    | eval time_group = floor(tonumber(strftime(_time,"%H"))/3)
    | bin _time as myday span=1d
    | eval weekday=strftime(_time,"%a")
    | where time_group=floor(tonumber(strftime(now(),"%H"))/3)-1 AND weekday=strftime(now(),"%a")
    | eventstats min(_time) as _time sum(count) AS ThreeHourCount avg(count) as MonthlyAverageCount stdev(count) as MonthlyStdDev by index time_group myday weekday
    | eval MonthlyAverageCount=round(MonthlyAverageCount,2), MonthlyStdDev=round(MonthlyStdDev,2)
    | where strftime(now(),"%Y-%m-%d")=strftime(_time,"%Y-%m-%d") AND (count=0 OR ThreeHourCount<MonthlyAverageCount-(2*MonthlyStdDev))
    

So I start by checking every hour, then group everything into 3-hour blocks (since my search will be running every 3 hours).
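As a sanity check on the bucketing (a Python sketch, not part of the search itself), the `floor(tonumber(strftime(_time,"%H"))/3)` grouping maps each hour of the day onto one of eight 3-hour blocks:

```python
from datetime import datetime

def time_group(ts: datetime) -> int:
    # Equivalent of eval time_group = floor(tonumber(strftime(_time,"%H"))/3):
    # hours 0-2 -> block 0, 3-5 -> block 1, ..., 21-23 -> block 7
    return ts.hour // 3
```

So an event at 04:37 lands in block 1, and all 24 hours collapse into blocks 0 through 7.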

Based on these searches, do you see anything wrong with my steps? Any room for improvement? Also, this takes about 40 seconds to run in Splunk Cloud. Is there any way to make it faster?


Re: Find out if count at a specific time is below the average

SplunkTrust

One way to do this could be like so. For testing, start with the first line and add one line at a time so you can confirm what it's doing at each step:

| tstats count AS hourlyCount WHERE latest=@h earliest=-24h@h index=* by index, _time span=1h 
| eventstats avg(hourlyCount) as AverageCountPerDay by index
| sort - _time
| head 1

The first line is much like your own tstats search, except I time-limit it to the last 24 hours (snapping to the hour mark at both ends so that partial hours are removed, since those will obviously have a skewed count because they are incomplete).

We then use eventstats to build our overall average hourly count for each index.

The remaining lines are simple: we sort by _time so the results are in descending order, then take the most recent one with the head command.
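To make the eventstats step concrete: unlike stats, eventstats keeps every row and attaches the aggregate to each one. A minimal Python sketch of that behavior, using made-up sample rows:

```python
from collections import defaultdict

# Hypothetical hourly rows, as tstats would produce them
rows = [
    {"index": "web",  "_time": 1, "hourlyCount": 100},
    {"index": "web",  "_time": 2, "hourlyCount": 300},
    {"index": "mail", "_time": 1, "hourlyCount": 50},
]

# eventstats avg(hourlyCount) BY index: compute the per-index mean...
sums = defaultdict(lambda: [0, 0])
for r in rows:
    sums[r["index"]][0] += r["hourlyCount"]
    sums[r["index"]][1] += 1

# ...then attach it to every row without collapsing the rows
for r in rows:
    total, n = sums[r["index"]]
    r["AverageCountPerDay"] = total / n

# sort - _time | head 1: the single most recent row
latest = sorted(rows, key=lambda r: r["_time"], reverse=True)[0]
```

Every row survives with its own count plus the group average alongside it, which is what lets the later where clause compare the two.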

Happy Splunking!


Re: Find out if count at a specific time is below the average

SplunkTrust

@mkarimi17 - I would STRONGLY suggest that you use something other than avg to determine your alert. By definition, that will alert about half the time; in real life, probably more like 2/3 of the time.

More likely, you probably want something along the lines of "below the 2nd percentile", or "2.5 stdevs below the average".
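A quick simulation (an illustration in Python, not Splunk code) shows why: for roughly normal data, about half of all samples fall below the mean, while only around 2-3% fall below mean - 2*stdev:

```python
import random
import statistics

# Simulated hourly counts: mean 1000, stdev 100 (hypothetical numbers)
random.seed(0)
counts = [random.gauss(1000, 100) for _ in range(10_000)]

mean = statistics.fmean(counts)
sd = statistics.pstdev(counts)

# Fraction of hours that would alert under each rule
below_mean = sum(c < mean for c in counts) / len(counts)        # ~50%
below_2sd = sum(c < mean - 2 * sd for c in counts) / len(counts)  # ~2-3%
```

An alert on "below average" fires roughly every other hour; an alert on "two stdevs below average" fires only on genuinely unusual hours.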


Re: Find out if count at a specific time is below the average

SplunkTrust

Great point, DalJeanis. I should have thought of that myself.

mkarimi17, below is one option to pull out hours where the hourly count was greater than the average plus two standard deviations (roughly a 95% threshold). Or: "Show me data in the last 24 hours where the hourly count was greater than 95% of all counts." Or: "Show me the top 5th percentile of activity."

| tstats count AS hourlyCount WHERE latest=@h earliest=-24h@h index=* by index, _time span=1h 
| eventstats avg(hourlyCount) as AverageCountPerDay stdev(hourlyCount) as StdDev by index 
| where hourlyCount> AverageCountPerDay+(2*StdDev)

If you run that, you might get one hit each day for your "peak hour". You could use 3x your standard deviation, too - that would be 99.7% - or in other words, alert only if the hour is really unusual!

But now that we have that piece, switch it back around to how I originally had it, with that filter added at the end.

| tstats count AS hourlyCount WHERE latest=@h earliest=-24h@h index=* by index, _time span=1h 
| eventstats avg(hourlyCount) as AverageCountPerDay stdev(hourlyCount) as StdDev by index 
| sort - _time 
| head 1
| where hourlyCount> AverageCountPerDay+(2*StdDev)

So we mostly do it like before, except we add a standard deviation into the eventstats so we have that number to work with; then at the very end we filter out that most recent record UNLESS it happens to be anomalous (beyond 2x stdev, roughly the 95% threshold). You could then alert hourly on any result.
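The final filter can be sketched outside Splunk too. This hypothetical Python illustration shows "keep the latest hour only if its count is more than two standard deviations beyond the average" on made-up numbers:

```python
import statistics

# Hourly counts for one index, oldest -> newest (hypothetical data)
hourly = [100, 110, 95, 105, 90, 500]

avg = statistics.fmean(hourly)   # eventstats avg(hourlyCount)
sd = statistics.pstdev(hourly)   # eventstats stdev(hourlyCount)
latest = hourly[-1]              # sort - _time | head 1

# where hourlyCount > AverageCountPerDay + (2 * StdDev)
alert = latest > avg + 2 * sd
```

With the spike of 500 against a baseline near 100, the latest hour clears the 2-sigma threshold and would survive the filter; replace the 500 with another ~100 and the search returns nothing, so no alert fires.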

Without the |sort, |head and |where in there, you could timechart it nicely too for a dashboard.


Re: Find out if count at a specific time is below the average

Path Finder

How does this look?

| tstats count AS hourlyCount WHERE earliest=-7d@d index=* by index, _time span=1h 
| eval now_hour=strftime(now(),"%H") 
| eval time_hour=strftime(_time,"%H") 
| where time_hour=now_hour
| eventstats avg(hourlyCount) as AverageCountPerDay stdev(hourlyCount) as StdDev by index 
| where hourlyCount<AverageCountPerDay-(2*StdDev) 
| table _time, index, hourlyCount, AverageCountPerDay, StdDev

Going back one week, looking at that specific hour on each day and comparing those counts.
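The hour-matching step (keep only rows whose hour of day equals the current hour, so each day contributes one comparable sample) can be sketched like this, with hypothetical timestamps:

```python
from datetime import datetime

# Seven days of hourly buckets, three sample hours per day (made-up data)
samples = [datetime(2024, 1, day, hour)
           for day in range(1, 8)
           for hour in (3, 9, 15)]

now_hour = 9  # stand-in for strftime(now(), "%H")

# where time_hour=now_hour: one matching bucket per day
same_hour = [t for t in samples if t.hour == now_hour]
```

Out of 21 buckets, exactly the seven 09:00 buckets survive - one per day - which is the population the average and stdev are then computed over.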


Re: Find out if count at a specific time is below the average

Path Finder

Thoughts on the following:

| tstats count WHERE earliest=-7d@d latest=now index=* by index, _time span=1h 
| eval group_by = floor(tonumber(strftime(_time,"%H"))/3) 
| bin _time as myday span=1d
| eventstats max(_time) as _time sum(count) AS ThreeHourCount avg(count) as AverageCount stdev(count) as StdDev by index group_by myday
| eval now_hour=tonumber(strftime(now(),"%H")) 
| eval time_hour=tonumber(strftime(_time,"%H")) 
| eval time=strftime(_time,"%Y-%m-%d %H:%M")
| where time_hour=now_hour-1 AND ThreeHourCount<AverageCount-(2*StdDev)