
Comparing counts of previous hour with the counts from the same day and hour of the previous 3 weeks

New Member

Looking for a little help comparing the count for the past hour with the counts for the same hour on the same day of the previous 3 weeks, and reporting if it is lower than the average of those 3 weeks.

For example, today is Thursday and I want to count how many times a specific URI is hit in the past hour. I want to compare that count with the counts for the same hour on the previous 3 Thursdays, and trigger an alert if today's count is lower than the average of the last 3 weeks.

Thanks in advance!


SplunkTrust

Hi tfernalld,
take a look at this app: https://splunkbase.splunk.com/app/1645/

SplunkTrust

This generates some random test data...

| gentimes start="03/01/2017:00:00:00" end="03/30/2017:01:00:00" increment=10m 
| eval myurl="url1 url2 url3" | makemv myurl | mvexpand myurl 
| eval _time = starttime | eval value=10+random()%600 | table _time myurl value 
| bin _time  span=1h | stats sum(value) as hits by myurl _time

... something like this determines the timeframe you actually want to look at. This selects the entire prior day.

| eval todaystart=relative_time(now(),"-1d@d")
| eval todayend=relative_time(now(),"@d")

...or ...

| eval todaystart=strptime("2017-03-29","%Y-%m-%d")
| eval todayend=strptime("2017-03-30","%Y-%m-%d")
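For readers less familiar with `relative_time`, the window arithmetic can be sketched outside SPL. This is a rough Python analogue (function name is mine, not from the thread): snap "now" back to midnight, then step back one day to get the start of the prior day.

```python
from datetime import datetime, timedelta

def prior_day_window(now):
    """Return (start, end) covering the entire prior calendar day,
    analogous to relative_time(now(), "-1d@d") and relative_time(now(), "@d")."""
    today_midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return today_midnight - timedelta(days=1), today_midnight

start, end = prior_day_window(datetime(2017, 3, 30, 9, 15))
# start is 2017-03-29 00:00:00, end is 2017-03-30 00:00:00
```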

... and here we get to the actual code. We've selected data for at least the prior 3 weeks. We are going to add zero weeks, one week, two weeks, and three weeks to the _time of the event record and release a record into the future for each of those.

| eval weekwrap="0 1 2 3" | makemv weekwrap | mvexpand weekwrap 
| eval _time = _time + weekwrap*604800 
| eval week = case(weekwrap=0,"current",true(),"prior_".weekwrap)
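The trick here is that a record from N weeks ago, shifted forward by N weeks, lands inside the current day's window, while the `week` label remembers where it came from. A minimal Python sketch of the same idea (names are illustrative, not from the thread):

```python
WEEK_SECONDS = 604800  # 7 days * 24 hours * 3600 seconds

def week_wrap(events):
    """events: list of (epoch, url, hits). Emit four copies of each record,
    shifted forward by 0..3 weeks and labelled like the SPL `week` field,
    so that data from N weeks ago lands in the current day's window."""
    out = []
    for t, url, hits in events:
        for w in range(4):
            label = "current" if w == 0 else "prior_%d" % w
            out.append((t + w * WEEK_SECONDS, url, hits, label))
    return out
```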

... this throws away all the records that fall outside the time frame we care about, then sums them up by hour (as requested) for each URL...

| where (_time < todayend) AND (_time>=todaystart) 
| eval Time = strftime(_time,"%H:%M")
| stats sum(hits) as hits by myurl Time week

... now we're going to add to those a record that calculates the average of the prior weeks....

| appendpipe [| where like(week,"p%") | stats avg(hits) as hits by myurl Time | eval week="prior_avg",hits=ceiling(hits)] 
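The `appendpipe` keeps the original rows and appends one averaged row per url and hour, computed only over the `prior_*` weeks. A hedged Python equivalent of that step (my own sketch, not Splunk code):

```python
import math
from collections import defaultdict

def append_prior_avg(rows):
    """rows: list of (myurl, Time, week, hits). Append one 'prior_avg' row
    per (myurl, Time), averaging only the prior_* weeks and rounding up,
    mirroring the appendpipe above."""
    groups = defaultdict(list)
    for url, t, week, hits in rows:
        if week.startswith("p"):               # like(week, "p%")
            groups[(url, t)].append(hits)
    return rows + [(url, t, "prior_avg", math.ceil(sum(v) / len(v)))
                   for (url, t), v in sorted(groups.items())]
```

For the url1 03:00 row in the sample output below, the priors 1859, 1611, 1731 average to ceiling(5201/3) = 1734, which matches the table.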

... and then roll them all together into one record per url and timeframe

| eval {week} = hits
| stats sum(*) as * by myurl Time
| table myurl Time curr* prior*
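The `eval {week} = hits` trick turns each week label into a field name, and the `stats sum(*)` collapses the labelled rows into one record per url and timeframe. The pivot can be sketched in Python like this (names are mine):

```python
from collections import defaultdict

def pivot_weeks(rows):
    """rows: (myurl, Time, week, hits) -> {(myurl, Time): {week: hits}},
    one record per url and timeframe with a column per week label,
    mirroring `| eval {week} = hits | stats sum(*) as * by myurl Time`."""
    table = defaultdict(dict)
    for url, t, week, hits in rows:
        table[(url, t)][week] = hits
    return dict(table)
```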

... and finally, select only the URLs and hours that have dropped below average

| where current < prior_avg

...resulting in output that looks approximately like this ...

myurl   Time      current     prior_1     prior_2     prior_3     prior_avg   
url1    03:00     1123        1859        1611        1731        1734        
url1    04:00     1687        2079        2015        2079        2058        
url1    06:00     1279        1959        2207        2223        2130        
url1    08:00     1151        1711        2727        2879        2439        
url1    17:00     1707        2075        1547        2187        1937        
url1    21:00     827         2283        1587        2099        1990        
url2    00:00     1841        1841        2289        2617        2249        
url2    03:00     1677        2061        1813        1621        1832        
url2    04:00     2137        1937        2529        2481        2316        
url2    05:00     1133        1517        1197        2333        1683        
url2    09:00     1533        1461        2301        2021        1928        
url2    10:00     1329        1673        1489        1489        1551        
url2    13:00     2021        1461        3021        1629        2037        
url2    14:00     1905        2057        1929        2193        2060        
url2    15:00     2149        1661        2725        2293        2227        
url2    16:00     1473        1801        1577        1473        1617        
url2    17:00     1341        1733        701         1733        1389        
url2    19:00     1573        1245        2109        1733        1696       

SplunkTrust

One suggestion. "Less than average" is going to alert about 50% of the time. You might want to consider alerting only when this hour's numbers are significantly less than average - the below version of the code alerts only when the numbers are at least two standard deviations below the norm, which for roughly normal data is about 2% of the time.

| where (_time < todayend) AND (_time>=todaystart) 
| eval Time = strftime(_time,"%H:00")
| stats sum(hits) as hits by myurl Time week
| appendpipe 
    [| where like(week,"p%") 
     | stats avg(hits) as hits, stdev(hits) as stdevhits by myurl Time 
     | eval week=("prior_avg=".ceiling(hits)."!!!!prior_stdev=".ceiling(stdevhits)) 
     | table myurl Time week 
     | makemv delim="!!!!" week 
     | mvexpand week 
     | makemv delim="=" week 
     | eval hits=mvindex(week,1) 
     | eval week=mvindex(week,0)]
| eval {week} = hits
| stats sum(*) as * by myurl Time
| table myurl Time curr* prior*
| where current<prior_avg -2*prior_stdev
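The final test is easy to sanity-check outside Splunk. A small Python sketch of the same threshold (my own analogue, not Splunk code):

```python
import statistics

def is_anomalously_low(current, prior_hits):
    """True when `current` is more than two sample standard deviations
    below the mean of the prior weeks' counts, matching the final
    `where current < prior_avg - 2*prior_stdev` test."""
    avg = statistics.mean(prior_hits)
    sd = statistics.stdev(prior_hits)   # sample stdev, like SPL's stdev()
    return current < avg - 2 * sd
```

With priors [2000, 2100, 2050] the mean is 2050 and the sample stdev is 50, so only counts below 1950 would alert. Note that with only three prior weeks the stdev estimate is quite noisy, so treat the 2-sigma rule as a rough guide.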
Ultra Champion

Thank you @daljeanis for calling out the flaws with analyzing averages! Percentiles and stddevs are great options!

SplunkTrust

You're welcome. I try to answer the question asked, and occasionally also the question that should have been asked. Obviously, you could add a prior_pct10 field using the same strategy inside the appendpipe and see how well it fits your data.
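For reference, a nearest-rank 10th percentile is trivial to compute; with only three prior weeks it degenerates to the minimum, so it pays off mainly as more history accumulates. A sketch under that assumption (illustrative only; in SPL you would reach for the `perc10()` stats function inside the appendpipe):

```python
import math

def pct10(values):
    """Nearest-rank 10th percentile. With only 3 prior weeks this is just
    the minimum; it becomes more informative as history grows."""
    ordered = sorted(values)
    k = max(0, math.ceil(0.10 * len(ordered)) - 1)
    return ordered[k]
```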

Ultra Champion

I'll use this to pimp my talk from .conf on a similar passion. Of course let me know if you have any tips I should adjust for the future: http://conf.splunk.com/files/2016/recordings/writing-actionable-alerts.mp4

New Member

Yes thank you. I'm 100% going to go the standard deviation route.

Explorer

Could anybody explain what this part is for?

"... and release a record into the future for each of those."

| eval _time = _time + weekwrap*604800
