How to add based on the 4 hour timeperiods

Motivator

Hello

I am trying to create a solution which compares 4 hour counts of a day with a average of the same 4hour period of time across last 60 days.

``````_time                   Org   Count

8/19/13 12:00:00 AM      A     5
8/19/13 4:00:00 AM       A     5
8/19/13 8:00:00 AM       A     5
8/19/13 12:00:00 PM      A     5
8/19/13 8:00:00 PM       A     5
8/18/13 12:00:00 AM      A     5
8/18/13 4:00:00 AM       A     5
8/18/13 8:00:00 AM       A     5
8/18/13 4:00:00 PM       A     5
8/18/13 8:00:00 PM       A     5
8/17/13 8:00:00 PM       A     5
``````

As you can see, its not necessary that it will output for all 6 time cycles for a day. Can anyone please help me on creating a logic which can help me in doing mathematical calculations based on the timeperiods. It needs to add 8/19/13 12:00:00 AM with 18 12:00:00 AM, 17th 12:00:00 AM and so on.

Thanks
theou

Tags (1)
Champion

It is not good English. I'm sorry if I wrong.
I tried by aggregating every 4 hours to make the data in timechart.

(ex.)
earliest=-60d@d latest=@d |timechart span=1h count |eval Tgtime=tostring(floor(tonumber(strftime(_time,"%H"))/4)*4)|stats sum(count) as last60days by Tgtime|eval last60days=last60days/60| join [search earliest=-d@d latest=@d |timechart span=1h count |eval Tgtime=tostring(floor(tonumber(strftime(_time,"%H"))/4)*4)|stats sum(count) as daybefore by Tgtime]

Results

Explorer

I can't speak to your exact 4 hour need, however, I can show you how I did it minute by minute. I think you should be able to combine what was outlined by dwaddle above with what I have below. You can tweak it as well if you don't want need to calculate an entire weeks worth of data at time.

Summary Index: runs every minute.
``` _time User_Count_Per_Minute```

8/19/13 12:00:00 AM 5
8/19/13 12:01:00 AM 6
8/19/13 12:02:00 AM 6

Everynight, I have a CSV file that caclulates the previous days averages and is used throughout the day. (It actually creates an entire weeks worth of data. This is incase I want to run ad-hoc queries against a day other than the current day)

Runs @ 12:05AM every night

CSV Search
``` index=your_summary source="your_summary_index" earliest=-30d@d latest=@d | eval equalized_time = strftime(_time,"%A %H:%M) | stats avg(User_Count_Per_Minute) AS Per_Minute_Average by equalized_time | table equalized_time Per_Minute_Average | outputlookup 30_day_user_average.csv ```

Results (Ends up with about 10K rows)
``` equalized_time Per_Minute_Average Fri 5:05 30 Fri 5:06 95 .... Fri 23:59 13 ```

Dashboard View - Used to graph the last 3 hours of activity
``` index=your_summary source="your_summary_index" earliest=-3h@h latest=-1m@m | eval equalized_time = strftime(_time,"%A %H:%M) | fields + _time equalized_time User_Count_Per_Minute | JOIN equalized_time [ |inputlookup 30_day_user_average.csv | fields + equalized_time Per_Minute_Average ] | table _time equalized_time User_Count_Per_Minute Per_Minute_Average ```

It is also possible to use appendcols which is how my search orginally started before I used a summary index. The appendcols only works after a stats aggreagtion command which is no longer present because it is performed in the summary index.

Super Champion

Sounds like you need a summary index, but I don't know anything about summary indexes... yet...

SplunkTrust

Not too difficult. Let's approach in pieces. I'll assume for now that you have data over the whole time range. We'll start by aggregating by each hour, 00 - 24, like this:

``````sourcetype=yoursourcetype | eval hour_of_day=strftime(_time,"%H")
| stats count by hour_of_day
``````

These have all of the "7" hours aggregated together over your search range, as well as all of the "11" hours and so on. Now, we need to break these 24 different 1-hour blocks into 4 hours blocks. We can use `rangemap` to start with, in order to assign each of the 24 1-hours a 'range'

``````sourcetype=yoursourcetype | eval hour_of_day=strftime(_time,"%H")
| stats count by hour_of_day
| rangemap field=hour_of_day 00=00-03 04=04-07 08=08-11 12=12-15 16=16-19 20=20-23
``````

So now, we can use that new `range` to summarize into 4-hour blocks. We'll use `stats` again, but this time instead of counting we'll add up the previously computed counts. We remove the old 'hour' field and replace it with the value of 'range'.

``````sourcetype=yoursourcetype | eval hour_of_day=strftime(_time,"%H")
| stats count by hour_of_day
| rangemap field=hour_of_day 00=00-03 04=04-07 08=08-11 12=12-15 16=16-19 20=20-23
| fields - hour_of_day
| rename range as hour_of_day
| stats sum(count) as count by hour_of_day
``````

And that should work...

Motivator

I already have a 4 hour summary as the threshold will be on a 4h period. Will try it, but I think this will work.

SplunkTrust

Well, in that case .. I would probably keep each hourly count in a summary index, and then when I needed it hit up against the summary, and aggregate the data from the summary into 4-hour blocks when I needed it. That way, I have a little flexibility for if I need 2-hour hour averages or 5-hour averages or what have you - the summary supports any of them at a reasonable level of summarization. Like Gerald says, "Summary Indexing is a design decision"

Super Champion

Very nice, but how do you keep track of 60 days worth of data so that you can calculate an average for each 4 hour block over those days.

SplunkTrust

Indeed! I had to get on a conf call, and clicked 'post' too soon. Working on finishing it 🙂

Ultra Champion

duck, that does not look really complete.

Get Updates on the Splunk Community!

Enter the Splunk Community Dashboard Challenge for Your Chance to Win!

The Splunk Community Dashboard Challenge is underway! This is your chance to showcase your skills in creating ...

.conf24 | Session Scheduler is Live!!

.conf24 is happening June 11 - 14 in Las Vegas, and we are thrilled to announce that the conference catalog ...

Introducing the Splunk Community Dashboard Challenge!

Welcome to Splunk Community Dashboard Challenge! This is your chance to showcase your skills in creating ...