Solved: Report hourly max count events per day over a mont...

egrignon · ‎08-07-2012

Hello,

I m trying to get the hour per day which gets the most hits on my application over a month but having some issues to get the right data output.

I would like to get a table report which would have:

DAY1 HOURX MaxEventNumber

DAY2 HOURX MaxEventNumber

....

I tried the following queries but none of them work. I think I m missing something on splunk reporting concepts:

| timechart span=1h count as HourlyCount | timechart span=1d max(HourlyCount)
| stats count AS hit BY date_hour, date_mday | stats max(hit) BY date_hour, date_mday
| timechart span=1h count | convert timeformat="%D" ctime(_time) AS c_time | chart max(count) AS MaxPerDay by c_time | table MaxPerDay date_hour

Any help would be greatly appreciated,

Thanks,

EG

lguinn2 · ‎08-07-2012

Try this:

| stats count as hit by date_hour, date_mday 
| eventstats max(hit) as maxhit by date_mday 
| where hit=maxhit
| fields - maxhit

I am not sure it will work. But it should figure out the max hits for each day, and only keep the events with that have have the maximum number.

New and better solution, which works for for all events - not just those with date_* fields:

yoursearchhere
| timechart count as hit span=1h
| eval date=strftime(_time, "%x")
| eventstats max(hit) as maxhit by date
| where hit=maxhit
| fields - maxhit

View solution in original post

dwaddle · ‎01-02-2014

mmm, old post!

I like to use a slightly different approach. It is very clear, and pretty cheap too. You can pipe one timechart into another, using different time resolutions. Like:

sourcetype=access_combined GET OR POST 
| timechart span=1h count as count 
| timechart span=1d max(count) as max

The first timechart gives you a count by hour, piped into the second one which pulls out only the biggest hour per day. The part you lose is what hour of the day was the busy hour. If which hour was the busiest hour is not important to you, then this is a quick and easy approach.

lguinn2 · ‎01-02-2014

I like this. Also, I just realized that I would give a different answer at this point!

cygnetix · ‎07-22-2015

Simple, yet effective. Thanks.

egrignon · ‎08-08-2012

Thank you very much both of you.

It seems both solutions work. Do you know which method is best practice or best from a performance perspective?

EG

lguinn2 · ‎08-08-2012

With multiple indexers and/or lots of data, I think that eventstats may be slightly more efficient.

You can always run both and compare - use the Search Job Inspector (the 'i' button in 4.3, and under the Actions menu in earlier versions).

sideview · ‎08-07-2012

restating your problem to help clarify. I think you want to end up with a table where each day is only represented by one row, and that row represents the busiest hour during that day.

where the fields in the table are 'day of month', 'hour of day' and 'event count'

Here's the search you need:

<your search terms> | stats count by date_hour date_mday | sort - date_mday count | streamstats count as perMdayCount by date_mday | where perMdayCount=1

it's weird so lets walk it through step by step. You can run the query piece by piece and look at what the results look like at each pipe character (this is the best way to understand every complex search).

The stats clause gives you the count of events rolled up by unique combinations of day-of-month and hour-of-day.

Next the sort clause sorts it primarily by the day of month. But then within those the highest counts will be first in that subset.

Next we use streamstats count, with a very important by date_mday, to actually paint little numbers on each event. If streamstats with a by clause makes your head hurt that's totally normal. Just play with it and you'll understand what it does. In this case since within each subset of a given day-of-month we just sorted the date_hours such that the highest count hour ends up first in the set, these highest count hours all get painted with a 'perMdayCount' field whose value is "1".

Which ends up being marvelously convenient because now we can just use a where clause to match only rows where perMdayCount is "1", and this conveniently throws away all the hours that were not the highest count hour in that given day.

lguinn2 · ‎08-07-2012

Try this:

| stats count as hit by date_hour, date_mday 
| eventstats max(hit) as maxhit by date_mday 
| where hit=maxhit
| fields - maxhit

I am not sure it will work. But it should figure out the max hits for each day, and only keep the events with that have have the maximum number.

New and better solution, which works for for all events - not just those with date_* fields:

yoursearchhere
| timechart count as hit span=1h
| eval date=strftime(_time, "%x")
| eventstats max(hit) as maxhit by date
| where hit=maxhit
| fields - maxhit

splunk_zen · ‎01-24-2013

Iguinn,
I've a similar requirement but having trouble joining the search and subsearch fields with the where clause.
Can you please take a look at http://splunk-base.splunk.com/answers/72928/subsearch-busy-hour-cpu-usage ?

ambujhbti · ‎06-25-2015

Thank you. This help me with my work ! Cheers 🙂

Report hourly max count events per day over a month

Observe and Secure All Apps with Splunk

Splunk Decoded: Business Transactions vs Business IQ

Fastest way to demo Observability

Are you a member of the Splunk Community?

Report hourly max count events per day over a month

Observe and Secure All Apps with Splunk

Splunk Decoded: Business Transactions vs Business IQ

Fastest way to demo Observability