I'm looking to get some summary statistics by date_hour on the number of distinct users in our systems.
Given a data set that looks like:
OCCURRED_DATE=10/1/2016 12:01:01; USERNAME=Person1
OCCURRED_DATE=10/1/2016 12:02:01; USERNAME=Person1
OCCURRED_DATE=10/1/2016 12:02:05; USERNAME=Person2
OCCURRED_DATE=10/2/2016 12:01:01; USERNAME=Person1
OCCURRED_DATE=10/2/2016 12:02:01; USERNAME=Person1
I know that dc(USERNAME) for the 12:00 date_hour on 10/1/2016 is 2, and that dc(USERNAME) for the 12:00 date_hour on 10/2/2016 is 1. I'd like Splunk to give me the average of those per-day values for each date_hour (i.e. 1.5 for the 12:00 hour).
I've tried several different iterations of the below without any success.
sourcetype=usage | timechart span=1h dc(USERNAME) as user_count | stats avg(user_count) by date_hour | sort date_hour
My original attempt included the below, which also doesn't produce results.
sourcetype=usage | stats avg(dc(USERNAME)) by date_hour | sort date_hour
timechart drops the built-in date_* fields (such as date_hour) from its output, so the stats that follows has no date_hour to group by. If you really want to average across the same hour for all days, you can use eval's strftime() to recalculate the hour from _time:
sourcetype=usage | timechart span=1h dc(USERNAME) as dc | eval hour=strftime(_time, "%H") | stats avg(dc) by hour
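To sanity-check what that search should return, here is a minimal Python sketch of the same two-stage calculation run on the sample events from the question. It is only an illustration of the arithmetic, not Splunk code; the `events` list and the function name `avg_distinct_users_by_hour` are made up for this example.

```python
from collections import defaultdict
from datetime import datetime

# Sample events copied from the question: (OCCURRED_DATE, USERNAME)
events = [
    ("10/1/2016 12:01:01", "Person1"),
    ("10/1/2016 12:02:01", "Person1"),
    ("10/1/2016 12:02:05", "Person2"),
    ("10/2/2016 12:01:01", "Person1"),
    ("10/2/2016 12:02:01", "Person1"),
]


def avg_distinct_users_by_hour(events):
    """Hypothetical helper: average distinct users per hour of day, across days."""
    # Stage 1: distinct users per (date, hour) bucket,
    # analogous to `timechart span=1h dc(USERNAME)`.
    users_per_bucket = defaultdict(set)
    for timestamp, user in events:
        t = datetime.strptime(timestamp, "%m/%d/%Y %H:%M:%S")
        users_per_bucket[(t.date(), t.hour)].add(user)

    # Stage 2: group the per-bucket counts by hour of day and average them,
    # analogous to `eval hour=strftime(_time, "%H") | stats avg(dc) by hour`.
    counts_by_hour = defaultdict(list)
    for (_, hour), users in users_per_bucket.items():
        counts_by_hour[hour].append(len(users))

    return {hour: sum(c) / len(c) for hour, c in sorted(counts_by_hour.items())}


result = avg_distinct_users_by_hour(events)
print(result)  # {12: 1.5}
```

One difference to keep in mind: this sketch only averages over hours that actually contain events. If your timechart output includes rows with a count of 0 for empty hours, those zeros would be part of the average in Splunk and pull it down.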