I'm a Splunk newbie, so I spent some time with the documentation and this forum, learning how to use this powerful tool and managed to construct some very nice searches. My question is as follows:
I have an index with Actions and Users. At first, I wanted to count the number of Actions by User for each day, while also showing what day that was, so this worked fine -
index="my_index" | bucket span=1d _time | stats count as countActions by User, _time, date_wday | sort User, date_wday
Then, I wanted to find the average number of Actions per User per Weekday. The naive approach of -
index="my_index" | bucket span=1d _time | stats count as countActions by User, _time, date_wday | sort User, date_wday | stats avg(countActions) as Weekday_average by User, date_wday
gives incorrect results, since it doesn't account for all those days when there were no Actions. How can I add those days (of zero values) to calculate the correct average?
And another question: can I change the default "1w" definition so that a week will start on Sunday, instead of Monday (for example, for a " | bucket span=1w _time | " clause)?
To produce 0 values I would recommend the following approach:
Step 1: Create a lookup
## expected_weekdays.csv date_wday,count sunday,0 monday,0 tuesday,0 wednesday,0 thursday,0 friday,0 saturday,0 ## transforms.conf [expected_weekday_lookup] filename = expected_weekdays.csv
Step 2. Use the lookup in your search
index="my_index" | bucket span=1d _time | stats count as countActions by User, _time, date_wday | append[| inputlookup append=T expected_weekday_lookup | rename count as countActions] | sort User, date_wday | stats avg(countActions) as Weekday_average by User, date_wday
This doesn't seem to work (I tried several combinations and variations of this).
Since commenting is limited in characters, I'll elaborate in a separate answer.
I'll try to present my problem on the following sample set:
01.05.11, User=A, Action=X
02.05.11, User=A, Action=X
03.05.11, User=A, Action=X
04.05.11, User=B, Action=X
08.05.11, User=A, Action=X
09.05.11, User=A, Action=X
09.05.11, User=B, Action=X
09.05.11, User=A, Action=X
With 01.05.11 being a Sunday, what we have over two weeks time is:
User A - 1 Action on first and second Sunday, 1 Action on first Monday, 1 Action on first Tuesday, 2 Actions on second Monday
User B - 1 Action on first Wednesday, 1 Action on second Monday
The first step of my search (index="my_index" | bucket span=1d _time | stats count as countActions by User, _time, date_wday | sort User, date_wday) calculates the number of Actions for every date per User and assigns the Weekday of that date. Now I want to calculate the averages.
The result (of averages per user per weekday over two weeks) should be:
A Sunday 1
A Monday 1.5
A Tuesday 0.5
A Wednesday 0
A Thursday 0
A Friday 0
A Saturday 0
B Sunday 0
B Monday 0.5
B Tuesday 0
B Wednesday 0.5
B Thursday 0
B Friday 0
B Saturday 0
How can this be done?