If I run the following search for the previous month, the number of days that appears next to Sunday is 8? If I look at the data, I do only see 5 different dates in the events, yet 8 different dates appear in the Time column. For example, an event with the time "20150830 21:47:17" shows "8/31/15 12:00:00.000 AM" in the time column, whereas an event with the time "20150830 22:01:39" shows "8/30/15
12:00:00.000 AM" in the Time column.
source="c:\\logs\\aaaa" | transaction bbbb startswith=("CCCC STARTED") endswith=("CCCC ENDED") |
bin span=1d _time | stats count dc(_time) as days by date_wday | eval average_count = count / days
Remember that _time
is normalized - it considers both the timezone of the host and the timezone that you have set in your user settings. date_wday
is not normalized in any way - it is simply the day of the week that is associated with the raw datetime information in the event. This difference can certainly cause anomalies in your results.
The bin
command sets the _time
timestamp to the start of the day, so all _time
will have a time "12:00:00.000 AM".
I think that a better way to do this would be:
source="c:\\logs\\aaaa"
| transaction bbbb startswith=("CCCC STARTED") endswith=("CCCC ENDED")
| eval day_of_week = strftime(_time,"%A")
| bin span=1d _time
| stats count dc(_time) as days by day_of_week
| eval average_count = count / days
Now all your calculations are based on the normalized timestamp.
Remember that _time
is normalized - it considers both the timezone of the host and the timezone that you have set in your user settings. date_wday
is not normalized in any way - it is simply the day of the week that is associated with the raw datetime information in the event. This difference can certainly cause anomalies in your results.
The bin
command sets the _time
timestamp to the start of the day, so all _time
will have a time "12:00:00.000 AM".
I think that a better way to do this would be:
source="c:\\logs\\aaaa"
| transaction bbbb startswith=("CCCC STARTED") endswith=("CCCC ENDED")
| eval day_of_week = strftime(_time,"%A")
| bin span=1d _time
| stats count dc(_time) as days by day_of_week
| eval average_count = count / days
Now all your calculations are based on the normalized timestamp.
Ah, thanks much. That works fine. My remaining problem is that I added the following line
eventstats avg(average_count)
to get the average number of items per day, but the numbers are skewed because of the inclusion of Saturdays and Sundays which I have been asked to exclude. Guess I should post a separate question for this though.
That's pretty easy, after the first eval, add | where day_of_week != "Saturday" AND day_of_week != "Sunday"
to eliminate those days.