Splunk Search

Why is time column different from date in transaction?

RVDowning
Contributor

If I run the following search for the previous month, the number of days that appears next to Sunday is 8? If I look at the data, I do only see 5 different dates in the events, yet 8 different dates appear in the Time column. For example, an event with the time "20150830 21:47:17" shows "8/31/15 12:00:00.000 AM" in the time column, whereas an event with the time "20150830 22:01:39" shows "8/30/15
12:00:00.000 AM" in the Time column.

source="c:\\logs\\aaaa" | transaction bbbb startswith=("CCCC STARTED") endswith=("CCCC ENDED") | 
bin span=1d _time | stats count dc(_time) as days by date_wday | eval average_count = count / days
0 Karma
1 Solution

lguinn2
Legend

Remember that _time is normalized - it considers both the timezone of the host and the timezone that you have set in your user settings. date_wday is not normalized in any way - it is simply the day of the week that is associated with the raw datetime information in the event. This difference can certainly cause anomalies in your results.

The bin command sets the _time timestamp to the start of the day, so all _time will have a time "12:00:00.000 AM".

I think that a better way to do this would be:

source="c:\\logs\\aaaa" 
| transaction bbbb startswith=("CCCC STARTED") endswith=("CCCC ENDED") 
| eval day_of_week = strftime(_time,"%A")
| bin span=1d _time
| stats count dc(_time) as days by day_of_week
| eval average_count = count / days

Now all your calculations are based on the normalized timestamp.

View solution in original post

0 Karma

lguinn2
Legend

Remember that _time is normalized - it considers both the timezone of the host and the timezone that you have set in your user settings. date_wday is not normalized in any way - it is simply the day of the week that is associated with the raw datetime information in the event. This difference can certainly cause anomalies in your results.

The bin command sets the _time timestamp to the start of the day, so all _time will have a time "12:00:00.000 AM".

I think that a better way to do this would be:

source="c:\\logs\\aaaa" 
| transaction bbbb startswith=("CCCC STARTED") endswith=("CCCC ENDED") 
| eval day_of_week = strftime(_time,"%A")
| bin span=1d _time
| stats count dc(_time) as days by day_of_week
| eval average_count = count / days

Now all your calculations are based on the normalized timestamp.

0 Karma

RVDowning
Contributor

Ah, thanks much. That works fine. My remaining problem is that I added the following line

eventstats avg(average_count)

to get the average number of items per day, but the numbers are skewed because of the inclusion of Saturdays and Sundays which I have been asked to exclude. Guess I should post a separate question for this though.

0 Karma

lguinn2
Legend

That's pretty easy, after the first eval, add | where day_of_week != "Saturday" AND day_of_week != "Sunday"
to eliminate those days.

0 Karma
Get Updates on the Splunk Community!

Introducing Splunk Enterprise 9.2

WATCH HERE! Watch this Tech Talk to learn about the latest features and enhancements shipped in the new Splunk ...

Adoption of RUM and APM at Splunk

    Unleash the power of Splunk Observability   Watch Now In this can't miss Tech Talk! The Splunk Growth ...

Routing logs with Splunk OTel Collector for Kubernetes

The Splunk Distribution of the OpenTelemetry (OTel) Collector is a product that provides a way to ingest ...