I have a source that only contains the time of an event, not the date. It looks something like this:
...
08:26:40 event1
08:26:41 event2
13:59:09 event3
13:59:12 event4
...
The order in the source is not by time but rather grouped by application specifics. When I index this, Splunk correctly recognizes the time based on my TIME_FORMAT, but for each large skip forward, as from event2 to event3, it skips back a day. So event2 gets inserted as April 19th 08:26:41 (correct), but event3 gets inserted as April *18*th 13:59:09. Fiddling with MAX_DIFF_SECS_AGO and MAX_DIFF_SECS_HENCE does not appear to help.
Any ideas?
Edit: I've compared an example file with the events in Splunk. Here's where the timestamp jumps occur:
Event  Source    Splunk
1      18:23:50  18:23:50 April 18th
2      07:16:22  07:16:22 April 17th, jumped back one day
3      07:16:24  07:16:24 April 17th
...
754    08:49:08  08:49:08 April 17th
755    08:26:41  08:26:41 April 17th
756    13:59:09  13:59:09 April 16th, jumped back one day
757    13:59:12  13:59:12 April 16th
...
817    14:15:38  14:15:38 April 16th
818    08:27:35  08:27:35 April 16th, did not jump
...
The jumps don't seem to follow a simple pattern. From event 1 to 2 the time went backwards by about 11 hours, and this caused a jump. From 755 to 756 it went forwards by about 5.5 hours, and it jumped again. From 817 to 818 it went backwards, like 1 to 2, and by more hours than 755 to 756, but there was no jump...
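To make the gaps above concrete, here is a small Python sketch that computes the signed difference between consecutive source times for the three transitions in the table. The function names are made up for illustration, and the script only does the arithmetic; it does not model how Splunk decides on the date.

```python
from datetime import datetime

TIME_FMT = "%H:%M:%S"

def signed_gap_seconds(prev: str, curr: str) -> int:
    """Seconds from prev to curr on the same day (negative means backwards)."""
    delta = datetime.strptime(curr, TIME_FMT) - datetime.strptime(prev, TIME_FMT)
    return int(delta.total_seconds())

def format_gap(seconds: int) -> str:
    """Render a signed number of seconds as +HH:MM:SS or -HH:MM:SS."""
    sign = "-" if seconds < 0 else "+"
    hours, rest = divmod(abs(seconds), 3600)
    minutes, secs = divmod(rest, 60)
    return f"{sign}{hours:02d}:{minutes:02d}:{secs:02d}"

# Transitions taken from the table above, with the observed outcome in Splunk.
transitions = [
    ("1 -> 2", "18:23:50", "07:16:22", "jumped back one day"),
    ("755 -> 756", "08:26:41", "13:59:09", "jumped back one day"),
    ("817 -> 818", "14:15:38", "08:27:35", "did not jump"),
]

for label, prev, curr, outcome in transitions:
    print(label, format_gap(signed_gap_seconds(prev, curr)), outcome)
```

This gives -11:07:28, +05:32:28 and -05:48:03 for the three transitions, so neither the direction nor the size of the gap alone explains which ones jumped.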
Not having the date in the event means there's no guarantee Splunk is going to get the date right. It may also be looking at something else in the event to determine the date.
I suggest you set DATETIME_CONFIG = CURRENT in props.conf and let Splunk assign the timestamp.
Brian
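As a sketch of the suggestion above, the props.conf stanza might look like the following. The sourcetype name `time_only_report` is hypothetical; use whatever sourcetype the data is actually indexed under. With this setting the event timestamp is the time at which Splunk indexes the event, not the time found in the event text.

```ini
# props.conf
# "time_only_report" is a placeholder sourcetype name, not from the original post.
[time_only_report]
DATETIME_CONFIG = CURRENT
```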
I'm having the same issue, except that I have a date at the top of the file and the events themselves are labeled with only a time. It seems that once Splunk has correctly identified the date in a file, it should use that date until another date is given.
When the events jump from hour 00 to hour 05, Splunk changes the date to one day earlier. That is a bit strange, as I don't know of any log that would write dates out of sequence. Here's test data that you can use to reproduce the issue:
18/04/2013 - 00:33:51.828 - LOG OPENED - CHANGE OF DATE
00:33:51.828 Adding dub job 263851 ...
00:33:51.906 Trying to add
00:59:15.281 Adding dub job 263853 ...
00:59:15.359 Trying to add
00:33:51.828 Adding dub job 263851 ...
05:59:15.359 Trying to add
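One way to get the "use that date until another date is given" behaviour is to do it in a preprocessing step before the file reaches Splunk. The Python sketch below is only an illustration under assumptions: it assumes date header lines always start with `DD/MM/YYYY - ` as in the sample above, and the function name `stamp_lines` is made up. It copies the most recent header date onto every time-only line and leaves everything else untouched.

```python
import re
from typing import Iterable, Iterator, Optional

# Assumed layouts, taken from the sample data above.
DATE_HEADER = re.compile(r"^(\d{2}/\d{2}/\d{4}) - ")
TIME_ONLY = re.compile(r"^\d{2}:\d{2}:\d{2}")

def stamp_lines(lines: Iterable[str]) -> Iterator[str]:
    """Prefix time-only lines with the date from the most recent header line."""
    current_date: Optional[str] = None
    for raw in lines:
        line = raw.rstrip("\n")
        header = DATE_HEADER.match(line)
        if header:
            # A new date header: remember it and pass the line through as-is.
            current_date = header.group(1)
            yield line
        elif current_date is not None and TIME_ONLY.match(line):
            yield f"{current_date} - {line}"
        else:
            # No date seen yet, or not an event line: leave unchanged.
            yield line

sample = [
    "18/04/2013 - 00:33:51.828 - LOG OPENED - CHANGE OF DATE",
    "00:33:51.828 Adding dub job 263851 ...",
    "05:59:15.359 Trying to add",
]
for stamped in stamp_lines(sample):
    print(stamped)
```

Every event line then carries a full date and time, so Splunk no longer has to guess the day. The script does not try to detect midnight rollovers; it relies entirely on the header lines.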
As for what kind of log would create events out of sequence... no log at all. This was a prettified-for-viewing report from SAP that had events out of sequence because there was additional grouping. Now we're pulling raw data from underneath, feels much better as well...
Good to hear. I've checked to see if I can do the same, and the answer was no.
Were you ever able to solve this problem? I have the same problem in that I can't include the date in the source of the incoming data.
Thanks... I've solved my issue by simply making the providers of the data create a properly sorted and fully date-/timestamped source 🙂
I've added a bit more sample time data in case someone sees a pattern that I'm missing.
For now the plan is to have a date inserted before pulling the data into Splunk.
I'm making my data provider include the dates; there are too many skips in the time alone.
I'll probably only get a batch of data once a day, so using the index time isn't going to work as long as there is no near-real-time connection. There are jumps back in time in the files as well, but those don't seem to bother Splunk too much...
As for looking at something else in the event, the timestartpos and timeendpos are correct for every event, there are no occasional odd values - are those reliable for making sure nothing else is influencing the timestamp decision?