Today, I noticed that, when performing a basic search, the events are not sorted chronologically. Additionally, not all events 'match up' correctly to the timeline.
I have not found any other posts which document this strange behavior.
With a simple | sort _time, the events sort as expected and correlate to the timeline accurately.
The deployment was upgraded from 7.0.2 to 7.1.2 one week ago.
Here are some screenshots that show the behavior:
Does anyone have any ideas how to fix this issue?
Yeah, we noticed the same thing on our side. Opened a ticket with Splunk and they confirmed that it was a known bug:
This behavior has been reported as a bug in Jira (SPL-154973) and has been fixed in version 7.1.3.
And the official workaround is to explicitly sort the events.
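For anyone landing here, a minimal sketch of that workaround. The index and sourcetype names are placeholders, so substitute your own base search. Note that sort keeps only the first 10,000 results by default; the 0 lifts that limit, and -_time gives newest-first order, which is how an events search is normally displayed.

```
index=my_index sourcetype=my_sourcetype
| sort 0 -_time
```

On a large result set the explicit sort adds some cost, so it is probably only worth keeping until the deployment is on a fixed version.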
Incidentally (and I'm not sure if this is a symptom of the same issue), I've actually had the search results show the same events multiple times (as though they were different events). As before, explicit sorting fixes this.
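One way to check whether the repeated events are really in the index more than once, or are only being shown more than once, is to count identical events. This is a rough sketch, not a definitive test: the index and sourcetype names are placeholders, raw_hash is just a field name made up for this example, and grouping by _time, host, source and a hash of _raw is my own assumption about what counts as "the same event".

```
index=my_index sourcetype=my_sourcetype
| eval raw_hash=md5(_raw)
| stats count by _time, host, source, raw_hash
| where count > 1
```

If this returns nothing over a time range where the event list showed repeats, the duplication is more likely a display or ordering problem than duplicate indexing.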
Earlier this morning I noticed other, similar aberrant behavior: events show up multiple times when zooming into different slices of the timeline. I don't know whether it's a symptom of the same issue either, but it seems likely. I will accept your answer when things get straightened out. Thank you!
I can't edit my own post... Correction: The version was upgraded to 7.1.2.
It's certainly a little strange. In general, if you have searchterms | <some transforming command>, the events going into the transforming command are not actually guaranteed to be in time order. (Yes, it used to be true long, long ago, but with distsearch, search-in-separate-process, and the various parallel bucket things they did, it's no longer always true.)
HOWEVER, why am I talking about transforming commands? You're seeing this happen in a simple events search. Yes, I am surprised.
I suspect that it's something you don't normally see unless the timestamps on the events are a little different from the actual wall-clock time when they come into the system. Is there anything else notable about those events whose timestamps are off from the others?
I should also mention that Transparent Huge Page memory management was disabled one day ago on all (Linux) hosts across the cluster.
The timestamps are coming in as Unix Epoch time and are extracted correctly. I've checked the difference between the _indextime and _time and there are no events for which the skew is greater than 4-5 seconds. I'd be glad to share a couple of examples of the data that is being ingested and the related props configuration if you think that might shed some light on the matter. Thanks!
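For reference, the skew check was roughly along these lines. The index and sourcetype names are placeholders, and lag_seconds, min_lag, max_lag and avg_lag are just names chosen for this sketch.

```
index=my_index sourcetype=my_sourcetype
| eval lag_seconds = _indextime - _time
| stats min(lag_seconds) AS min_lag, max(lag_seconds) AS max_lag, avg(lag_seconds) AS avg_lag
```

A negative min_lag would mean some events carry timestamps later than the time they were indexed, which would be worth a closer look.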
I should mention that the timestamps resolve down to milliseconds...
