I have a number of events, received from bluecoat proxies, in which the _indextime field is earlier than the _time field.
Clearly something cannot be indexed before it has even occurred (I haven't installed the 'crystal ball' app).
The bluecoat proxies are running GMT (the timezone is visible in the raw event) and all servers in our Splunk system are running BST (GMT+1), synced via NTP. I've confirmed the time settings on all of them. I'm aware that Splunk Web automatically adjusts displayed times based on user profile settings, and I've confirmed that mine is set to GMT.
The logs are received on a heavy forwarder which is using a file monitor.
The files arrive on the heavy forwarder every 15 minutes.
I've checked the timestamps on some of the files which contain the events. These are as expected.
I've also noticed that for these seemingly impossible events, 95% of them seem to come from a single indexer (we have four).
We don't appear to be replicating the data as the same search returns gaps in the data if we exclude that single indexer.
We haven't set manual _indextime stamps for this sourcetype.
Btool shows no issues that I can see with the configs on this single indexer.
The problem is sporadic; it doesn't happen all the time.
There is no pattern to the timings of the events. They do not appear at times when that indexer is under load from indexing or from retrieving search data.
Here's a simplified version of the search I'm using to identify these events:
index=bluecoat sourcetype=bluecoat_G | eval difference=(_indextime - _time)/60 | eval indextime=strftime(_indextime,"%Y-%m-%d %H:%M:%S") | bucket _time span=1h | stats max(indextime) min(indextime) max(difference) min(difference) by _time splunk_server
I suspect this is causing issues for our summarization data (which a lot of our alerts are based on).
The _indextime is when the event was indexed. The _time is the datetime stamp parsed from the data. Your bluecoat events are ending up with datetime stamps that are in the future.
Check the timezone settings on your sourcetypes, your NTP settings, etc.
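You could also filter on index time by adding _index_earliest / _index_latest (for example _index_latest=-1h@h) to your searches. To pick up events stamped up to 1 hour in the future, the search window itself has to extend forward, e.g. latest=+1h.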
Ideally, everything would be set to the same time zone, but that isn't necessary so long as you tell Splunk what timezone each host, source, or sourcetype is in. You can do that by putting a props.conf on the HF that sets the timezone for that host/source/sourcetype:
[bluecoat_G]
TZ = UTC
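To see why the TZ setting matters, here is a rough Python sketch of the arithmetic. It is not Splunk code and not how Splunk parses timestamps internally; the timestamps, the fixed +1h offset standing in for BST, and the helper names parse_epoch and lag_seconds are all made up for illustration. It only shows that the same wall-clock string gives a different epoch depending on which zone is assumed, and that the sign of _indextime - _time depends on the direction of the mismatch.

```python
from datetime import datetime, timedelta, timezone

GMT = timezone.utc
# Fixed +1h offset standing in for BST (assumption; ignores DST rules).
BST = timezone(timedelta(hours=1), "BST")

def parse_epoch(raw_ts: str, assumed_tz: timezone) -> int:
    """Read a zone-less wall-clock string as if it were in assumed_tz."""
    naive = datetime.strptime(raw_ts, "%Y-%m-%d %H:%M:%S")
    return int(naive.replace(tzinfo=assumed_tz).timestamp())

def lag_seconds(raw_ts: str, assumed_tz: timezone, index_epoch: int) -> int:
    """Equivalent of _indextime - _time for one event, in seconds."""
    return index_epoch - parse_epoch(raw_ts, assumed_tz)

# Hypothetical event indexed at 12:05 GMT.
index_epoch = parse_epoch("2015-07-01 12:05:00", GMT)

# Case 1: the log line was written in GMT ("12:00:00").
gmt_read_as_gmt = lag_seconds("2015-07-01 12:00:00", GMT, index_epoch)
gmt_read_as_bst = lag_seconds("2015-07-01 12:00:00", BST, index_epoch)

# Case 2: the log line was written in BST ("13:00:00" is the same instant).
bst_read_as_bst = lag_seconds("2015-07-01 13:00:00", BST, index_epoch)
bst_read_as_gmt = lag_seconds("2015-07-01 13:00:00", GMT, index_epoch)

print(gmt_read_as_gmt)  # 300   (correct: indexed 5 minutes after the event)
print(gmt_read_as_bst)  # 3900  (_time pushed an hour into the past)
print(bst_read_as_bst)  # 300   (correct)
print(bst_read_as_gmt)  # -3300 (_time in the future, so _indextime < _time)
```

A negative lag only shows up in the last case, where the timestamp is read in a zone that is behind the one it was written in. So it is worth checking which zone the affected events are actually being parsed in on that one indexer, not just which zone the proxies log in.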