Hi all,
I noticed something very interesting and can't find the smoking gun. Today one of my users alerted me that he was seeing duplicate events, most recently today at 2:20pm. I took a look, and sure enough an event had been duplicated. I narrowed it down to the following search:
index=app_harmony sourcetype=harmony:*:access "[04/Jan/2017:14:20*" host=cimasked0047 | eval idxtime=_indextime | table _time idxtime host source splunk_server _raw
The results were as follows:
_time idxtime host source splunk_server _raw
2017-01-04 14:20:02 1483557602 hostname /usr/local/openresty/nginx/logs/access.log splunkindex0006 xx.xx.xxx.xxx - - [04/Jan/2017:14:20:02 -0500] "GET /api/harmony/v1/User/G_IAS_6e680df53be6a06f9e11faf40812dc8c?domain=internal HTTP/1.1" 200 317 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.95 Safari/537.36" "-"
2017-01-04 14:20:02 1483557602 hostname /usr/local/openresty/nginx/logs/access.log splunkindex0009 xx.xx.xx.xxx - - [04/Jan/2017:14:20:02 -0500] "GET /api/harmony/v1/User/G_IAS_6e680df53be6a06f9e11faf40812dc8c?domain=internal HTTP/1.1" 200 317 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.95 Safari/537.36" "-"
The raw event is exactly the same, as is the index time. The only difference between the two events is the indexer that holds them. We have a cluster of 5 indexers with an RF of 3.
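For anyone wanting to quantify this, a search along these lines (a sketch; adjust the index, sourcetype, and time range to taste) should surface every raw event that was indexed on more than one indexer:

```
index=app_harmony sourcetype=harmony:*:access
| stats dc(splunk_server) AS indexer_count values(splunk_server) AS indexers BY _raw _indextime
| where indexer_count > 1
```

With replication in play, searches normally return only the primary copy of each bucket, so a result here points at genuinely duplicated events rather than replicated copies.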
I double-checked the file, and there isn't more than one event at that time.
I also verified that I don't see any stray '.filepart' or temp files appearing.
The input for that particular file is pretty strict, as follows:
[monitor:///usr/local/openresty/nginx/logs/access.log]
index = app_harmony
sourcetype = harmony:openresty:access
I took a look at the internal logs, and didn't see any errors around that time.
Actually, you have an unnecessary second tcpout stanza in there. I don't know if that's the problem, but you should remove one of them.
Do you use a master server? If so, change your tcpout stanzas to something like this (plus whatever other settings you'd like to have):
[tcpout]
defaultGroup = companyPVSNewIndexers
indexAndForward = false

[tcpout:companyPVSNewIndexers]
indexerDiscovery = YourIndexerCluster
# additional settings are required here when SSL is enabled

[indexer_discovery:YourIndexerCluster]
pass4SymmKey = YourKey  # defined on the master
master_uri = https://YourMasterServer:8089
If you don't use a master (I don't think that's recommended), just merge both tcpout stanzas into one. I'm not sure about the autoLB settings, though.
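For the no-master case, the merged stanza would look roughly like this (a sketch; the group name, server list, and port are placeholders, and useACK matches the "using ACK" lines in your forwarder log):

```
[tcpout]
defaultGroup = companyPVSNewIndexers

[tcpout:companyPVSNewIndexers]
server = indexer1:9997, indexer2:9997, indexer3:9997, indexer4:9997, indexer5:9997
autoLBFrequency = 30
useACK = true
```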
Edit: typo
Does the log roll at this particular time? Do you see any mention of a CRC error in the splunkd.log of the forwarder, or a mention of the file being re-read?
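If the forwarder ships its internal logs, something like this can surface those messages (a sketch; the filter terms are assumptions about what a CRC/re-read message would contain):

```
index=_internal host=cimasked0047 source=*splunkd.log* (component=WatchedFile OR "crc" OR "checksum")
| table _time component log_level _raw
```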
The file doesn't roll at all at the moment, not since December at least.
As far as messages in the log go, there's nothing other than the typical:
01-04-2017 14:19:49.884 -0500 INFO TcpOutputProc - Connected to idx=****:9997 using ACK.
01-04-2017 14:20:29.781 -0500 INFO TcpOutputProc - Closing stream for idx=****:9997
01-04-2017 14:20:29.781 -0500 INFO TcpOutputProc - Connected to idx=****.108:9997 using ACK.
01-04-2017 14:20:41.640 -0500 INFO HttpPubSubConnection - Running phone uri=/services/broker/phonehome/connection_****_8089_****_****_268231C9-FA74-4B0E-8BE7-3A6C4AD83F2E
01-04-2017 14:21:09.647 -0500 INFO TcpOutputProc - Closing stream for idx=****:9997
01-04-2017 14:21:09.647 -0500 INFO TcpOutputProc - Connected to idx=****:9997 using ACK.
01-04-2017 14:21:41.645 -0500 INFO HttpPubSubConnection - Running phone uri=/services/broker/phonehome/connection_****_8089_****_****_268231C9-FA74-4B0E-8BE7-3A6C4AD83F2E
01-04-2017 14:21:49.498 -0500 INFO TcpOutputProc - Closing stream for idx=****:9997
01-04-2017 14:21:49.498 -0500 INFO TcpOutputProc - Connected to idx=****:9997 using ACK.