Getting Data In
Highlighted

Why Was My Event Indexed Twice? Same Host, Same File, Same _raw, Same Indextime, different splunk_server

Builder

Hi All;

Noticed something very interesting and I don't seem to find the smoking gun. Today I was alerted by one of my users that he saw some duplicate events, more recently today at 2:20pm. I took a look, and sure enough I was able to find that an event was duplicated. I narrowed the result down to the following search:

index=app_harmony sourcetype=harmony:\*:access "[04/Jan/2017:14:20*" host=cimasked0047 | eval idxtime=_indextime | table _time idxtime host source splunk_server _raw

The results were as follows:

_time   idxtime host    source  splunk_server   _raw
2017-01-04 14:20:02 1483557602  hostname    /usr/local/openresty/nginx/logs/access.log  splunkindex0006 xx.xx.xxx.xxx - - [04/Jan/2017:14:20:02 -0500] "GET /api/harmony/v1/User/G_IAS_6e680df53be6a06f9e11faf40812dc8c?domain=internal HTTP/1.1" 200 317 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.95 Safari/537.36" "-"

2017-01-04 14:20:02 1483557602  hostname    /usr/local/openresty/nginx/logs/access.log  splunkindex0009 xx.xx.xx.xxx - - [04/Jan/2017:14:20:02 -0500] "GET /api/harmony/v1/User/G_IAS_6e680df53be6a06f9e11faf40812dc8c?domain=internal HTTP/1.1" 200 317 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.95 Safari/537.36" "-"

The raw event is exactly the same, as is the index time. The only difference between the two events is our indexer. We have a cluster of 5 indexers with a RF of 3.

I double checked the file, and there isn't more than 1 event at that time.

I also verified that I dont see any random '.filepart' or temp files or anything appearing

the input for that particular file is pretty strict and as follows:

[monitor:///usr/local/openresty/nginx/logs/access.log]
index = app_harmony
sourcetype = harmony:openresty:access

I took a look at the internal logs, and didn't see any errors around that time.

Highlighted

Re: Why Was My Event Indexed Twice? Same Host, Same File, Same _raw, Same Indextime, different splunk_server

Splunk Employee
Splunk Employee

Does the forwarder have a useACK=true in the outputs.conf?

One of the known side effects of this setting can be data duplication in rare cases.

Search for this message in the _internal splunkd.log on the forwarder:

WARN TcpOutputProc - Possible duplication of events with

0 Karma
Highlighted

Re: Why Was My Event Indexed Twice? Same Host, Same File, Same _raw, Same Indextime, different splunk_server

Builder

Thanks for the quick response. I checked for that yesterday around the time of the duplication, and didn't see anything. Also tried a week long search to verify.

0 Karma
Highlighted

Re: Why Was My Event Indexed Twice? Same Host, Same File, Same _raw, Same Indextime, different splunk_server

SplunkTrust
SplunkTrust

Does the log roll at this particular time ? Do you see any mentions of CRC error in the splunkd.log of the forwarder and a mention of re-reading the file ?

0 Karma
Highlighted

Re: Why Was My Event Indexed Twice? Same Host, Same File, Same _raw, Same Indextime, different splunk_server

Builder

The file doesn't roll at all at the moment, not since december at least.

As far as any messages in the log, nothing other than the typical:

01-04-2017 14:19:49.884 -0500 INFO  TcpOutputProc - Connected to idx=****:9997 using ACK.
01-04-2017 14:20:29.781 -0500 INFO  TcpOutputProc - Closing stream for idx=****:9997
01-04-2017 14:20:29.781 -0500 INFO  TcpOutputProc - Connected to idx=****.108:9997 using ACK.
01-04-2017 14:20:41.640 -0500 INFO  HttpPubSubConnection - Running phone uri=/services/broker/phonehome/connection_****_8089_****_****_268231C9-FA74-4B0E-8BE7-3A6C4AD83F2E
01-04-2017 14:21:09.647 -0500 INFO  TcpOutputProc - Closing stream for idx=****:9997
01-04-2017 14:21:09.647 -0500 INFO  TcpOutputProc - Connected to idx=****:9997 using ACK.
01-04-2017 14:21:41.645 -0500 INFO  HttpPubSubConnection - Running phone uri=/services/broker/phonehome/connection_****_8089_****_****_268231C9-FA74-4B0E-8BE7-3A6C4AD83F2E
01-04-2017 14:21:49.498 -0500 INFO  TcpOutputProc - Closing stream for idx=****:9997
01-04-2017 14:21:49.498 -0500 INFO  TcpOutputProc - Connected to idx=****:9997 using ACK.
0 Karma
Highlighted

Re: Why Was My Event Indexed Twice? Same Host, Same File, Same _raw, Same Indextime, different splunk_server

Builder

I took another look and did notice that I do uave 'useACK' as true. This is a default on all of our forwarders:

[tcpout]
defaultGroup = companyPVSNewIndexers
# These two options below are required for forwarders when clustering.
# Max queue size ensures that the forwarder has enough of a buffer while
# waiting for the ACK from the indexer; without useACK, the search head
# will spout yellow warning banners in a clustered environment.
maxQueueSize = 7MB
useACK = true

Should I try running without useACK for this one forwarder?

0 Karma
Highlighted

Re: Why Was My Event Indexed Twice? Same Host, Same File, Same _raw, Same Indextime, different splunk_server

SplunkTrust
SplunkTrust

maybe it's worth checking the setting for the servers in outputs.conf as well, because if they are wrong this can end in duplicated events as well.

0 Karma
Highlighted

Re: Why Was My Event Indexed Twice? Same Host, Same File, Same _raw, Same Indextime, different splunk_server

Builder

So here is the whole outputs file. To save space on this post I have removed all the settings that have been commented out.

Company BASE SETTINGS

[tcpout]
defaultGroup = companyPVSNewIndexers
# These two options below are required for forwarders when clustering.
# Max queue size ensures that the forwarder has enough of a buffer while
# waiting for the ACK from the indexer; without useACK, the search head
# will spout yellow warning banners in a clustered environment.
maxQueueSize = 7MB
useACK = true

[tcpout]
defaultGroup = companyPVSNewIndexers
indexAndForward = false

# When indexing a large continuous file that grows very large, a universal
# or light forwarder may become "stuck" on one indexer, trying to reach
# EOF before being able to switch to another indexer. The symptoms of this
# are congestion on *one* indexer in the pool while others seem idle, and
# possibly uneven loading of the disk usage for the target index.
# In this instance, forceTimebasedAutoLB can help!
# ** Do not enable if you have events > 64kB **
forceTimebasedAutoLB = true

# Correct an issue with the default outputs.conf for the Universal Forwarder
# or the SplunkLightForwarder app; these don't forward _internal events.
forwardedindex.2.whitelist = (_audit|_introspection|_internal)

[tcpout:companyPVSNewIndexers]
server = masked6.com:9997, masked7.com:9997, masked8.com:9997, masked9.com:9997, masked10.com:9997
autoLB = true
autoLBFrequency = 40

I checked the servers, and they are correct

0 Karma
Highlighted

Re: Why Was My Event Indexed Twice? Same Host, Same File, Same _raw, Same Indextime, different splunk_server

Builder

FYI I have tried running the forwarder with useACK=false with no success. Any other ideas would be greatly appreciated as this is still happening for the index 😞

0 Karma
Highlighted

Re: Why Was My Event Indexed Twice? Same Host, Same File, Same _raw, Same Indextime, different splunk_server

SplunkTrust
SplunkTrust

So just to confirm, same source, sourcetype, index and host is producing duplicate log entries?

If you count the number of lines per hour or similar, does that add up to more than the number of lines within the source file on the server ?

If it is consistently duplicated then it would be worth checking the configuration of your outputs.conf ...

0 Karma