Hi All;
Noticed something very interesting, and I can't seem to find the smoking gun. Today I was alerted by one of my users that he saw some duplicate events, most recently at 2:20 PM today. I took a look, and sure enough I was able to find a duplicated event. I narrowed it down with the following search:
index=app_harmony sourcetype=harmony:*:access "[04/Jan/2017:14:20*" host=cimasked0047 | eval idxtime=_indextime | table _time idxtime host source splunk_server _raw
The results were as follows:
_time idxtime host source splunk_server _raw
2017-01-04 14:20:02 1483557602 hostname /usr/local/openresty/nginx/logs/access.log splunkindex0006 xx.xx.xxx.xxx - - [04/Jan/2017:14:20:02 -0500] "GET /api/harmony/v1/User/G_IAS_6e680df53be6a06f9e11faf40812dc8c?domain=internal HTTP/1.1" 200 317 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.95 Safari/537.36" "-"
2017-01-04 14:20:02 1483557602 hostname /usr/local/openresty/nginx/logs/access.log splunkindex0009 xx.xx.xx.xxx - - [04/Jan/2017:14:20:02 -0500] "GET /api/harmony/v1/User/G_IAS_6e680df53be6a06f9e11faf40812dc8c?domain=internal HTTP/1.1" 200 317 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.95 Safari/537.36" "-"
The raw event is exactly the same, as is the index time. The only difference between the two events is the indexer. We have a cluster of 5 indexers with an RF of 3.
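One way to confirm these are true duplicate indexed copies (rather than a search-time artifact) is to group identical events and count distinct indexers. A sketch, using the same fields as the table above:

```
index=app_harmony sourcetype=harmony:*:access host=cimasked0047 "[04/Jan/2017:14:20*"
| stats count dc(splunk_server) AS num_indexers values(splunk_server) AS indexers BY _raw _indextime
| where count > 1
```

If the same _raw with the same _indextime shows up with num_indexers > 1, two separate copies were indexed as searchable primaries, which points at the forwarding layer rather than at replication or search.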
I double checked the file, and there isn't more than 1 event at that time.
I also verified that I don't see any stray '.filepart' or temp files appearing.
The input for that particular file is pretty strict, as follows:
[monitor:///usr/local/openresty/nginx/logs/access.log]
index = app_harmony
sourcetype = harmony:openresty:access
I took a look at the internal logs, and didn't see any errors around that time.
Actually, you have an unnecessary second tcpout stanza in there. I don't know if that's the problem, but you should remove one of them.
Do you use a master server? If so, change your tcpout stanzas to something like this (plus the other settings you'd like to have):
[tcpout]
defaultGroup = companyPVSNewIndexers
indexAndForward = false

[tcpout:companyPVSNewIndexers]
indexerDiscovery = YourIndexerCluster
# additional SSL settings are required here when SSL is enabled

[indexer_discovery:YourIndexerCluster]
# pass4SymmKey must match the key defined on the master
pass4SymmKey = YourKey
master_uri = https://YourMasterServer:8089
If you don't use a master (which I don't think is recommended), just merge both tcpout stanzas into one. I'm not sure about the autoLB settings, though.
Edit: typo
Oh, very interesting! We do have a cluster master. We actually had a consultant set up our outputs.conf that way. I wonder why he did it that way instead of relying on the CM to delegate the indexers.
I think that was the issue: the duplicate stanza! Thanks @skalliger for pointing that out! I sent a note to the admins to promote this as the correct answer.
Glad it helped you! Just wondering why you reported my comment though, haha.
Because you hurt my feelings with your answer. Lol, no, I reported it as 'other' to get the mods' attention so they'd promote it to an 'accepted answer'.
Ah, alright. I get way too many Splunk Answers notifications, haha. I converted it. 🙂
You rock, thanks again.
I've had a number of issues with indexer discovery, and I prefer not to use it now!
Your duplicate tcpout stanza might be an issue, though... that's an easy fix.
FYI, I have tried running the forwarder with useACK=false, with no success. Any other ideas would be greatly appreciated, as this is still happening for the index 😞
So just to confirm: the same source, sourcetype, index, and host is producing duplicate log entries?
If you count the number of lines per hour (or similar), does that add up to more than the number of lines in the source file on the server?
If it is consistently duplicated, then it would be worth checking the configuration of your outputs.conf...
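For the file-side count, note that `grep -c` counts matching lines (while `grep -o ... | wc -l` counts every occurrence, which can overcount). A minimal sketch with a hypothetical sample file standing in for the real access.log:

```shell
# Hypothetical sample log; substitute your real access.log path and timestamp prefix.
printf '%s\n' \
  'x.x.x.x - - [13/Jan/2017:08:00:01 -0500] "GET /a HTTP/1.1" 200 10' \
  'x.x.x.x - - [13/Jan/2017:08:59:59 -0500] "GET /b HTTP/1.1" 200 10' \
  'x.x.x.x - - [13/Jan/2017:09:00:00 -0500] "GET /c HTTP/1.1" 200 10' \
  > /tmp/access_sample.log

# -c reports the number of matching lines, which maps one-to-one to events
grep -c '13/Jan/2017:08:' /tmp/access_sample.log   # 2 lines match
```

That line count is what should be compared against the Splunk event count for the same hour.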
So here is what I got:
In Splunk: index=app_harmony source=/usr/local/openresty/nginx/logs/access.log "13/Jan/2017:08:"
events = 4678
Grepping the actual log file: less access.log | grep -o '13/Jan/2017:08' | wc -l
lines = 1080
If you run a dedup in Splunk like so:
index=app_harmony source=/usr/local/openresty/nginx/logs/access.log "13/Jan/2017:08:" | dedup _raw
Do you see far fewer events? The numbers are strange; you have well over double the number of events in Splunk. Also, I notice you did not narrow down to a host. Are there multiple hosts that this log comes from, or just one?
Ah jeez!! You are 100% correct, sorry; retried:
[splkadmn@cmasked0083 logs]$ less access.log | grep "13/Jan/2017:08:" | wc -l
1140
index=app_harmony host=cmasked0083 source=/usr/local/openresty/nginx/logs/access.log "13/Jan/2017:08:"
1,959 events
index=app_harmony host=cmasked0083 source=/usr/local/openresty/nginx/logs/access.log "13/Jan/2017:08:" | dedup _raw
1,140 events
Interesting. During the time period where you have the duplicates, are there any warning/error messages from splunkd? CRC checksum or duplication errors in particular...
I wish there were, but nothing in the splunkd on those forwarders 😞
Does the forwarder have useACK=true in its outputs.conf?
One of the known side effects of this setting can be data duplication in rare cases.
Search for this message in the _internal splunkd.log on the forwarder:
WARN TcpOutputProc - Possible duplication of events with
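From a search head, the same warning can be checked for across _internal. A sketch; substitute your actual forwarder host name for the placeholder:

```
index=_internal sourcetype=splunkd log_level=WARN host=<your_forwarder>
    "Possible duplication of events"
```

Widening the time range here (e.g. the full week) helps rule out duplication warnings that happened outside the window you first looked at.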
I took another look and did notice that I do have 'useACK' set to true. This is a default on all of our forwarders:
[tcpout]
defaultGroup = companyPVSNewIndexers
# These two options below are required for forwarders when clustering.
# Max queue size ensures that the forwarder has enough of a buffer while
# waiting for the ACK from the indexer; without useACK, the search head
# will spout yellow warning banners in a clustered environment.
maxQueueSize = 7MB
useACK = true
Should I try running without useACK for this one forwarder?
Maybe it's also worth checking the server settings in outputs.conf, because if they are wrong, this can end in duplicated events as well.
So here is the whole outputs file. To save space in this post, I have removed all the settings that were commented out.
# Company BASE SETTINGS
[tcpout]
defaultGroup = companyPVSNewIndexers
# These two options below are required for forwarders when clustering.
# Max queue size ensures that the forwarder has enough of a buffer while
# waiting for the ACK from the indexer; without useACK, the search head
# will spout yellow warning banners in a clustered environment.
maxQueueSize = 7MB
useACK = true
[tcpout]
defaultGroup = companyPVSNewIndexers
indexAndForward = false
# When indexing a large continuous file that grows very large, a universal
# or light forwarder may become "stuck" on one indexer, trying to reach
# EOF before being able to switch to another indexer. The symptoms of this
# are congestion on *one* indexer in the pool while others seem idle, and
# possibly uneven loading of the disk usage for the target index.
# In this instance, forceTimebasedAutoLB can help!
# ** Do not enable if you have events > 64kB **
forceTimebasedAutoLB = true
# Correct an issue with the default outputs.conf for the Universal Forwarder
# or the SplunkLightForwarder app; these don't forward _internal events.
forwardedindex.2.whitelist = (_audit|_introspection|_internal)
[tcpout:companyPVSNewIndexers]
server = masked6.com:9997, masked7.com:9997, masked8.com:9997, masked9.com:9997, masked10.com:9997
autoLB = true
autoLBFrequency = 40
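Since the file above has two [tcpout] stanzas (the likely culprit discussed earlier in the thread), a merged single-stanza version might look like this. A sketch only, carrying over just the values already shown above:

```
[tcpout]
defaultGroup = companyPVSNewIndexers
indexAndForward = false
maxQueueSize = 7MB
useACK = true
forceTimebasedAutoLB = true
forwardedindex.2.whitelist = (_audit|_introspection|_internal)

[tcpout:companyPVSNewIndexers]
server = masked6.com:9997, masked7.com:9997, masked8.com:9997, masked9.com:9997, masked10.com:9997
autoLB = true
autoLBFrequency = 40
```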
I checked the servers, and they are correct.
Thanks for the quick response. I checked for that yesterday around the time of the duplication and didn't see anything. I also tried a week-long search to verify.