I'm using Splunk to analyze Bro network transaction data in JSON format. I noticed that the stats command and the field summary show a count of 2 for unique session IDs, although the search results show only one event. After a lot of verification, I'm certain my event source does not contain duplicate events.
Thanks to this post: https://answers.splunk.com/answers/223095/why-is-my-sourcetype-configuration-for-json-events.html, I started messing with my JSON settings in props.conf. I thought this would be my fix, but I found the opposite scenario to be true for me...
In short, I'm seeing that index-time JSON field extractions result in duplicate field values, whereas search-time JSON field extractions do not.
In props.conf, this produces duplicate values, visible in stats command and field summaries:
INDEXED_EXTRACTIONS=JSON
KV_MODE=none
AUTO_KV_JSON=false
If I disable indexed extractions and use search-time extractions instead, no more duplicate field values:
#INDEXED_EXTRACTIONS=JSON
KV_MODE=json
AUTO_KV_JSON=true
From what I can tell this behavior is different than what others reported in earlier posts. I'm running Splunk 6.6.2 Enterprise on a Debian VM and a 6.6.2 Universal Forwarder on another VM. Maybe there is a deployment client configuration I have wrong somewhere that is causing weird behavior for index-time extractions but no luck so far.
Using search-time extractions seems to work fine, but wondering if anyone is seeing this or if there are any ideas on root cause.
It comes down to WHERE you make these changes. If you use INDEXED_EXTRACTIONS, that props.conf needs to be on the UF (Universal Forwarder VM), and KV_MODE=none needs to be on the Search Head (aka your Splunk Enterprise VM).
From what I read above, setting INDEXED_EXTRACTIONS and disabling KV_MODE=json should work.
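As a rough sketch of the split described above, the two props.conf files might look like the following. The stanza name `[bro_json]` is a hypothetical sourcetype name used only for illustration; substitute your own sourcetype and app locations.

```ini
# --- props.conf on the Universal Forwarder ---
# [bro_json] is a hypothetical sourcetype name (an assumption, not from the thread)
[bro_json]
# Structured-data parsing happens on the forwarder, so the index-time setting lives here
INDEXED_EXTRACTIONS = JSON

# --- props.conf on the Search Head ---
[bro_json]
# Turn off search-time JSON extraction so fields are not extracted a second time
KV_MODE = none
AUTO_KV_JSON = false
```

With only the forwarder half in place, fields are extracted once at index time and again at search time, which would match the duplicate values seen in stats and the field summary.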
Where did you disable the KV_MODE configs?
I think you nailed it. The props.conf file I'm modifying in this case belongs to a deployment app that's getting pushed to the UF, none of which is going to the Search Head. I see I need to split these props settings up accordingly. I'll give that a try. Thanks for the help and quick reply.
Awesome, I have converted the comment to an answer. Let me know if it works!
Yep, that worked perfectly. Oversight on my part, just needed to put things in the right place.
I cannot get this to work for the life of me. I can get the JSON events to index only once if I upload the file and select the sourcetype. If I set it up as a monitor input for the same sourcetype and the same files, I get duplicate events. Initially I was getting duplicate events (the same event listed twice) and duplicate field extractions (1 field, 2 identical values). Adding INDEXED_EXTRACTIONS = JSON seemed to fix the duplicate field extractions.
It's a single-server install on my local machine, and I have tried creating the props.conf entry below in both C:\Program Files\Splunk\etc\system\local and C:\Program Files\Splunk\etc\apps\INSERTAPPNAMEHERE\local, and no dice.
INDEXED_EXTRACTIONS = JSON
TIMESTAMP_FIELDS = properties.LastUpdateTime
TZ = UTC
AUTO_KV_JSON = false
KV_MODE = none
SHOULD_LINEMERGE = false
category = Custom
description = PicklesNFish
disabled = false
pulldown_type = true
Is there some secret sauce to this I'm missing? It just straight up ignores the KV_MODE settings and is still indexing my entities twice.
Any direction you could provide would be ultra awesome and greatly appreciated!
I have apparently done something horrible to my local install. I brought up a new host and your solution works great.
Hi @mmodestino,
Removing INDEXED_EXTRACTIONS = json from the props.conf on the UF fixed the issue of duplicates. But it introduced another issue: sometimes a few JSON event lines are missing.
KV_MODE = none
NO_BINARY_CHECK = true
TIMESTAMP_FIELDS = requests.Time
category = Structured
disabled = false
pulldown_type = true
Any idea how to fix this issue?