Why would INDEXED_EXTRACTIONS=JSON in props.conf be resulting in duplicate values?

pumphreyaw
Explorer

I'm using Splunk to analyze Bro network transaction data in JSON format. I noticed that the stats command and the field summary show a count of 2 for unique session IDs, although the search results show only one event. After a lot of verification, I'm certain my event source does not contain duplicate events.

Thanks to this post: https://answers.splunk.com/answers/223095/why-is-my-sourcetype-configuration-for-json-events.html, I started messing with my JSON settings in props.conf. I thought this would be my fix, but I found the opposite scenario to be true for me...

In short, I'm seeing that index-time JSON field extractions result in duplicate field values, whereas search-time JSON field extractions do not.

In props.conf, this produces duplicate values, visible in stats command and field summaries:

INDEXED_EXTRACTIONS=JSON
KV_MODE=none
AUTO_KV_JSON=false

If I disable indexed extractions and use search-time extractions instead, no more duplicate field values:

#INDEXED_EXTRACTIONS=JSON
KV_MODE=json
AUTO_KV_JSON=true  

From what I can tell this behavior is different than what others reported in earlier posts. I'm running Splunk 6.6.2 Enterprise on a Debian VM and a 6.6.2 Universal Forwarder on another VM. Maybe there is a deployment client configuration I have wrong somewhere that is causing weird behavior for index-time extractions but no luck so far.

Using search-time extractions seems to work fine, but I'm wondering if anyone else is seeing this or has any ideas on the root cause.

Thanks.

1 Solution

mattymo
Splunk Employee

Hey pumphreyaw!

It comes down to WHERE you make these changes. If you use INDEXED_EXTRACTIONS, that props.conf setting needs to be on the UF (the Universal Forwarder VM), and KV_MODE=none needs to be on the Search Head (aka your Splunk Enterprise VM).

From what I read above, setting INDEXED_EXTRACTIONS and disabling search-time JSON extraction (KV_MODE=none instead of KV_MODE=json) should work.

Where did you disable the KV_MODE configs?
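
A minimal sketch of that split, assuming a hypothetical sourcetype stanza named [bro_json] (the thread never names the actual sourcetype) and that each fragment lives in a props.conf deployed to the instance named in its comment:

```ini
# props.conf on the Universal Forwarder (the instance that reads the JSON files)
# [bro_json] is a placeholder stanza name, not taken from the original post.
[bro_json]
INDEXED_EXTRACTIONS = JSON

# props.conf on the Search Head (a separate file on a separate instance)
# Turns off search-time JSON extraction so the fields are not extracted a second time.
[bro_json]
KV_MODE = none
AUTO_KV_JSON = false
```

If the search-time settings never reach the Search Head, search-time JSON extraction stays active there and the same fields get extracted again, which would match the duplicate values described in the question.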

- MattyMo

joesrepsolc
Communicator

Does an easy-to-read list exist of WHERE to use each of these options in props.conf? I run into this from time to time and it's not 100% clear to me WHERE they need to go.

Sometimes this reference (https://docs.splunk.com/Documentation/Splunk/7.2.4/Admin/Propsconf) clearly says "input time", but other times it doesn't, and then I'm not sure what that means.

Any help would be GREAT!!!

mallempati
New Member

Hi @mmodestino,

Removing INDEXED_EXTRACTIONS = json from the props.conf on the UF fixed the duplicates, but it introduced another issue: sometimes a few JSON event lines go missing.

KV_MODE = none
NO_BINARY_CHECK = true
TIMESTAMP_FIELDS = requests.Time
category = Structured
disabled = false
pulldown_type = true

Any idea how to fix this issue?

jperry_intact
New Member

I cannot get this to work for the life of me. The JSON events index only once if I upload the file and select the sourcetype. If I set up a monitor input for the same sourcetype and the same files, I get duplicate events. Initially I was getting both duplicate events (the same event listed twice) and duplicate field extractions (one field, two identical values). Adding INDEXED_EXTRACTIONS = JSON seemed to fix the duplicate field extractions.

It's a single-server install on my local machine, and I have tried creating the props.conf entry below in both C:\Program Files\Splunk\etc\system\local and C:\Program Files\Splunk\etc\apps\INSERTAPPNAMEHERE\local, and no dice.

[FishNPickles]
INDEXED_EXTRACTIONS = JSON
TIMESTAMP_FIELDS = properties.LastUpdateTime
TZ = UTC
AUTO_KV_JSON = false
DATETIME_CONFIG =
KV_MODE = none
SHOULD_LINEMERGE = false
category = Custom
description = PicklesNFish
disabled = false
pulldown_type = true

Is there some secret sauce to this I'm missing? It just straight up ignores the KV_MODE settings and is still indexing my entities twice.

Any direction you could provide would be ultra awesome and greatly appreciated!

jperry_intact
New Member

I have apparently done something horrible to my local install. I brought up a new host and your solution works great.

Who knows...

pumphreyaw
Explorer

I think you nailed it. The props.conf file I'm modifying in this case belongs to a deployment app that's getting pushed to the UF, none of which is going to the Search Head. I see I need to split these props settings up accordingly. I'll give that a try. Thanks for the help and quick reply.

mattymo
Splunk Employee

Awesome, I have converted the comment to an answer. Let me know if it works!

- MattyMo

pumphreyaw
Explorer

Yep, that worked perfectly. Oversight on my part, just needed to put things in the right place.

Thanks mmodestino!
