JSON source type produces duplicated data
Hello,
I'm sending JSON data to the HTTP Event Collector. When I execute searches, all the non-metadata fields have duplicated values:
This causes tons of issues when doing sum, count...
On my Splunk Cloud instance, I set up my source type this way, playing with the KV_MODE, INDEXED_EXTRACTIONS and AUTO_KV_JSON settings, but with no success...
Any idea what could be wrong?
Thanks for your help.
You'd need to use btool to check at the OS level for any configs for that source and sourcetype, e.g.,
splunk btool props list RanorexJSon
splunk btool props list source::ElectraExtendedUI
(Make sure to get the sourcetype and source names exactly right.) You're looking for parameters related to indexed extractions. Since a props stanza can apply to a sourcetype or to a source (or to a host, but that's less likely), check both.
This problem indicates you have indexed field extraction enabled on your JSON events and are at the same time doing search time extraction of the JSON.
I typically recommend disabling indexed field extraction and not relying on the built-in _json sourcetype. Instead, use a more descriptive sourcetype that identifies the expected fields of the JSON, e.g. "myapp:json", which lets you select it more readily for targeted additional processing or extraction.
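As a rough sketch of that recommendation, a search-time-only setup could look something like the stanza below. The sourcetype name "myapp:json" is just the example name from above, and this assumes the settings end up in the props stanza for that sourcetype (on Splunk Cloud they would normally be entered through the source type's advanced settings rather than by editing a file). Treat it as a starting point, not a verified fix for this particular instance.

```ini
# props.conf (sketch; "myapp:json" is an example sourcetype name)
[myapp:json]
# INDEXED_EXTRACTIONS is deliberately left unset, so no JSON fields
# are written to the index at ingest time.
# Extract the JSON fields at search time instead:
KV_MODE = json
AUTO_KV_JSON = true
```

With only one extraction path active, each field should carry a single value per event, so sum and count behave as expected.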
Hello and thanks @mmccul_slac. I tried your option but didn't succeed... and creating a descriptive source type is kind of a hassle, especially for well-formatted JSON and when my other sources are working properly 😉
I tried a few more things to see why this JSON was behaving differently from my other source types, but no luck. I ended up scrapping my "faulty" source type and, out of ideas, linked another JSON source type to my HTTP Event Collector. It worked, no duplicated values!!!
I then cloned this working source type, renamed it and set the clone as the source type in my event collector:
-> I'm back with my duplicated values?!?
The only differences at this point are the name of the source type and when it was created...
Even though I'm not blocked anymore, I would like to be able to have a dedicated source type, and I need a proper explanation of what is happening and a solution... At this point I would really like this to be a bug, so that at least it explains the inconsistent behavior.
Thanks

As @mmccul_slac says, having indexed extractions enabled (INDEXED_EXTRACTIONS) is what causes this behaviour. When it is enabled, Splunk parses the incoming JSON and indexes its fields at ingest time; when you search, Splunk also parses the JSON and creates the same fields at search time, hence the duplicates.
See this
and it may depend on how the data reaches HEC and whether it passes through an intermediate Splunk Universal Forwarder
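If you would rather keep the indexed extractions, the other way to avoid the double parsing is to switch off the search-time JSON extraction for that sourcetype. A minimal sketch, assuming a dedicated sourcetype (the name "myapp:json" is only an example) and assuming the search-time settings are applied where searches run:

```ini
# props.conf (sketch; "myapp:json" is an example sourcetype name)
[myapp:json]
# Index-time: parse the JSON and index its fields at ingest.
INDEXED_EXTRACTIONS = json
# Search-time: do not extract the same fields a second time.
KV_MODE = none
AUTO_KV_JSON = false
```

INDEXED_EXTRACTIONS takes effect when the data is ingested, while KV_MODE and AUTO_KV_JSON are search-time settings, so the two halves may live in different places depending on the deployment.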
