Why does my json event data show duplicate fields?

bnorthway
Path Finder

In my screenshot, you can see that my events have duplicate fields, and I am trying to figure out why. The source data is JSON in a plain text file. The source JSON does not have duplicate fields, and the events themselves are not duplicated, only the fields. Do you have any ideas why this could be occurring?

Here's my props.conf

[my_json]
INDEXED_EXTRACTIONS = json
KV_MODE = json
NO_BINARY_CHECK = true
TIMESTAMP_FIELDS = timestamp
category = Structured
description = JSON. Timestamp is in "timestamp" field
disabled = false
pulldown_type = true
1 Solution

ppablo
Retired

Hi @bnorthway

Here's a previous Answers post that might give you some clues on how to edit your configuration. The users in the comment thread directly under the question and the answer itself explain how having both index and search-time settings enabled caused duplicate field extractions in their cases. I hope it helps.
http://answers.splunk.com/answers/223095/why-is-my-sourcetype-configuration-for-json-events.html
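
As a rough sketch of what that thread points to (an assumption about this setup, not a tested configuration): the stanza in the question enables JSON extraction twice, once at index time through INDEXED_EXTRACTIONS and once at search time through KV_MODE, so one of the two would be turned off. Keeping the index-time extraction and disabling the search-time one could look like this:

```
# props.conf (sketch; stanza name taken from the question)
[my_json]
INDEXED_EXTRACTIONS = json
# Turn off search-time JSON extraction so each field is extracted only once.
KV_MODE = none
AUTO_KV_JSON = false
NO_BINARY_CHECK = true
TIMESTAMP_FIELDS = timestamp
```

KV_MODE and AUTO_KV_JSON are search-time settings, so they need to be present wherever searches run. The alternative is to drop INDEXED_EXTRACTIONS and keep KV_MODE = json; in that case TIMESTAMP_FIELDS, which belongs to the structured-data extraction settings, would likely stop applying and the timestamp would have to be configured another way.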

bnorthway
Path Finder

Thanks. That's what I was figuring out also. What is the difference between removing INDEXED_EXTRACTIONS = JSON and adding AUTO_KV_JSON = false? It looks like end-user performance would be better with the latter, since the fields wouldn't have to be extracted at every search?

ppablo
Retired

I'm not entirely sure, but generally search-time field extractions are supposed to be better for overall performance. I looked through documentation explaining this and found it on this page:
http://docs.splunk.com/Documentation/Splunk/6.2.3/Indexer/Indextimeversussearchtime

"As a general rule, it is better to perform most knowledge-building activities, such as field extraction, at search time. Additional, custom field extraction, performed at index time, can degrade performance at both index time and search time. When you add to the number of fields extracted during indexing, the indexing process slows. Later, searches on the index are also slower, because the index has been enlarged by the additional fields, and a search on a larger index takes longer. You can avoid such performance issues by instead relying on search-time field extraction."

The same message is communicated in the props.conf documentation.
http://docs.splunk.com/Documentation/Splunk/6.2.3/admin/Propsconf

Hopefully someone who is well versed in this will come along and go into more detail for you.
