Splunk Search

Why are my json fields extracted twice?

Splunk Employee

I have JSON events like: { A:"1", B:"2", C:"3" }
with a sourcetype named json_app.

When I search, each extracted field shows two values, like a multivalue field, but source, sourcetype, index, and _time are not duplicated:
A = "1, 1"

This started in Splunk 6.1.

1 Solution

Splunk Employee

Found the explanation: this is the new feature INDEXED_EXTRACTIONS.
http://docs.splunk.com/Documentation/Splunk/6.1.4/Data/Extractfieldsfromfileheadersatindextime

It extracts the fields at index time, during parsing. This means that the forwarders parse the events when the setting is specified on them.
Remark: it also means that for these events the timestamp is now extracted by the forwarder, and the filtering is done by the forwarder as well.

Example props.conf on the forwarder:

[json_app]
INDEXED_EXTRACTIONS=json   

My problem was that I also had a search-time automatic extraction of the fields for JSON data:

[json_app]
KV_MODE=json

So in the end I got the same fields twice.

To fix the issue, I simply disabled KV_MODE on the search head and reloaded the configuration with the search command | extract reload=true

Here is my final props.conf for my sourcetype:

[json_app]
INDEXED_EXTRACTIONS=json
KV_MODE=none
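
Because the two settings act at different stages, one way to lay this out in a distributed deployment is to keep each setting on the tier that uses it. The following is only a sketch under that assumption, not a layout confirmed in this post; the comments naming each tier are illustrative, and json_app is the sourcetype from the question. The two stanzas belong in two separate props.conf files.

```ini
# props.conf on the forwarder that reads the JSON data
# (index-time structured parsing happens here)
[json_app]
INDEXED_EXTRACTIONS = json

# props.conf on the search head (a separate file from the one above)
# (turns off search-time JSON extraction so the fields are not extracted a second time)
[json_app]
KV_MODE = none
```

On a single-instance install, both settings can sit in the same stanza, as in the final props.conf above.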

Engager

Thanks a lot!
I was going crazy trying to understand why that was happening.

Explorer

I have the same problem: I am getting duplicated field values from my JSON logs. I have a universal forwarder that sends data to a heavy forwarder, which then sends that data to the indexers. The props.conf at each layer is shown below: INDEXED_EXTRACTIONS is set to json on the universal forwarder and to none at every other layer (heavy forwarder, indexer, and search head). I don't understand why the fields are still being extracted twice:

Universal forwarder props.conf:
INDEXED_EXTRACTIONS = json
KV_MODE = none
CHARSET = UTF-8
SHOULD_LINEMERGE = true
NO_BINARY_CHECK = true
TRUNCATE = 500000
pulldown_type = true
category = Structured
description = CAP - Ramp Document Monitoring
AUTO_KV_JSON = false

Heavy forwarder props.conf:
INDEXED_EXTRACTIONS = none
KV_MODE = none
CHARSET = UTF-8
SHOULD_LINEMERGE = true
NO_BINARY_CHECK = true
TRUNCATE = 500000
pulldown_type = true
category = Structured
description = CAP - Ramp Document Monitoring
AUTO_KV_JSON = false

Indexer props.conf:
INDEXED_EXTRACTIONS = none
KV_MODE = none
CHARSET = UTF-8
SHOULD_LINEMERGE = true
NO_BINARY_CHECK = true
TRUNCATE = 500000
pulldown_type = true
category = Structured
description = CAP - Ramp Document Monitoring
AUTO_KV_JSON = false

Search head props.conf:
INDEXED_EXTRACTIONS = none
KV_MODE = none
CHARSET = UTF-8
SHOULD_LINEMERGE = true
NO_BINARY_CHECK = true
TRUNCATE = 500000
pulldown_type = true
category = Structured
description = CAP - Ramp Document Monitoring
AUTO_KV_JSON = false

Moderator

Hi Vinit

It would be better to post this as a new question, since this thread is 3 years old and your question might not get much visibility here.

Thanks

Communicator

"Remark : It also means that for those special events, the timestamp is now extracted by the forwarder, and that the filtering is done by the forwarders"

This sounds confusing: do you mean light/heavy forwarders or universal forwarders? The wiki (http://wiki.splunk.com/Where_do_I_configure_my_Splunk_settings%3F) mentions the same thing, that INDEXED_EXTRACTIONS is done in the input stage. This seems like an unfortunate side effect. Have you seen any increased workload on the UF?
