
Why are my JSON fields extracted twice?

yannK
Splunk Employee

I have JSON events like: {"A":"1","B":"2","C":"3"}
with a sourcetype named json_app.

When I search, each extracted field comes back with two values, as if it were a multivalue field, e.g. A = "1, 1". The default fields (source/sourcetype/index/_time) are not affected.

This started in Splunk 6.1.

1 Solution

yannK
Splunk Employee

Found the explanation: this is the new "INDEXED_EXTRACTIONS" feature.
http://docs.splunk.com/Documentation/Splunk/6.1.4/Data/Extractfieldsfromfileheadersatindextime

It extracts the fields at index time, during parsing, which means the forwarders parse the events when this setting is specified on them.
Remark: for these structured events, the timestamp is now also extracted by the forwarder, and event filtering is done by the forwarder as well.

Example props.conf on the forwarder:

[json_app]
INDEXED_EXTRACTIONS=json   
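
As a side note (my own sketch, not part of the original explanation): because the fields become indexed fields, they can be searched directly with the field::value syntax and used with tstats, for example:

sourcetype=json_app A::1

| tstats count where index=main sourcetype=json_app by A

(The index name main is a placeholder for your actual index.)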

My problem is that I also have a search-time automatic extraction of the fields for JSON data:

[json_app]
KV_MODE=json

So I ended up with the same fields extracted twice.

To fix the issue, I simply disabled KV_MODE on the search head and reloaded the extractions with the search command | extract reload=true

Here is the final props.conf for my sourcetype:

[json_app]
INDEXED_EXTRACTIONS=json
KV_MODE=none
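
To double-check that the duplication is gone after the reload, a quick sanity-check search (my own sketch, not part of the original fix) is to count the values per field:

sourcetype=json_app | eval a_values=mvcount(A) | stats count by a_values

Before the fix every event reports a_values=2; with KV_MODE=none in place it should drop back to 1.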

PatG_
Engager

Thanks a lot!
I was going crazy trying to understand why this was happening.


vinit_masaun
Explorer

I have the same problem: I am getting duplicated field values from my JSON logs. A universal forwarder sends data to a heavy forwarder, which then forwards it to the indexers. I have the following props.conf at each layer, with INDEXED_EXTRACTIONS set to json on the universal forwarder and to none everywhere else (heavy forwarder, indexer, and search head). I don't understand why the fields are still being extracted twice:

Universal forwarder props.conf:
INDEXED_EXTRACTIONS = json
KV_MODE = none
CHARSET = UTF-8
SHOULD_LINEMERGE = true
NO_BINARY_CHECK = true
TRUNCATE = 500000
pulldown_type = true
category = Structured
description = CAP - Ramp Document Monitoring
AUTO_KV_JSON = false

Heavy forwarder props.conf:
INDEXED_EXTRACTIONS = none
KV_MODE = none
CHARSET = UTF-8
SHOULD_LINEMERGE = true
NO_BINARY_CHECK = true
TRUNCATE = 500000
pulldown_type = true
category = Structured
description = CAP - Ramp Document Monitoring
AUTO_KV_JSON = false

Indexer props.conf:
INDEXED_EXTRACTIONS = none
KV_MODE = none
CHARSET = UTF-8
SHOULD_LINEMERGE = true
NO_BINARY_CHECK = true
TRUNCATE = 500000
pulldown_type = true
category = Structured
description = CAP - Ramp Document Monitoring
AUTO_KV_JSON = false

Search Head props.conf:
INDEXED_EXTRACTIONS = none
KV_MODE = none
CHARSET = UTF-8
SHOULD_LINEMERGE = true
NO_BINARY_CHECK = true
TRUNCATE = 500000
pulldown_type = true
category = Structured
description = CAP - Ramp Document Monitoring
AUTO_KV_JSON = false
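
For reference, the effective settings on each instance can be checked with btool (the sourcetype name below is a placeholder for the actual stanza name):

$SPLUNK_HOME/bin/splunk btool props list my_json_sourcetype --debug

The --debug flag shows which configuration file each setting comes from, which helps spot another app overriding KV_MODE or AUTO_KV_JSON.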


Anam
Community Manager

Hi Vinit,

It would be better to post this as a new question, since this post is 3 years old and you might not get as much visibility on it here.

Thanks


laserval
Communicator

"Remark: for these structured events, the timestamp is now also extracted by the forwarder, and event filtering is done by the forwarder as well."

This sounds confusing: do you mean light/heavy forwarders or universal forwarders? The wiki (http://wiki.splunk.com/Where_do_I_configure_my_Splunk_settings%3F) mentions the same thing, that INDEXED_EXTRACTIONS happens in the input stage. This seems like an unfortunate side effect; have you seen any increased workload on the UF?
