Hello,
I am running into the "common" issue of duplicated JSON fields. I use Splunk Enterprise 9.2 with a Universal Forwarder, an indexer cluster, and a search head cluster.
My forwarder has the following configuration files:
/opt/splunkforward/etc/system/local/inputs.conf
# Forwarder, /opt/splunkforward/etc/system/local/inputs.conf
[batch:///opt/splunk_input/index_d/*]
move_policy = sinkhole
index= index_d
source = index_d
sourcetype = index_d
time_before_close = 0
crcSalt = <SOURCE>
blacklist = \.lock$
/opt/splunkforward/etc/system/local/props.conf
[index_d]
# Universal Forwarder, /opt/splunkforward/etc/system/local/props.conf
INDEXED_EXTRACTIONS = JSON
KV_MODE = none
AUTO_KV = false
AUTO_KV_JSON = false
On my search head, I directly edited the system file /opt/splunk/etc/system/local/props.conf with the following:
[index_d]
# Search Head, /opt/splunk/etc/system/local/props.conf
INDEXED_EXTRACTION = JSON
KV_MODE = none
AUTO_KV = false
AUTO_KV_JSON = false
FIELD_DISCOVERY = false
With this configuration, I get duplicated values for all extracted fields.
I checked on my search head that those settings are correctly applied, using:
splunk btool props list index_d
This correctly lists the values from the props.conf file, so I assume the settings are set up correctly. I edited the local file in the system folder directly to avoid the permission issues described here: https://splunk.my.site.com/customer/s/article/Field-Value-Type-Discrepancies-in-KV-MODE
I also tried the following on the search head:
[index_d]
# Commenting out the INDEXED_EXTRACTION field on the Search head
# INDEXED_EXTRACTION = JSON
KV_MODE = none
AUTO_KV = false
AUTO_KV_JSON = false
FIELD_DISCOVERY = false
But no luck.
I spent some time reading similar questions on this topic, and sadly none of the solutions I have tried so far helped.
I welcome any suggestions. Thank you.
First things first - don't use indexed extractions unless there is absolutely no other way.
BTW, crcSalt=<SOURCE> is also very rarely the way to go. Usually it's better to make the checksum block longer if the files have a common header. It also shouldn't be needed with a batch input.
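As a rough sketch of what that could look like (the value 1024 is only an illustrative assumption; pick a length that gets past whatever header your files share), the batch stanza without crcSalt and with a longer initial checksum block might be:

```ini
# Forwarder inputs.conf, sketch only
[batch:///opt/splunk_input/index_d/*]
move_policy = sinkhole
index = index_d
source = index_d
sourcetype = index_d
time_before_close = 0
blacklist = \.lock$
# Hypothetical value: number of bytes hashed for the initial CRC (default is 256)
initCrcLength = 1024
```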
And avoid touching etc/system/local. Whenever possible, deploy your settings in an app.
OK, with that out of the way, I'd check whether there are other effective settings (host- and source-based settings take precedence over general sourcetype-defined ones).
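To illustrate the precedence point, assuming a hypothetical source-based stanza exists somewhere in the effective configuration, something like this would override the sourcetype stanza for matching events:

```ini
# props.conf, hypothetical example of a conflicting stanza
[index_d]
KV_MODE = none

# For events whose source is index_d, this stanza wins over [index_d] above
[source::index_d]
KV_MODE = json
```

Running btool with the --debug flag (splunk btool props list --debug) shows which file each effective setting comes from, which can help spot stanzas like this.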
Thank you @PickleRick for the answer. I am doing my tests in a test environment so far, so I have no issue editing system local files and the like. I plan to have an app, with Global permissions, once I manage to get the correct settings.
> First things first - don't use indexed extractions unless there is absolutely no other way.
So simply removing INDEXED_EXTRACTIONS=JSON would do the trick here? I tried that, and:
- the fields are not duplicated (yay!)
- files whose content is a JSON array are not supported anymore, for example [{'key1': 'val1'}, {'key2':'val2'}], while they were supported before
- the web search displays some extracted key values, but there is a field named "punct" with just a list of commas / quotes / brackets, which led me to believe that the data isn't read / parsed fully
> I'd check if there aren't other effective settings (host and source-based settings have precedence over general sourcetype-defined ones).
From a quick check, I did not find anything. Is there a good way to check everything?
Note that, as a test only, I added the following stanza to my search head's /opt/splunk/etc/system/local/props.conf:
[default]
KV_MODE = none
AUTO_KV = false
AUTO_KV_JSON = false
FIELD_DISCOVERY = false
to try to take precedence over any other props.conf settings, but this did not change anything.
I have some speculations.
> the web search displays some extracted key values, but there is a field named "punct" with just a list of commas / quotes / brackets, which led me to believe that the data isn't read / parsed fully
The addition of punct is consistent with the event being too big for search-time extraction.
Since you have all search-time extractions disabled, Splunk is not... doing extractions. That's why the fields are not parsed out. Since you disabled index-time parsing, you should enable KV_MODE=json.
It is puzzling, though, why you had duplicate fields with just indexed extractions. Normally this is a sign of both extractions taking place: index-time via indexed extractions and search-time via KV_MODE (either explicitly set or automatic).
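A minimal sketch of the search-time-only setup described above, assuming INDEXED_EXTRACTIONS has been removed from the forwarder's props.conf and that the events are small enough for search-time extraction:

```ini
# Search head props.conf, ideally deployed in an app rather than etc/system/local
[index_d]
# Let the search head extract JSON fields at search time
KV_MODE = json
```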