Ingesting JSON data via a python script, why are fields with numeric values indexed as multivalue fields with two identical values?

Path Finder

I have a JSON file with entries in the following form:
{ "ABC" : "XYZ" , "DEF" : 123 , "GHI" : "456" , ... }
Each JSON-formatted line defines about 15 fields, and a given output consists of multiple such lines.

Splunk picks up the output via a Python script, which essentially prints everything to stdout.
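For readers following along, here is a minimal sketch of what such a script might look like, assuming it simply writes one JSON object per line to stdout. The function name `emit_events` and the sample records are hypothetical and only mirror the shape shown above; the actual script is not included in this thread.

```python
import json
import sys


def emit_events(records, stream=sys.stdout):
    """Write each record as a single-line JSON object, one event per line."""
    for record in records:
        stream.write(json.dumps(record) + "\n")
    stream.flush()


if __name__ == "__main__":
    # Hypothetical sample data (assumption): mixes an unquoted number ("DEF")
    # with a quoted numeric string ("GHI"), as in the example entry.
    sample = [
        {"ABC": "XYZ", "DEF": 123, "GHI": "456"},
        {"ABC": "UVW", "DEF": 789, "GHI": "012"},
    ]
    emit_events(sample)
```

A script like this emits each key exactly once per line, so any duplication seen in search results would have to come from how the events are extracted rather than from the output itself.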

The issue is that when Splunk ingests the data, some fields end up multivalued, each holding two identical values. I can see this when I click "show as raw text" in the Splunk search results.

Interestingly, the affected fields are all ones with numeric values, and it happens whether the value is unquoted ("DEF" : 123) or quoted ("GHI" : "456").

Any ideas as to what could be causing this issue?

Re: Ingesting JSON data via a python script, why are fields with numeric values indexed as multivalue fields with two identical values?

Motivator

Are there multiple entries for the same key in the data, for example "ABC" : "123" appearing more than once? If so, that would explain it.

Re: Ingesting JSON data via a python script, why are fields with numeric values indexed as multivalue fields with two identical values?

Esteemed Legend

Your problem is probably the same as this:

http://answers.splunk.com/answers/301165/splunk-app-for-aws-billing-why-is-a-single-entry-o.html#ans...

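You are probably telling Splunk to extract the JSON fields twice: once at index time (INDEXED_EXTRACTIONS=json) and once at search time (KV_MODE=json). Each field then gets the same value from both extractions, which shows up as a multivalue field with two identical values. Get rid of the KV_MODE setting.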
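As a rough way to check for this, the sketch below scans the text of a props.conf file for stanzas that set both INDEXED_EXTRACTIONS=json and KV_MODE=json. It assumes a simple file where every setting sits under a stanza header; the sourcetype name `my_json_script` and the helper `find_double_json_extraction` are hypothetical, and Python's `configparser` is only an approximation of how Splunk reads .conf files.

```python
import configparser

# Hypothetical props.conf content (assumption): one sourcetype stanza that
# asks for JSON extraction at both index time and search time.
PROPS_CONF = """
[my_json_script]
INDEXED_EXTRACTIONS = json
KV_MODE = json
"""


def find_double_json_extraction(conf_text):
    """Return stanza names that enable JSON extraction at index and search time."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep setting names exactly as written
    parser.read_string(conf_text)
    flagged = []
    for stanza in parser.sections():
        indexed = parser.get(stanza, "INDEXED_EXTRACTIONS", fallback="").strip().lower()
        kv_mode = parser.get(stanza, "KV_MODE", fallback="").strip().lower()
        if indexed == "json" and kv_mode == "json":
            flagged.append(stanza)
    return flagged


if __name__ == "__main__":
    print(find_double_json_extraction(PROPS_CONF))
```

Keep in mind that the two settings can live in different files, apps, or even on different hosts (forwarder versus search head), so a single-file check like this can miss the conflict.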

See this Q&A for a more complete discussion:

http://answers.splunk.com/answers/174939/why-are-my-json-fields-extracted-twice.html

Re: Ingesting JSON data via a python script, why are fields with numeric values indexed as multivalue fields with two identical values?

Motivator

That is a good possibility. Would we see similar behavior with sourcetype=json (auto-sourcetyping), or with a transforms call from props on an indexer? What are your thoughts on index-time versus search-time extractions?

Re: Ingesting JSON data via a python script, why are fields with numeric values indexed as multivalue fields with two identical values?

Esteemed Legend

Yes. JSON events are fairly useless without field extraction, so you are far better off doing it once for everybody at index time rather than on every search (unless you have HUGE numbers of events that are rarely searched).
