JSON format - Duplicate value in field

olivier_ma
Explorer

Hello,

I'm currently working on a TA that browses an Exchange mailbox and indexes some data extracted from the emails.
I built it with the Add-on Builder, using a Python script as the input method.

I have an issue with the indexed data: every value of every field is duplicated.

I printed the JSON before writing the event into Splunk, and it shows only one value per field:

{
    ...
    "content_type": "multipart/alternative;         boundary=\"b1_0b28091de0af32b14ad60d31c616a518\"",
    ...
    "date": "Tue, 7 Nov 2017 10:03:20 +0000", 
    ...
}
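For anyone who wants to rule out the script side programmatically rather than by eye, here is a minimal sketch that checks whether a serialized event contains any key more than once. The function name `find_duplicate_keys` and the shortened sample payload are made up for illustration; they are not part of the TA or of the Add-on Builder helper.

```python
import json
from collections import Counter


def find_duplicate_keys(json_text):
    """Return the sorted keys that occur more than once inside any single JSON object.

    Hypothetical helper for debugging; not part of the Add-on Builder API.
    """
    duplicates = set()

    def record_pairs(pairs):
        # json.loads hands us every object's (key, value) pairs before building a dict,
        # so repeated keys are still visible here.
        counts = Counter(key for key, _ in pairs)
        duplicates.update(key for key, n in counts.items() if n > 1)
        return dict(pairs)

    json.loads(json_text, object_pairs_hook=record_pairs)
    return sorted(duplicates)


# Shortened, illustrative payload modeled on the event above.
sample = '{"date": "Tue, 7 Nov 2017 10:03:20 +0000", "content_type": "multipart/alternative"}'
clean_result = find_duplicate_keys(sample)
dup_result = find_duplicate_keys('{"date": "a", "date": "b"}')
print(clean_result, dup_result)
```

If this returns an empty list for the string passed to `new_event`, the duplication is most likely introduced by field extraction in Splunk rather than by the script.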

Here is a short excerpt of the input script:

for item in reporter_mailbox.inbox.all().iterator():
    # Analyze the item and get the JSON format
    json_data = self.__analyze_item(item)
    if json_data is None:
        self.helper.log_debug("No data written")
        continue
    self.helper.log_debug("JSON Data:\r\n %s" % json_data)
    # Create the Splunk event object related to this item
    event = self.helper.new_event(source=self.helper.get_input_type(), index=self.helper.get_output_index(),
                                  sourcetype=self.helper.get_sourcetype(), data=json_data)
    # Write the event into Splunk
    splunk_writer.write_event(event)

And here is the props.conf:

[my_ta]
#INDEXED_EXTRACTIONS = JSON
#KV_MODE = json
TIMESTAMP_FIELDS = date
TRUNCATE = 0
category = Splunk App Add-on Builder
pulldown_type = 1

# Fields Aliases
...

Does anyone have an idea where those duplicate values come from?

Thanks for your help

mayurr98
Super Champion

Hey,

Add this to props.conf:

KV_MODE = none
AUTO_KV_JSON = false
INDEXED_EXTRACTIONS = JSON

Let me know if this helps!

olivier_ma
Explorer

Exactly what I'm looking for. Thanks 🙂

asieira
Path Finder

Take a look at https://answers.splunk.com/answers/223095/why-is-my-sourcetype-configuration-for-json-events.html; hopefully the solution there also applies here.

Also, I see you have a commented-out INDEXED_EXTRACTIONS setting. If it was enabled when the events were indexed, it would have extracted one value for each field and saved it to disk persistently at index time. If you later disabled INDEXED_EXTRACTIONS and relied on automatic search-time JSON extraction (the default when KV_MODE is not set to none and AUTO_KV_JSON is left enabled), the index-time and search-time values would be combined for those existing events. That could be the root cause of the duplication on existing events, no?
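To make that mechanism concrete, here is a toy model in Python. It is not Splunk code and does not reproduce Splunk's internals; the function names `index_time_fields` and `search_time_fields` and the `auto_kv_json` flag are invented purely to illustrate how one extraction at index time plus a second one at search time would leave each field with two identical values.

```python
import json


def index_time_fields(raw_event):
    # Stand-in for index-time extraction: one stored value per top-level JSON key.
    return {key: [value] for key, value in json.loads(raw_event).items()}


def search_time_fields(raw_event, indexed_fields, auto_kv_json=True):
    # Start from the values that were persisted at index time.
    fields = {key: list(values) for key, values in indexed_fields.items()}
    if auto_kv_json:
        # Stand-in for automatic search-time JSON extraction: the same raw event
        # is parsed again and its values are appended to the existing ones.
        for key, value in json.loads(raw_event).items():
            fields.setdefault(key, []).append(value)
    return fields


raw = '{"date": "Tue, 7 Nov 2017 10:03:20 +0000"}'
stored = index_time_fields(raw)

both_enabled = search_time_fields(raw, stored, auto_kv_json=True)
search_disabled = search_time_fields(raw, stored, auto_kv_json=False)

print(both_enabled["date"])     # two identical values in this toy model
print(search_disabled["date"])  # a single value
```

In this model, turning off the search-time pass is what brings each field back to a single value, which is the same idea as keeping the index-time extraction and disabling the search-time one in props.conf.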

olivier_ma
Explorer

Thanks, that was the right solution.
