How do I parse JSON data from a Splunk modular input?

New Member

Hi Splunk community,

I am having an issue with a Splunk modular input.

The modular input is built around events and event writers that construct an XML event. But when I search the events in Splunk, I would like to see only the included JSON event; see the attached event.
[image: attached event]

Below is a snippet of the code that generates the event.
[image: code snippet]

For reference, the Splunk modular input library:
https://github.com/splunk/splunk-sdk-python/tree/master/splunklib

Does anybody know how this is supposed to work?

0 Karma

Splunk Employee

You can try passing all the details when initializing the Event object.
Ex. event = Event(data="{}", stanza="", time="%.3f" % time.time())

Then use ew.write_event(event).

Note: You'll need to import the time module (import time).
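
As a minimal sketch of the suggestion above, the field values can be prepared with the standard library before they are handed to Event. The helper name build_event_fields and the sample payload are hypothetical. json.dumps is used on the assumption that the data should be a valid JSON string, and the timestamp is formatted as epoch seconds as in the example above.

```python
import json
import time


def build_event_fields(payload, stanza):
    """Return keyword arguments for Event(...) (hypothetical helper)."""
    return {
        # json.dumps produces valid JSON (double quotes), unlike str(dict).
        "data": json.dumps(payload),
        "stanza": stanza,
        # Epoch seconds with millisecond precision, as in the reply above.
        "time": "%.3f" % time.time(),
    }


fields = build_event_fields({"test": "Issue"}, "myModularInput://firsttest")
print(fields["data"])
# In the modular input itself: ew.write_event(Event(**fields))
```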

0 Karma

Path Finder

It looks like you're using simple mode for the modular input rather than XML mode:

http://docs.splunk.com/Documentation/Splunk/7.2.0/AdvancedDev/ModInputsStream

This is set up in your introspection scheme, in the <streaming_mode> tag:

http://docs.splunk.com/Documentation/Splunk/7.2.0/AdvancedDev/ModInputsScripts
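
To illustrate the difference between the two modes, here is a rough standard-library sketch of what a script writes to stdout in each case. The element names (stream, event, time, data) follow the XML output posted later in this thread. The function names and sample payload are made up for illustration, and this is not the SDK's own implementation.

```python
import json
import time
import xml.etree.ElementTree as ET


def xml_mode_output(payload, stanza):
    """Wrap one JSON payload in a <stream><event> envelope."""
    stream = ET.Element("stream")
    event = ET.SubElement(stream, "event", stanza=stanza)
    ET.SubElement(event, "time").text = "%.3f" % time.time()
    # The JSON text is the only part that should end up as the raw event.
    ET.SubElement(event, "data").text = json.dumps(payload)
    return ET.tostring(stream, encoding="unicode")


def simple_mode_output(payload):
    """In simple mode the raw event text is written as-is, with no envelope."""
    return json.dumps(payload)


sample = {"test": "Issue"}
print(xml_mode_output(sample, "myModularInput://firsttest"))
print(simple_mode_output(sample))
```

If the whole envelope shows up in search results, that would suggest the output is being read as plain text rather than as XML.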

0 Karma

New Member

Hi, and thanks for your reply. This was one of the parameters I have been testing, but with no luck.
I tried to simplify my test case.

import sys
import os
import six

import json
import time
import datetime

#import splunklib.client as client
# import splunklib
from splunklib.modularinput import Script, Scheme, Argument, EventWriter, Event
FUNCTION_NAME = "testModularInput"


class testModularInput(Script):
    """ Class for importing jira data
    """
    def get_scheme(self):
        scheme = Scheme(self.__class__.__name__)
        scheme.description = self.__class__.__name__
        scheme.use_external_validation = False
        scheme.use_single_instance = True
        scheme.streaming_mode = Scheme.streaming_mode_xml
        username_arg = Argument(
            name="username",
            title="User name",
            data_type=Argument.data_type_string,
            required_on_create=True,
            required_on_edit=True
        )
        scheme.add_argument(username_arg)
        return scheme

    def validate_input(self, validation_definition):
        return None

    def stream_events(self, inputs, ew):
        for input_name, input_item in inputs.inputs.items():
            event = Event()
            event.stanza = input_name
            event.time = datetime.datetime.now().isoformat()+"+0200"
            event.data = "test input of date"
            event.index = "main"
            ew.write_event(event)
        return None

if __name__ == "__main__":
    # Script.run() handles --scheme and --validate-arguments itself
    # and returns an exit code, so no manual argument dispatch is needed.
    sys.exit(testModularInput().run(sys.argv))

Result in Splunk (I have added dashes to ensure that the XML is not hidden):

<stream>
  <event stanza="testModularInput://testDebug" unbroken="1">
    <time>2018-10-23T08:30:03.869000+0200</time>
    <index>main</index>
    <data>test input of date</data>
    <done />
  </event>
</stream>
0 Karma

New Member

Just for info, I never got this to work. I found a workaround: using print(event.data) instead.
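
For anyone trying the same workaround, a minimal sketch of writing one JSON object per line to stdout is below. It assumes the input is effectively treated as simple (non-XML) streaming. The function name and payload are hypothetical.

```python
import json
import sys


def emit_json_line(payload, stream=sys.stdout):
    """Write one JSON object per line, flush it, and return the line."""
    line = json.dumps(payload)
    stream.write(line + "\n")
    # Flush so the event is not held back in an output buffer.
    stream.flush()
    return line


emit_json_line({"time": "2018-10-22T14:39:24.915000+0200", "test": "Issue"})
```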

0 Karma

New Member

Hi,
Some more details regarding the conf files:
inputs.conf
[myModularInput://firsttest]
sourcetype = myModularInput:json

props.conf
[myModularInput:json]
TRUNCATE = 0
INDEXED_EXTRACTIONS = json
KV_MODE=none

No transforms.

This current setup fails because the data cannot be parsed as JSON (it is an XML document).

0 Karma

New Member

To clarify my question: how do I get rid of the XML part of the event? It is fine that the values in the XML event are used to set the time, index, sourcetype, source, and so on, but when I search in Splunk I want the data available as JSON.

script output:

2018-10-22T14:39:24.915000+0200
{'time': '2018-10-22T14:39:24.915000+0200', 'test': 'Issue'}

indexed event:

2018-10-22T14:39:24.915000+0200
{'time': '2018-10-22T14:39:24.915000+0200', 'test': 'Issue'}

expected event:
{'time': '2018-10-22T14:39:24.915000+0200', 'test': 'Issue'}
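
One thing worth checking in the output above: the single quotes suggest the payload was written with str() on a Python dict, which is not valid JSON, so JSON extraction would likely fail even without the XML envelope. This is an assumption based only on the output shown. A small sketch of the difference:

```python
import json
from collections import OrderedDict

payload = OrderedDict([("time", "2018-10-22T14:39:24.915000+0200"),
                       ("test", "Issue")])

as_repr = str(dict(payload))   # Python repr, uses single quotes
as_json = json.dumps(payload)  # valid JSON, uses double quotes


def is_valid_json(text):
    """Return True if text parses as JSON."""
    try:
        json.loads(text)
        return True
    except ValueError:
        return False


print(as_repr, is_valid_json(as_repr))
print(as_json, is_valid_json(as_json))
```

Setting event.data = json.dumps(payload) rather than the dict itself would be the corresponding change in the modular input.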

0 Karma

New Member

Hi, it looks like the text viewer is interpreting the XML data.

To clarify my question: how do I get rid of the XML part of the event? It is fine that the values in the XML event are used to set the time, index, sourcetype, source, and so on, but when I search in Splunk I want the data available as JSON.

script output:
stream
event stanza="myModularInput://issuedebug" unbroken="1"
time>2018-10-22T14:55:12.821000+0200OrderedDict([('time', '2018-10-22T14:55:12.821000+0200'), ('test', 'Issue')])2018-10-22T14:55:12.821000+0200OrderedDict([('time', '2018-10-22T14:55:12.821000+0200'), ('test', 'Issue')])

0 Karma