Getting Data In

How do I get JSON data from a Splunk modular input parsed as JSON?

AndersNierhoff
New Member

Hi Splunk community,

I am having an issue using a Splunk modular input.

The modular input is built around events and event writers that construct an XML event. However, when I search the events in Splunk, I would like to see only the included JSON event; see the attached event.

Below you see a code snippet of the code that generates the event.

For reference, the Splunk modular input library:
https://github.com/splunk/splunk-sdk-python/tree/master/splunklib

Does anybody know how this is supposed to work?


rshah_splunk
Splunk Employee

You can try passing all the details when initializing the Event object.
Example: event = Event(data="{}", stanza="", time="%.3f" % time.time())

Then call ew.write_event(event).

Note: you will need to import the time module (import time).
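
A minimal sketch of that suggestion, assuming the payload is a Python dict: serialize it with json.dumps so the event body is valid JSON, and format the timestamp as epoch seconds with three decimals. To keep it runnable without the SDK installed, build_event_xml below is a hypothetical helper that only approximates the event element an XML-mode modular input writes. With the SDK you would pass the same data and time strings to Event(...) and call ew.write_event(event).

```python
import json
import time
import xml.etree.ElementTree as ET


def build_event_xml(payload, stanza, index=None, now=None):
    """Hypothetical stand-in for Event + EventWriter: returns one <event> as text."""
    now = time.time() if now is None else now
    event = ET.Element("event", {"stanza": stanza, "unbroken": "1"})
    # Epoch seconds with millisecond precision, as in the answer above.
    ET.SubElement(event, "time").text = "%.3f" % now
    if index is not None:
        ET.SubElement(event, "index").text = index
    # json.dumps produces double-quoted, valid JSON; str(payload) would not.
    ET.SubElement(event, "data").text = json.dumps(payload)
    ET.SubElement(event, "done")
    return ET.tostring(event, encoding="unicode")


if __name__ == "__main__":
    print(build_event_xml({"test": "Issue"}, "testModularInput://testDebug", index="main"))
```

The point of the sketch is only that the data field should carry a JSON string; the element layout is an assumption modelled on the XML shown elsewhere in this thread.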


coccyx
Path Finder

It looks like you're using simple mode for the modular input rather than XML mode:

http://docs.splunk.com/Documentation/Splunk/7.2.0/AdvancedDev/ModInputsStream

This is set up in your introspection scheme, in the <streaming_mode> tag:

http://docs.splunk.com/Documentation/Splunk/7.2.0/AdvancedDev/ModInputsScripts
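
For illustration, here is a rough sketch of what an introspection scheme with the streaming mode set might look like. The helper build_scheme_xml is hypothetical, and the element names (title, streaming_mode, use_single_instance, endpoint/args/arg) follow my reading of the linked docs, so treat it as an approximation and compare it with what your script prints for --scheme.

```python
import xml.etree.ElementTree as ET


def build_scheme_xml(title, streaming_mode="xml"):
    """Approximate the scheme XML a modular input prints for --scheme."""
    if streaming_mode not in ("xml", "simple"):
        raise ValueError("streaming_mode must be 'xml' or 'simple'")
    scheme = ET.Element("scheme")
    ET.SubElement(scheme, "title").text = title
    # This is the setting that decides how Splunk reads the script's stdout.
    ET.SubElement(scheme, "streaming_mode").text = streaming_mode
    ET.SubElement(scheme, "use_single_instance").text = "true"
    args = ET.SubElement(ET.SubElement(scheme, "endpoint"), "args")
    arg = ET.SubElement(args, "arg", {"name": "username"})
    ET.SubElement(arg, "title").text = "User name"
    ET.SubElement(arg, "data_type").text = "string"
    return ET.tostring(scheme, encoding="unicode")


if __name__ == "__main__":
    print(build_scheme_xml("testModularInput", streaming_mode="xml"))
```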


AndersNierhoff
New Member

Hi, and thanks for your reply. This was one of the parameters I had been testing, but with no luck.
I tried to simplify my test case.

import sys
import os
import six

import json
import time
import datetime

#import splunklib.client as client
# import splunklib
from splunklib.modularinput import Script, Scheme, Argument, EventWriter, Event
FUNCTION_NAME = "testModularInput"


class testModularInput(Script):
    """ Class for importing jira data
    """
    def get_scheme(self):
        scheme = Scheme(self.__class__.__name__)
        scheme.description = self.__class__.__name__
        scheme.use_external_validation = False
        scheme.use_single_instance = True
        scheme.streaming_mode = Scheme.streaming_mode_xml
        username_arg = Argument(
            name="username",
            title="User name",
            data_type=Argument.data_type_string,
            required_on_create=True,
            required_on_edit=True
        )
        scheme.add_argument(username_arg)
        return scheme

    def validate_input(self, validation_definition):
        return None

    def stream_events(self, inputs, ew):
        for input_name, input_item in inputs.inputs.items():
            event = Event()
            event.stanza = input_name
            event.time = datetime.datetime.now().isoformat()+"+0200"
            event.data = "test input of date"
            event.index = "main"
            ew.write_event(event)
        return None

if __name__ == "__main__":
    # Script.run() handles --scheme, --validate-arguments and streaming itself
    # and returns an exit code, so no manual argument dispatch is needed.
    sys.exit(testModularInput().run(sys.argv))

Result in Splunk (I have added dashes to ensure that the XML is not hidden):

<stream>
  <event stanza="testModularInput://testDebug" unbroken="1">
    <time>2018-10-23T08:30:03.869000+0200</time>
    <index>main</index>
    <data>test input of date</data>
    <done />
  </event>
</stream>
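
To check locally what the script emits, the stream above can be parsed with the standard library. This is only a debugging sketch; extract_events is a hypothetical helper, not part of splunklib. As I understand the docs, in XML mode only the text inside the data element should end up as the event body, so if the whole document shows up in search, the wrapper was most likely indexed as raw text.

```python
import xml.etree.ElementTree as ET

# The stream exactly as shown in this post.
STREAM = """<stream>
  <event stanza="testModularInput://testDebug" unbroken="1">
    <time>2018-10-23T08:30:03.869000+0200</time>
    <index>main</index>
    <data>test input of date</data>
    <done />
  </event>
</stream>"""


def extract_events(stream_xml):
    """Return (stanza, time, index, data) for every <event> in the stream."""
    root = ET.fromstring(stream_xml)
    return [
        (e.get("stanza"), e.findtext("time"), e.findtext("index"), e.findtext("data"))
        for e in root.iter("event")
    ]


if __name__ == "__main__":
    for stanza, when, index, data in extract_events(STREAM):
        print(stanza, when, index, data)
```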

AndersNierhoff
New Member

Just for info, I never got this to work. I found a workaround: using print(event.data) instead.
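
A small sketch of that workaround, assuming the input runs in simple streaming mode so that whatever the script prints is taken as raw event data. emit_json_event is a hypothetical helper name; it writes one JSON object per line and flushes, so events are not held back in a buffer.

```python
import json
import sys


def emit_json_event(payload, out=sys.stdout):
    """Write the payload as a single line of JSON and flush immediately."""
    out.write(json.dumps(payload) + "\n")
    out.flush()


if __name__ == "__main__":
    emit_json_event({"time": "2018-10-22T14:39:24.915000+0200", "test": "Issue"})
```

With this approach the index, sourcetype and timestamp handling would have to come from inputs.conf and props.conf, since there is no XML wrapper to carry them.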


AndersNierhoff
New Member

Hi,
A few more details regarding the conf files:
inputs.conf
[myModularInput://firsttest]
sourcetype = myModularInput:json

props.conf
[myModularInput:json]
TRUNCATE = 0
INDEXED_EXTRACTIONS = json
KV_MODE=none

No transforms.

This setup fails because the data cannot be parsed as JSON (it is an XML document).


AndersNierhoff
New Member

To clarify my question: how do I get rid of the XML part of the event? It is fine that the values in the XML event are used to set the time, index, sourcetype, source and so on, but when I search in Splunk I want the data available as JSON.

script output:

2018-10-22T14:39:24.915000+0200
{'time': '2018-10-22T14:39:24.915000+0200', 'test': 'Issue'}

indexed event:

2018-10-22T14:39:24.915000+0200
{'time': '2018-10-22T14:39:24.915000+0200', 'test': 'Issue'}

expected event:
{'time': '2018-10-22T14:39:24.915000+0200', 'test': 'Issue'}
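
One detail worth checking in the output above: the single quotes are Python's dict representation, not JSON, so a JSON parser would reject that text even without the XML around it. A quick sketch of the difference, using the same values (is_valid_json is a hypothetical helper):

```python
import json
from collections import OrderedDict

payload = OrderedDict([("time", "2018-10-22T14:39:24.915000+0200"), ("test", "Issue")])


def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False


as_repr = str(dict(payload))   # single-quoted Python repr
as_json = json.dumps(payload)  # double-quoted JSON

if __name__ == "__main__":
    print(as_repr, is_valid_json(as_repr))
    print(as_json, is_valid_json(as_json))
```

So the event data would need to be set to json.dumps(...) of the dict rather than the dict (or OrderedDict) itself.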


AndersNierhoff
New Member

Hi, it looks like the text viewer is interpreting the XML data.

To clarify my question: how do I get rid of the XML part of the event? It is fine that the values in the XML event are used to set the time, index, sourcetype, source and so on, but when I search in Splunk I want the data available as JSON.

script output:
stream
event stanza="myModularInput://issuedebug" unbroken="1"
time>2018-10-22T14:55:12.821000+0200OrderedDict([('time', '2018-10-22T14:55:12.821000+0200'), ('test', 'Issue')])2018-10-22T14:55:12.821000+0200OrderedDict([('time', '2018-10-22T14:55:12.821000+0200'), ('test', 'Issue')])
