How do I parse JSON data from a Splunk modular input?

AndersNierhoff
New Member

Hi Splunk community,

I am having an issue with a Splunk modular input.

The modular input is built around events and event writers that construct an XML event. However, when I search the events in Splunk, I would like to see only the included JSON event; see the attached event.
[Image: attached event]

Below is a snippet of the code that generates the event.
[Image: code snippet]

For reference, the Splunk modular input library:
https://github.com/splunk/splunk-sdk-python/tree/master/splunklib

Does anybody know how this is supposed to work?


rshah_splunk
Splunk Employee

You can try passing all the details when initializing the Event object.
Ex. event = Event(data="{}", stanza="", time="%.3f" % time.time())

And then use ew.write_event(event).

Note: You'll need to import the time module (import time).
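
As a rough illustration of the time argument above: the "%.3f" % time.time() expression yields epoch seconds with millisecond precision, as a string. The sketch below uses only the standard library and shows how a timezone-aware datetime could be converted to the same format. The helper name epoch_time_string is made up for this example and is not part of splunklib.

```python
import datetime
import time


def epoch_time_string(moment=None):
    """Return epoch seconds with three decimals, e.g. for Event(time=...).

    `moment` is an optional timezone-aware datetime; when omitted, the
    current time is used. (Hypothetical helper, not part of splunklib.)
    """
    seconds = time.time() if moment is None else moment.timestamp()
    return "%.3f" % seconds


# A fixed timestamp with a +02:00 offset, similar to the ones in this thread.
cest = datetime.timezone(datetime.timedelta(hours=2))
stamp = datetime.datetime(2018, 10, 23, 8, 30, 3, 869000, tzinfo=cest)
print(epoch_time_string(stamp))
```

Using an aware datetime avoids appending a hard-coded offset such as "+0200" to a naive timestamp.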


coccyx
Path Finder

It looks like you're using simple mode for the modular input rather than XML mode:

http://docs.splunk.com/Documentation/Splunk/7.2.0/AdvancedDev/ModInputsStream

This is set up in your introspection scheme, in the <streaming_mode> tag:

http://docs.splunk.com/Documentation/Splunk/7.2.0/AdvancedDev/ModInputsScripts
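
To make the streaming mode setting concrete, here is a minimal standard-library sketch of the kind of introspection scheme a script is expected to print when Splunk calls it with --scheme. It is an approximation only: the function name build_scheme_xml is hypothetical, and the endpoint/argument section that a real scheme would carry is left out.

```python
import xml.etree.ElementTree as ET


def build_scheme_xml(title, streaming_mode="xml"):
    """Build a minimal introspection scheme as an XML string."""
    scheme = ET.Element("scheme")
    ET.SubElement(scheme, "title").text = title
    ET.SubElement(scheme, "description").text = title
    ET.SubElement(scheme, "use_external_validation").text = "false"
    ET.SubElement(scheme, "use_single_instance").text = "true"
    # "xml" asks Splunk to unwrap <stream>/<event> and index only <data>;
    # "simple" means the script's stdout is treated as raw text.
    ET.SubElement(scheme, "streaming_mode").text = streaming_mode
    return ET.tostring(scheme, encoding="unicode")


print(build_scheme_xml("testModularInput"))
```

If the scheme Splunk receives says "simple" (or no scheme arrives at all), XML written by the script would be expected to show up verbatim in the indexed event.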


AndersNierhoff
New Member

Hi, and thanks for your reply. This was one of the parameters I have been testing, but with no luck.
I tried to simplify my test case.

import sys
import os
import six

import json
import time
import datetime

#import splunklib.client as client
# import splunklib
from splunklib.modularinput import Script, Scheme, Argument, EventWriter, Event
FUNCTION_NAME = "testModularInput"


class testModularInput(Script):
    """ Class for importing jira data
    """
    def get_scheme(self):
        scheme = Scheme(self.__class__.__name__)
        scheme.description = self.__class__.__name__
        scheme.use_external_validation = False
        scheme.use_single_instance = True
        scheme.streaming_mode = Scheme.streaming_mode_xml
        username_arg = Argument(
            name="username",
            title="User name",
            data_type=Argument.data_type_string,
            required_on_create=True,
            required_on_edit=True
        )
        scheme.add_argument(username_arg)
        return scheme

    def validate_input(self, validation_definition):
        return None

    def stream_events(self, inputs, ew):
        for input_name, input_item in inputs.inputs.items():
            event = Event()
            event.stanza = input_name
            event.time = datetime.datetime.now().isoformat()+"+0200"
            event.data = "test input of date"
            event.index = "main"
            ew.write_event(event)
        return None

if __name__ == "__main__":
    instance = testModularInput()
    if len(sys.argv) > 1:
        if sys.argv[1] == "--scheme":
            instance.get_scheme()
        elif sys.argv[1] == "--validate-arguments":
            instance.validate_input()
        else:
            instance.usage()
            pass
    else:
        instance.run(sys.argv)
    sys.exit(0)

Result in Splunk (I have added dashes to ensure that the XML is not hidden):

<stream>
  <event stanza="testModularInput://testDebug" unbroken="1">
    <time>2018-10-23T08:30:03.869000+0200</time>
    <index>main</index>
    <data>test input of date</data>
    <done />
  </event>
</stream>
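
One thing that may be worth checking in the test case above: the __main__ block calls instance.get_scheme() but discards the return value, so nothing is printed for --scheme. Splunk may then fall back to its default streaming mode (simple, I believe), which would index the XML as raw text. As far as I know, the SDK's Script.run(sys.argv) already handles --scheme and --validate-arguments itself. The stand-in below uses only the standard library to illustrate the expected dispatch; dispatch and SCHEME_XML are made-up names, and the payloads are abbreviated.

```python
import io

# Abbreviated placeholder; a real scheme also declares the input arguments.
SCHEME_XML = (
    "<scheme><title>testModularInput</title>"
    "<streaming_mode>xml</streaming_mode></scheme>"
)


def dispatch(argv, out):
    """Mimic how a modular input script reacts to Splunk's invocations."""
    if len(argv) > 1 and argv[1] == "--scheme":
        # The scheme has to reach stdout; building it is not enough.
        out.write(SCHEME_XML)
        return 0
    if len(argv) > 1 and argv[1] == "--validate-arguments":
        # Nothing to validate in this sketch.
        return 0
    # No flag: stream events.
    out.write("<stream><event><data>test input of date</data></event></stream>")
    return 0


buffer = io.StringIO()
dispatch(["testModularInput.py", "--scheme"], buffer)
print(buffer.getvalue())
```

With the SDK, the equivalent entry point would presumably be just sys.exit(testModularInput().run(sys.argv)), without the manual branching.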

AndersNierhoff
New Member

Just for info, I never got this to work. I found a workaround: using print(event.data) instead.


AndersNierhoff
New Member

Hi,
A few more details regarding the conf files:
inputs.conf
[myModularInput://firsttest]
sourcetype = myModularInput:json

props.conf
[myModularInput:json]
TRUNCATE = 0
INDEXED_EXTRACTIONS = json
KV_MODE=none

No transforms.

This current setup fails because the data cannot be parsed as JSON (it is an XML document).


AndersNierhoff
New Member

To clarify my question: how do I get rid of the XML part of the event? It is fine that the values in the XML event are used to set the time, index, sourcetype, source, and so on, but when I search in Splunk I want the data available as JSON.

script output:

2018-10-22T14:39:24.915000+0200
{'time': '2018-10-22T14:39:24.915000+0200', 'test': 'Issue'}

indexed event:

2018-10-22T14:39:24.915000+0200
{'time': '2018-10-22T14:39:24.915000+0200', 'test': 'Issue'}

expected event:
{'time': '2018-10-22T14:39:24.915000+0200', 'test': 'Issue'}
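
A possible contributing factor, judging from the output above: the payload uses single quotes, which is what Python prints for str() of a dict, and that is not valid JSON. A JSON parser (such as the one behind INDEXED_EXTRACTIONS = json) would likely reject it. A small standard-library sketch of the difference, using the values from this thread; is_valid_json is a hypothetical helper:

```python
import json


def is_valid_json(text):
    """Return True if `text` parses as JSON."""
    try:
        json.loads(text)
    except ValueError:
        return False
    return True


payload = {"time": "2018-10-22T14:39:24.915000+0200", "test": "Issue"}

as_repr = str(payload)         # Python repr: single-quoted, not JSON
as_json = json.dumps(payload)  # JSON text: double-quoted

print(as_repr, is_valid_json(as_repr))
print(as_json, is_valid_json(as_json))
```

Assigning json.dumps(payload) to event.data, rather than the dict itself or str(payload), should keep the data element parseable as JSON.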


AndersNierhoff
New Member

Hi, it looks like the text viewer is interpreting the XML data.

To clarify my question: how do I get rid of the XML part of the event? It is fine that the values in the XML event are used to set the time, index, sourcetype, source, and so on, but when I search in Splunk I want the data available as JSON.

script output:
stream
event stanza="myModularInput://issuedebug" unbroken="1"
time>2018-10-22T14:55:12.821000+0200OrderedDict([('time', '2018-10-22T14:55:12.821000+0200'), ('test', 'Issue')])2018-10-22T14:55:12.821000+0200OrderedDict([('time', '2018-10-22T14:55:12.821000+0200'), ('test', 'Issue')])
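
The OrderedDict([...]) text in the output above suggests that the Python object itself, rather than a JSON string, ended up in the data element. The following sketch approximates the XML event envelope with the standard library only (it does not use splunklib, and build_event_xml is a hypothetical helper), to show the shape one would expect when the payload is serialized with json.dumps first:

```python
import json
import xml.etree.ElementTree as ET
from collections import OrderedDict


def build_event_xml(stanza, time_text, payload):
    """Wrap a JSON-serialized payload in a <stream><event> envelope."""
    stream = ET.Element("stream")
    event = ET.SubElement(stream, "event", {"stanza": stanza, "unbroken": "1"})
    ET.SubElement(event, "time").text = time_text
    # Serialize to JSON text; str(payload) would give "OrderedDict([...])".
    ET.SubElement(event, "data").text = json.dumps(payload)
    ET.SubElement(event, "done")
    return ET.tostring(stream, encoding="unicode")


payload = OrderedDict(
    [("time", "2018-10-22T14:55:12.821000+0200"), ("test", "Issue")]
)
xml_text = build_event_xml(
    "myModularInput://issuedebug", "2018-10-22T14:55:12.821000+0200", payload
)
print(xml_text)

# In XML streaming mode, only the <data> text is expected to become the event.
data_text = ET.fromstring(xml_text).find("event/data").text
print(json.loads(data_text)["test"])
```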
