How to avoid duplication of events for multiple modular inputs?

umairahmad3985
Path Finder

Dear All,

I have created a Python modular input (multiple-instance type) using Splunk's Add-on Builder. It polls a REST API and pulls JSON data for indexing into Splunk. The API takes start and end timestamps that bound the period for which data is required. To avoid duplication, I keep the last_polled time as a checkpoint in my modular input, so that on the next execution the script knows where to start fetching data. This works well when the user creates only one input from the modular input. But if the user creates a second input to ingest the data into a separate index, the script fetches the last_polled time written by the first input, because checkpoints are shared within a modular input. If the two inputs' intervals are not the same, the second one will miss some data.

Is there any technique to isolate checkpoints for each input so that they are not shared between them? Ideally, I would want them to be isolated according to the index and sourcetype defined by the user.

I hope I have explained my requirement clearly; let me know if you need more information. I would be very happy to receive some direction on this, as the documentation has little information.

Regards,
Umair


davidoff96
Path Finder

Extremely late to the party, but I was having this same issue, so I figured I'd answer here in case anyone else was wondering. My idea was to get the input name and append it to the checkpoint key. How to get the input name is, of course, not in the Add-on Builder (AOB) documentation, from what I've seen. Here's what I ended up doing.

If you call "helper.get_input_stanza()", it returns a dictionary containing a single nested dictionary. The key for that nested dictionary is your input's name, so you can get it with "list(helper.get_input_stanza().keys())[0]". The dictionary looks like this:

davidoff96_0-1666790454284.png

HOWEVER! This does not look the same when the code is run inside the AOB. You should still be able to use "list(helper.get_input_stanza().keys())[0]" in the code, but the output will look different.

davidoff96_1-1666790548351.png

So be forewarned if you want to access name/disabled in your code. Hopefully this helps; I will be using this in my custom add-ons from now on as well. I now use this function:

def get_input_name(helper):
    # get_input_stanza() returns {input_name: {...settings...}}; take its only key.
    return list(helper.get_input_stanza().keys())[0]


Another thing to note is that if you change your input's name, it will lose that checkpoint value, so there may be duplication there.
