Developing for Splunk Enterprise

How do I dynamically update the parameters of a Python modular input?

Path Finder

Hi all,

I've got a Python modular input script that needs to persist a variable between runs. I'm trying to use the update method of the input to change this variable from my stream_events function, but the documentation is unclear as to how I should do this. I know it's possible using the Java SDK as demonstrated by the Carvoyant modular input, but I don't see any examples in Python.

Here's what my code looks like now (doesn't throw any errors but also doesn't update the input).

import sys
from datetime import datetime

from splunklib.modularinput import Script, Scheme, Argument

class CloudflareInput(Script):
    def get_scheme(self):
        scheme = Scheme('Cloudflare Log Modular Input')
        scheme.description = 'Grab log events from a Cloudflare log server.'

        last_fetch_filename = Argument('last_log')
        last_fetch_filename.description = ("Name of the last file fetched "
                                           "from the server. Leave blank to "
                                           "fetch all available logs. This "
                                           "value will be updated "
                                           "automatically by the script as "
                                           "it runs so that we don't fetch "
                                           "duplicate logs.")
        last_fetch_filename.data_type = Argument.data_type_string
        last_fetch_filename.required_on_create = False
        last_fetch_filename.required_on_edit = False
        scheme.add_argument(last_fetch_filename)

        return scheme

    def stream_events(self, inputs, ew):
        for input_name, input_item in inputs.inputs.iteritems():
            input_item.update({'last_log': str(datetime.now())})

if __name__ == "__main__":
    sys.exit(CloudflareInput().run(sys.argv))

How can I make this update my input? Thanks for your help.

Splunk Employee

I recently wrote a modular input that needed to record its status between runs.
The way I tackled it was to work out the checkpoint directory, then create a file in that directory (in stream_events) to store the current status:

checkpoint_directory = str(self._input_definition.metadata['checkpoint_dir']) 
timestamp_progress_file = os.path.join(checkpoint_directory, self.stanza + '_last_timestamp.txt')

That file is then read and updated like any other file in Python.
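For illustration, here's a minimal sketch of that read/update pattern. The `_last_timestamp.txt` suffix follows the snippet above, but the function names and everything else are assumptions, not the actual input's code:

```python
import os

def read_checkpoint(checkpoint_dir, stanza):
    """Return the last recorded timestamp for this stanza, or None on the first run."""
    path = os.path.join(checkpoint_dir, stanza + '_last_timestamp.txt')
    if not os.path.isfile(path):
        return None
    with open(path) as f:
        # An empty file is treated the same as a missing one.
        return f.read().strip() or None

def write_checkpoint(checkpoint_dir, stanza, timestamp):
    """Persist the latest timestamp, overwriting any previous value."""
    path = os.path.join(checkpoint_dir, stanza + '_last_timestamp.txt')
    with open(path, 'w') as f:
        f.write(timestamp)
```

Inside stream_events you'd call read_checkpoint with the directory from self._input_definition.metadata['checkpoint_dir'] before fetching, and write_checkpoint after a successful fetch.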

Path Finder

OK, I've discovered you have to call update on one of the service's inputs, not on a member of the inputs list passed to the stream_events function. So this works:

import sys
from datetime import datetime

from splunklib.modularinput import Script, Scheme, Argument

class CloudflareInput(Script):
    def get_scheme(self):
        scheme = Scheme('Cloudflare Log Modular Input')
        scheme.description = 'Grab log events from a Cloudflare log server.'

        last_fetch_filename = Argument('last_log')
        last_fetch_filename.description = ("Name of the last file fetched "
                                           "from the server. Leave blank to "
                                           "fetch all available logs. This "
                                           "value will be updated "
                                           "automatically by the script as "
                                           "it runs so that we don't fetch "
                                           "duplicate logs.")
        last_fetch_filename.data_type = Argument.data_type_string
        last_fetch_filename.required_on_create = False
        last_fetch_filename.required_on_edit = False
        scheme.add_argument(last_fetch_filename)

        return scheme

    def stream_events(self, inputs, ew):
        serv_inputs = self.service.inputs
        for input_item in serv_inputs:
            if ("cloudflare://"+input_item.name) in inputs.inputs:
                input_item.update(last_log=str(datetime.now()))

if __name__ == "__main__":
    sys.exit(CloudflareInput().run(sys.argv))

However, calling "update" makes the modular input run again immediately, regardless of the time interval specified. Is there a way around this?

Path Finder

Were you able to successfully bring in CloudFlare logs?

Path Finder

Yes, we were. Are you facing a similar issue?

Path Finder

Yes. Right now I'm using api.cloudflare.com to pull logs for a time range (the last 5 minutes of data) and save them into a JSON file that Splunk indexes. This has become unreliable.

Is your code leveraging the Python CloudFlare API modules?

Path Finder

We don't use the API to get our logs. We retrieve them via SFTP (using Paramiko) from logs.cloudflare.com.
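For what it's worth, that fetch loop can be sketched roughly like below. The host, credentials, and helper names are placeholders rather than our actual code, and the premise that Cloudflare log filenames sort in time order is an assumption:

```python
import os

def new_log_files(filenames, last_log):
    """Return sorted filenames newer than the last one fetched.

    Assumes log filenames sort lexicographically in time order, so any
    name after last_log hasn't been seen yet. If last_log is None
    (first run), everything is new."""
    names = sorted(filenames)
    if last_log is None:
        return names
    return [n for n in names if n > last_log]

def fetch_new_logs(host, username, password, remote_dir, local_dir, last_log):
    """Download unseen log files over SFTP; return the newest filename fetched."""
    import paramiko  # imported lazily so new_log_files works without the dependency
    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    client.connect(host, username=username, password=password)
    sftp = client.open_sftp()
    try:
        names = new_log_files(sftp.listdir(remote_dir), last_log)
        for name in names:
            sftp.get(remote_dir + '/' + name, os.path.join(local_dir, name))
        # Hand the newest name back to be stored as the next run's checkpoint.
        return names[-1] if names else last_log
    finally:
        sftp.close()
        client.close()
```

The returned filename would be saved (e.g. in the checkpoint directory) and passed back in as last_log on the next run, so only files that haven't been seen get downloaded.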

I'm guessing this process depends on the service plan you have with CloudFlare, but I don't really know.
