Developing for Splunk Enterprise

How do I dynamically update the parameters of a Python modular input?

Path Finder

Hi all,

I've got a Python modular input script that needs to persist a variable between runs. I'm trying to use the update method of the input to change this variable from my stream_events function, but the documentation is unclear as to how I should do this. I know it's possible using the Java SDK as demonstrated by the Carvoyant modular input, but I don't see any examples in Python.

Here's what my code looks like now (doesn't throw any errors but also doesn't update the input).

import sys
from datetime import datetime

from splunklib.modularinput import Script, Scheme, Argument

class CloudflareInput(Script):
    def get_scheme(self):
        scheme = Scheme('Cloudflare Log Modular Input')
        scheme.description = 'Grab log events from a Cloudflare log server.'

        last_fetch_filename = Argument('last_log')
        last_fetch_filename.description = ("Name of the last file fetched "
                                           "from the server. Leave blank to "
                                           "fetch all available logs. This "
                                           "value will be updated "
                                           "automatically by the script as "
                                           "it runs so that we don't fetch "
                                           "duplicate logs.")
        last_fetch_filename.data_type = Argument.data_type_string
        last_fetch_filename.required_on_create = False
        last_fetch_filename.required_on_edit = False
        scheme.add_argument(last_fetch_filename)

        return scheme

    def stream_events(self, inputs, ew):
        for input_name, input_item in inputs.inputs.iteritems():
            input_item.update({'last_log': str(datetime.now())})

if __name__ == "__main__":
    sys.exit(CloudflareInput().run(sys.argv))

How can I make this update my input? Thanks for your help.

Splunk Employee

I recently wrote a modular input that needed to record its status between runs.
The way I tackled it was to work out the checkpoint directory, then create a file in that directory (in stream_events) to store the current status:

checkpoint_directory = str(self._input_definition.metadata['checkpoint_dir']) 
timestamp_progress_file = os.path.join(checkpoint_directory, self.stanza + '_last_timestamp.txt')

That file is then read and updated like any other file in Python.
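For illustration, here's a minimal sketch of that read/update pattern. The `_last_timestamp.txt` suffix follows the snippet above, but the function names and everything else are assumptions, not the actual input's code:

```python
import os

def read_checkpoint(checkpoint_dir, stanza):
    """Return the last recorded timestamp for this stanza, or None on the first run."""
    path = os.path.join(checkpoint_dir, stanza + '_last_timestamp.txt')
    if not os.path.isfile(path):
        return None
    with open(path) as f:
        # An empty file is treated the same as a missing one.
        return f.read().strip() or None

def write_checkpoint(checkpoint_dir, stanza, timestamp):
    """Persist the latest timestamp, overwriting any previous value."""
    path = os.path.join(checkpoint_dir, stanza + '_last_timestamp.txt')
    with open(path, 'w') as f:
        f.write(timestamp)
```

Inside stream_events you'd call read_checkpoint with the directory from self._input_definition.metadata['checkpoint_dir'] before fetching, and write_checkpoint after a successful fetch.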

Path Finder

OK, I've discovered you have to call update on one of the service's inputs, not on a member of the inputs list passed to the stream_events function. So this works:

import sys
from datetime import datetime

from splunklib.modularinput import Script, Scheme, Argument

class CloudflareInput(Script):
    def get_scheme(self):
        scheme = Scheme('Cloudflare Log Modular Input')
        scheme.description = 'Grab log events from a Cloudflare log server.'

        last_fetch_filename = Argument('last_log')
        last_fetch_filename.description = ("Name of the last file fetched "
                                           "from the server. Leave blank to "
                                           "fetch all available logs. This "
                                           "value will be updated "
                                           "automatically by the script as "
                                           "it runs so that we don't fetch "
                                           "duplicate logs.")
        last_fetch_filename.data_type = Argument.data_type_string
        last_fetch_filename.required_on_create = False
        last_fetch_filename.required_on_edit = False
        scheme.add_argument(last_fetch_filename)

        return scheme

    def stream_events(self, inputs, ew):
        serv_inputs = self.service.inputs
        for input_item in serv_inputs:
            if ("cloudflare://"+input_item.name) in inputs.inputs:
                input_item.update(last_log=str(datetime.now()))

if __name__ == "__main__":
    sys.exit(CloudflareInput().run(sys.argv))

However, calling "update" makes the modular input run again immediately, regardless of the time interval specified. Is there a way around this?

Path Finder

Were you able to successfully bring in CloudFlare logs?

Path Finder

Yes, we were. Are you facing a similar issue?

Path Finder

Yes. Right now I'm using api.cloudflare.com to pull logs for a time range (the last 5 minutes of data) and save them into a JSON file that Splunk indexes. This has become unreliable.

Is your code leveraging the Python CloudFlare API modules?

Path Finder

We don't use the API to get our logs. We retrieve them via SFTP (using Paramiko) from logs.cloudflare.com.
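For what it's worth, that fetch loop can be sketched roughly like below. The host, credentials, and helper names are placeholders rather than our actual code, and the premise that Cloudflare log filenames sort in time order is an assumption:

```python
import os

def new_log_files(filenames, last_log):
    """Return sorted filenames newer than the last one fetched.

    Assumes log filenames sort lexicographically in time order, so any
    name after last_log hasn't been seen yet. If last_log is None
    (first run), everything is new."""
    names = sorted(filenames)
    if last_log is None:
        return names
    return [n for n in names if n > last_log]

def fetch_new_logs(host, username, password, remote_dir, local_dir, last_log):
    """Download unseen log files over SFTP; return the newest filename fetched."""
    import paramiko  # imported lazily so new_log_files works without the dependency
    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    client.connect(host, username=username, password=password)
    sftp = client.open_sftp()
    try:
        names = new_log_files(sftp.listdir(remote_dir), last_log)
        for name in names:
            sftp.get(remote_dir + '/' + name, os.path.join(local_dir, name))
        # Hand the newest name back to be stored as the next run's checkpoint.
        return names[-1] if names else last_log
    finally:
        sftp.close()
        client.close()
```

The returned filename would be saved (e.g. in the checkpoint directory) and passed back in as last_log on the next run, so only files that haven't been seen get downloaded.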

I'm guessing this process depends on the service plan you have with CloudFlare, but I don't really know.
