Getting Data In

Scripted Inputs: how do I persist state between invocations of a script?

maratc
Engager

I have a script that pulls events from my REST API for Splunk to index. My script runs on schedule.

I want to only pull new events, to prevent duplication and unnecessary traffic. My events have incrementing IDs.

To pull only new events, my script needs to remember the ID of the last event it pulled, i.e. it needs to persist state between runs. If the Splunk instance restarts, I also don't want to pull all the events again from the beginning.

What are my options here? I would prefer not to read the last ID by issuing a query to Splunk.

Thanks!


jkat54
SplunkTrust

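You can also run a search with curl or another HTTP tool, such as sourcetype=mysource | stats first(ID) by sourcetype, and then use the result of that search in your script, so that Splunk itself is your "database" / "config file". Since your IDs are incrementing, stats max(ID) is a safer choice if events can be indexed out of order.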
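A minimal sketch of that search call in Python rather than curl. It assumes the management port is reachable at a base URL such as https://localhost:8089, that an authentication token is available, and that the search export endpoint streams one JSON object per line with a "result" key. The function names, the last_id alias, and the use of max(ID) are illustrative choices, so check them against your Splunk version before relying on them.

```python
import json
import urllib.parse
import urllib.request


def build_last_id_request(base_url, sourcetype, token):
    """Build (but do not send) a POST to the search export endpoint.

    base_url, sourcetype and token are supplied by the caller; the
    Bearer-token header is an assumption about how you authenticate.
    """
    query = f"search sourcetype={sourcetype} | stats max(ID) as last_id"
    body = urllib.parse.urlencode(
        {"search": query, "output_mode": "json"}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/services/search/jobs/export",
        data=body,
        headers={"Authorization": f"Bearer {token}"},
        method="POST",
    )


def parse_last_id(response_text, default=0):
    """Return last_id from the streamed JSON lines, or default if absent."""
    for line in response_text.splitlines():
        line = line.strip()
        if not line:
            continue
        row = json.loads(line).get("result", {})
        if "last_id" in row:
            return int(row["last_id"])
    return default
```

Sending the request is then a matter of passing it to urllib.request.urlopen (with whatever TLS settings your instance needs) and handing the decoded body to parse_last_id.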

If you go with a file on the filesystem, as suggested in the other answer, I recommend something like appName/bin/.deltaFile

The . in front of the filename makes it hidden on Linux systems. Here's some unoptimized code for creating/reading/updating a delta file that contains the date of the last execution:

import logging
import os
from datetime import date

logger = logging.getLogger(__name__)

def getDeltaDate(datapath):
    """Return the date stored by the previous run, then record today's date."""
    deltapath = os.path.join(datapath, '.delta_date')
    lastrundate = "1970-01-01"  # default when no previous run is recorded
    try:
        if os.path.exists(deltapath):
            # read the date written by the previous run
            with open(deltapath, 'r') as deltafile:
                lastrundate = deltafile.readline().strip() or lastrundate
        # write today's date, creating the file or overwriting the old value
        with open(deltapath, 'w') as deltafile:
            deltafile.write(str(date.today()))
        return lastrundate
    except OSError as e:
        logger.critical('Function: getDeltaDate failed due to the following error(s): ' + str(e))
        print('Function: getDeltaDate failed due to the following error(s): ' + str(e))
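
For the incrementing-ID case in the question, the same delta-file idea could be sketched as below. This is only an illustration under stated assumptions: the function names, the checkpoint path, and the fetch_events_after callable (standing in for the REST API call) are hypothetical, and events are assumed to be dicts with an integer "id" key. The checkpoint is written to a temporary file and renamed into place so that an interrupted run does not leave a half-written file.

```python
import os
import tempfile


def read_last_id(path, default=0):
    """Return the ID stored in the checkpoint file, or default if unusable."""
    try:
        with open(path, "r") as f:
            return int(f.read().strip())
    except (FileNotFoundError, ValueError):
        return default


def write_last_id(path, last_id):
    """Write the ID to a temp file in the same directory, then rename it."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".last_id_")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(str(last_id))
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, path)  # replaces the old checkpoint in one step
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def run_once(path, fetch_events_after):
    """Pull events newer than the checkpoint, emit them, advance the checkpoint.

    fetch_events_after is a hypothetical callable: given the last seen ID,
    it returns a list of dicts that each carry an integer "id".
    """
    last_id = read_last_id(path)
    for event in fetch_events_after(last_id):
        print(event)  # a scripted input's stdout is what Splunk indexes
        last_id = max(last_id, event["id"])
    write_last_id(path, last_id)
    return last_id
```

Updating the checkpoint only after the events have been printed means a crash mid-run re-pulls some events rather than skipping them; whether that trade-off suits you depends on how costly duplicates are.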

javiergn
Super Champion

Wouldn't a local file where you store that sort of information work for you?

maratc
Engager

I guess it would; is my script permitted to write to files, and is it reasonable to expect that the files will still be there after a restart?
