Hi,
I am using the Splunk API (with Python) to pull all values of a given record from a given index. I would like to download all data on a regular basis.
What strategy should I use to ensure I get all records without duplication and without missing anything? For instance, in a traditional DB world I might use a unique index value to keep track.
For example, I would like to run this every hour to get all records:
"search index=syslog ssh* | table _raw"
Should I use some 'time' condition or is there some sort of count or index I could use?
Thanks
You might be able to achieve what you want by triggering a real-time search.
Since you are using Python, I assume you know about or are already using the Python SDK, right?
I would probably use the stail.py script as a starting point:
http://dev.splunk.com/view/SP-CAAAEFK#stail
https://github.com/splunk/splunk-sdk-python/blob/master/examples/stail.py
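If a real-time search is more than you need, the 'time' condition you mention can also work for an hourly job: remember where the last run stopped and make each run search only the window from that point up to a few minutes before "now". Below is a minimal sketch of that bookkeeping, not a tested recipe. The checkpoint file name, the helper names and the 5-minute lag are all made up for illustration, and it assumes Splunk treats earliest_time as inclusive and latest_time as exclusive, so back-to-back windows do not overlap. Events that get indexed later than the lag allows would still be missed.

```python
import json
import os

# Hypothetical file used to remember where the previous run stopped.
CHECKPOINT_FILE = "splunk_checkpoint.json"


def load_checkpoint(path, default):
    """Return the epoch second where the last run ended, or `default`."""
    try:
        with open(path) as f:
            return int(json.load(f)["latest"])
    except (OSError, ValueError, KeyError, TypeError):
        return default


def save_checkpoint(path, latest):
    """Write the checkpoint via a temp file so a crash cannot truncate it."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"latest": int(latest)}, f)
    os.replace(tmp, path)


def next_window(checkpoint, now, lag=300):
    """Build the time arguments for the next run.

    The window starts at the previous checkpoint and ends `lag` seconds
    before `now`, to give slow events a chance to be indexed (the lag
    value is an assumption; tune it for your data). Returns None when
    there is nothing new to fetch yet.
    """
    latest = int(now) - lag
    if latest <= checkpoint:
        return None
    return {"earliest_time": checkpoint, "latest_time": latest}


# Intended use with the Python SDK (not executed here; needs a live
# Splunk connection made with splunklib.client.connect):
#
#   window = next_window(load_checkpoint(CHECKPOINT_FILE, 0), time.time())
#   if window is not None:
#       stream = service.jobs.export(
#           "search index=syslog ssh* | table _raw", **window)
#       ...read and store the results...
#       save_checkpoint(CHECKPOINT_FILE, window["latest_time"])
```

The checkpoint is saved only after the results are stored, so a failed run simply repeats the same window the next time instead of skipping it.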
Chris