I have recently started using the DB Connect app and I want to create a batch input with a frequency of every 120 minutes.
I have set this up and it is now loading data into the index every 120 minutes, but is there any way I can keep only the latest data in the index and remove all the old data added by previous runs of the batch?
In other words, I want to overwrite the data in the index after every batch run, not append to it.
Any help would be greatly appreciated.
That's not how Splunk works. Once data is indexed, it stays indexed until it expires.
Can you change the input mode (for example, to a rising column input) so that it selects only new data?
Thank you for the reply. Actually, I do not have a date/time column that I could use as a rising column. The data I want from the DB is a flag column whose values change from Yes to No and from No to Yes, so I thought of using a batch input. But I believe that creating an index with a retention policy of 3 or 5 days should help.
Thank you for your inputs.
Yes, a short retention period will help. You can also use dedup and other SPL commands to eliminate duplicate data from your search results.
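For example, a search along these lines would keep only the most recent event per key at search time (the index name db_batch and the field names record_id and flag are placeholders; substitute your own):

```
index=db_batch
| dedup record_id
| table _time record_id flag
```

Since search results come back newest-first, dedup keeps the latest event for each record_id and discards the older copies from the results (the data itself stays in the index).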
Thanks, I have created a separate index with a 5-day retention period. And yes, I can use dedup, but I just wanted to keep the index as clean as possible by removing redundant data. Thank you for your help on this.
If you can convert your comment to an answer, I can accept it. Thank you.
If you're interested, I wrote a search for this exact scenario that finds the duplicate events and pipes them to the delete command. Keep in mind that Splunk doesn't really delete the duplicates; it just marks them as deleted, so you won't save any space on disk. The short retention period suggested by Rich solves that. Once the duplicates are deleted, this technique lets you search without using the dedup command.
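A sketch of that kind of search, assuming a hypothetical index db_batch and a hypothetical key field record_id that identifies duplicate records (adjust both to your data, and run it without the final delete first to verify what it matches):

```
index=db_batch
| streamstats count AS copy_num BY record_id
| where copy_num > 1
| delete
```

Because results arrive newest-first, streamstats numbers the latest event for each record_id as 1 and older copies as 2, 3, and so on; the where clause keeps only the older copies, and delete marks them as deleted. Note that delete requires a role with the can_delete capability.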