All Apps and Add-ons

Splunk DB Connect: How to create a batch input that only loads latest data in the index, removing data from previous runs?

madhav_dholakia
Contributor

Hello There,

I have recently started using the DB Connect app and want to create a batch input that runs every 120 minutes.

I have set this up and it is now loading data into the index every 120 minutes. Is there a way to keep only the latest data in the index and remove all the old data added by previous runs of this batch?

In other words, I want to overwrite the data in the index after every batch run rather than append to it.

Any help would be greatly appreciated.

Thank you.

Madhav


richgalloway
SplunkTrust

That's not how Splunk works. Once data is indexed, it stays indexed until it expires.

Can you change the input mode to select only new data?

---
If this reply helps you, Karma would be appreciated.

madhav_dholakia
Contributor

Thank you for the reply. Actually, I do not have a date/time column that I could use for a rising column input. The data I want from the DB is a flag column whose values change from Yes to No and No to Yes, so I thought of using a batch input. But if I create an index with a retention policy of 3 or 5 days, that should help, I believe.
Thank you for your inputs.
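
For reference, a minimal sketch of what such a retention policy could look like in indexes.conf (the index name and paths are placeholders; frozenTimePeriodInSecs controls how long events are kept, and 432000 seconds is 5 days):

[db_flag_batch]
homePath   = $SPLUNK_DB/db_flag_batch/db
coldPath   = $SPLUNK_DB/db_flag_batch/colddb
thawedPath = $SPLUNK_DB/db_flag_batch/thaweddb
# with no frozen archive configured, events are deleted after 5 days
frozenTimePeriodInSecs = 432000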


richgalloway
SplunkTrust

Yes, a short retention period will help. You can also use dedup and other SPL commands to eliminate duplicate data from your search results.
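
For example, a search along these lines (index, sourcetype, and field names are placeholders) keeps only the most recent event per record, since dedup retains the first event it sees and events are returned newest first by default:

index=db_flag_batch sourcetype=my_batch_input
| dedup record_id
| table _time record_id flag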

---
If this reply helps you, Karma would be appreciated.

madhav_dholakia
Contributor

Thanks, I have created a separate index with a 5-day retention period. And yes, I can use dedup, but I just wanted to keep the index as clean as possible by removing redundant data. Thank you for your help on this.
If you can convert your comment to an answer, I can accept it. Thank you.


tmuth_splunk
Splunk Employee

If you're interested, I wrote a dedup example for this exact scenario that lets you delete the duplicates:
https://github.com/tmuth/splunking-oracle/blob/master/Duplicates/delete-duplicates.spl
Keep in mind that Splunk doesn't really delete the duplicates, it just marks them as deleted, so you wouldn't save any space on disk. The short retention period suggested by Rich solves that. This technique would, however, let you search without using the dedup command.
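
As a simplified sketch of that delete approach (this is not the exact content of the linked file; the index, sourcetype, and 120-minute window are assumptions, and delete can only be run by a role with the delete_by_keyword capability), you could mark events older than the current 120-minute batch window as deleted so later searches no longer return them:

index=db_flag_batch sourcetype=my_batch_input latest=-120m
| delete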
