I am looking to resend data to Splunk in the most efficient way. I want to resend data into Splunk with a REST API that will replace old data if it has been updated. I don't want to resend all of the data, only anything that has changed.
The goal is to not give Splunk more data than it needs.
I am searching the data in by-minute time ranges, so even over the course of 5-10 minutes, resending all of it would be a lot of data if most of the events are repeats.
I'm very new to all of this so I was looking for some guidance on where to start or helpful links to get started.
You can only do that if you store the data in a lookup file in Splunk. If you do this, you would update it like this:
Some search to pull in new data here (could be dbxquery or something else) | some SPL to transform the data and ensure that a distinct key field such as "host" exists | inputlookup append=true YourLookupFIleHere.csv | dedup host | outputlookup YourLookupFIleHere.csv
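To make the effect of that pipeline easier to see, here is a minimal Python sketch of the same merge logic. It is only an illustration of the semantics, not something Splunk runs: the function name `merge_lookup` and the sample rows are made up, and `host` is simply the key field used in the search above. Because the new rows come first and the existing lookup rows are appended after them, keeping the first row per key means the new value wins.

```python
def merge_lookup(new_rows, existing_rows, key="host"):
    """Mimic: <new data> | inputlookup append=true <lookup> | dedup <key>.

    New rows are listed first and existing lookup rows are appended,
    so keeping the first row seen per key lets new data replace old data.
    """
    merged = {}
    for row in list(new_rows) + list(existing_rows):
        # setdefault keeps the first row seen for each key, like dedup
        merged.setdefault(row[key], row)
    return list(merged.values())


# Hypothetical sample data: the existing lookup and one updated row
existing = [
    {"host": "web01", "status": "down"},
    {"host": "db01", "status": "up"},
]
new = [{"host": "web01", "status": "up"}]

result = merge_lookup(new, existing)
print(result)
```

The row for `web01` is replaced by the new one, while `db01` is carried over unchanged from the existing lookup.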
This may sound like a strange question, but are you sure that you need Splunk?
Splunk works differently from a database: events that have already been indexed are not updated in place.
Anyway, the first question is: why do you want to take this approach? Is it to save storage, or something else?
If you still want to do this, you could create a summary index and populate it every day with all of the correct data you want (https://docs.splunk.com/Documentation/Splunk/8.0.0/Knowledge/Usesummaryindexing).
I guess I want to filter what data is being sent to Splunk. For example, I send all of the data for a 10-minute time span to Splunk. After I have sent it, a few minutes' worth of that data is replaced with new, updated values. I only want to resend the new or updated data for the few minutes that have changed.
I want to filter what data is being sent to Splunk because I will waste a lot of GB of data if I resend all of the data from a time span, just to update a few events in Splunk search.
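One possible way to do that filtering on the sending side, before anything reaches Splunk, is to remember a fingerprint of each event that was already sent and only forward events whose fingerprint has changed. The sketch below is a minimal illustration under assumptions: the `id` field, the function names, and the in-memory `sent_cache` are all hypothetical, and the actual send to Splunk is left out.

```python
import hashlib
import json


def fingerprint(event):
    """Return a stable hash of an event's content."""
    # sort_keys makes the hash independent of field order
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def select_changed(events, sent, key="id"):
    """Return only events that are new or differ from what was sent before.

    `sent` maps each event key to the fingerprint of the last version sent
    and is updated in place.
    """
    changed = []
    for event in events:
        digest = fingerprint(event)
        if sent.get(event[key]) != digest:
            changed.append(event)
            sent[event[key]] = digest
    return changed


# Hypothetical example: two pulls of the same time span
sent_cache = {}
first_pull = [{"id": "a", "value": 1}, {"id": "b", "value": 2}]
second_pull = [{"id": "a", "value": 1}, {"id": "b", "value": 3}]

first_batch = select_changed(first_pull, sent_cache)    # both events are new
second_batch = select_changed(second_pull, sent_cache)  # only "b" changed
print(second_batch)
```

Only `second_batch` would need to be sent on the second pull. Note that this reduces what is sent but does not replace anything already indexed: Splunk would then hold both the old and the new version of event `b`, so a search would still need something like `dedup` on the key field to show only the most recent version.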