Splunk newbie and my first post on this platform!
I have a Splunk indexer that receives data from a Splunk forwarder. Because of some performance issues, we have decided to do away with the historical data.
My data looks like this:
User A, Action:Purchase, DateTime:0708201917:52
User A, Action:Purchase, DateTime:0708201918:00
Of the two events above, I would like to store only the latest one (0708201918:00) in the indexer.
That is, I would like to delete the older event (0708201917:52) before storing the newest one.
My initial idea is to write a Bash script to trigger the delete action, but I'm not sure whether that would work in Splunk.
OK, so if you are only looking for the latest events, you might just need a time-based stats function like latest() to filter out the old events you don't want in the report:
<your search here> | stats latest(DateTime) AS DateTime by User, Action
Sorry, but why do you want to do this? What's the final goal?
Usually companies want to keep a historical record of transactions like the ones you described.
Are you trying to make a report and you only want the latest data?
Adding to @richgalloway's answer, you can use the delete command to remove events at search time.
Here is the documentation for that:
The delete command does not reclaim disk space. Removing data is irreversible. If you want to get your data back after the data is deleted, you must re-index the applicable data sources. The delete command can be accessed only by a user with the "delete_by_keyword" capability.
Events can certainly be deleted by piping a search to the | delete command, but the user needs delete permissions, which are not given by default to any account, not even admin.
This doesn't remove the event from the bucket though, just marks it as unsearchable.
I'm actually not looking to delete using the | delete command. Rather, I want to do this at the index level, before the data is indexed.
I'm not sure if there is a way to do this or if there is a clever workaround.
The delete command does not actually delete events; it merely hides them. Replacing events is not a good idea, IMO. At a minimum, doing so means you lose transaction history.
Replacing events prior to indexing requires a pre-processor that caches events until the replacement arrives. Splunk does not do that.
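If you really do want such a pre-processor, it would have to live outside Splunk, for example as a script that runs over a batch of raw events before the forwarder picks them up. Here is a rough sketch in Python, not a Splunk feature. The function names (parse_event, keep_latest) are made up for illustration, and the timestamp layout ddmmyyyyHH:MM is only a guess from the sample data (it could just as well be mmddyyyy).

```python
from datetime import datetime

# Assumed layout of "0708201917:52": day, month, year, hour:minute.
# This is a guess based on the sample events; adjust if the data is mmddyyyy.
TS_FORMAT = "%d%m%Y%H:%M"


def parse_event(line):
    """Split 'User A, Action:Purchase, DateTime:0708201917:52' into its parts."""
    user, action_part, time_part = [p.strip() for p in line.split(",")]
    action = action_part.split(":", 1)[1]
    stamp = time_part.split(":", 1)[1]
    return user, action, datetime.strptime(stamp, TS_FORMAT)


def keep_latest(lines):
    """Cache a batch of events and keep only the newest one per (user, action)."""
    latest = {}
    for line in lines:
        user, action, ts = parse_event(line)
        key = (user, action)
        # Replace the cached event when a newer (or equally new) one arrives.
        if key not in latest or ts >= latest[key][0]:
            latest[key] = (ts, line)
    return [line for _, line in latest.values()]


if __name__ == "__main__":
    batch = [
        "User A, Action:Purchase, DateTime:0708201917:52",
        "User A, Action:Purchase, DateTime:0708201918:00",
    ]
    for event in keep_latest(batch):
        print(event)
```

The catch is the same one mentioned above: the script has to hold events back until it is sure no newer one is coming, so you need to pick a batch window, and anything already indexed from an earlier batch stays in the index.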