Is there a commonly accepted, most efficient method of deleting logs? Occasionally I'll have a use case for deleting logs dating back to a year ago from some pretty noisy indexes, which leaves me stuck running deletion queries for hours on end.
Those queries also have a tendency to time out, forcing me to restart the search entirely (I'm already running the commands as an admin user). This happens with searches in the general format seen below:
<query>
| delete
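For a concrete (purely illustrative) example, one of these runs might look like this, with the index, sourcetype, and search string as placeholders:
index=noisy_index sourcetype=app_logs "string to remove" earliest=-2y latest=-1y
| delete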
Is there a more efficient method to delete events or is this the best we have currently?
The delete command doesn't actually delete anything from the indexes. Oversimplifying a bit, it only marks data as unsearchable. The data is still there, so for compliance purposes, for example, it's usually not acceptable. It can also incur a performance hit, since after searching the data Splunk has to decide whether or not to return each result to you. In other words, you're searching across much more data than you should, only to discard a (possibly significant) part of the results.
The proper way of handling your data is to onboard it properly and not index unwanted data in the first place. Yes, I know it requires work beforehand, but that's how it should be done. Essentially, whatever is indexed stays in the index until it's rolled out to frozen.
@Joey3848 If you're removing "everything older than X", set time-based retention on the index and let Splunk expire those buckets automatically.
frozenTimePeriodInSecs = <nonnegative integer>
* The number of seconds after which indexed data rolls to frozen.
* If you do not specify a 'coldToFrozenScript', data is deleted when rolled to frozen.
* NOTE: Every event in a bucket must be older than 'frozenTimePeriodInSecs' seconds before the bucket rolls to frozen.
* The highest legal value is 4294967295.
* Default: 188697600 (6 years)
[noisy_index]
# keep ~365 days
frozenTimePeriodInSecs = 31536000
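After deploying the change to indexes.conf, you can confirm the effective value on an indexer with btool (the path assumes a default install; the index name is the example one above):
$SPLUNK_HOME/bin/splunk btool indexes list noisy_index --debug | grep frozenTimePeriodInSecs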
NOTE: The delete command does not reclaim disk space.
Rather "deleting" logs, (which only marks the events as unsearchable), you might be better off setting the retention time on your indexes so that the data is "naturally" expired.
Unfortunately it's not all logs older than X date; it's more the situation of data being logged that shouldn't have been, due to company policy.
For example, a specific log message might be found to log sensitive data, so we write our search to find exactly those events and remove them with the delete command. It's understood that the delete command doesn't actually fully delete the events from Splunk, but for our use case it's sufficient to mark them as unsearchable.
As already said, those "deleted" events are still on disk, and if you have access to the server CLI you could actually make them readable again. Basically this means that your delete solution is not 100% governance compliant.
The only way to really get rid of those events is to delete all data from those indexes and then reindex the other events.
The normal way in data onboarding is to build everything first in a test/dev environment and ensure there is no PII or other unwanted events. After that, deploy those configurations into production.
But if you still want to use delete, my suggestion is to use small enough time spans and run the search several times instead of running it all in one go.
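For example, run something like this one day at a time, shifting the window on each pass (index and search string are illustrative):
index=noisy_index "sensitive_pattern" earliest=-365d@d latest=-364d@d
| delete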
@Joey3848 You can send unwanted incoming events to nullQueue to discard them during data routing and filtering.
https://docs.splunk.com/Documentation/Splunk/latest/Forwarding/Routeandfilterdatad
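A minimal sketch of that approach, assuming a sourcetype of app_logs and a placeholder regex (adjust both to your data); this goes on the parsing tier, i.e. indexers or heavy forwarders:
props.conf:
[app_logs]
TRANSFORMS-drop_sensitive = drop_sensitive

transforms.conf:
[drop_sensitive]
REGEX = sensitive_pattern
DEST_KEY = queue
FORMAT = nullQueue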
That's good to know; however, I'm more interested in a retroactive delete rather than a proactive one. We currently have pipelines to remove the sensitive logging upstream of Splunk, so I'm really just trying to make our response process more efficient.
As I mentioned, the delete command can take quite a long time to run given our large infrastructure, so I'm hoping there's some alternative I haven't found yet to delete specific events more efficiently.
Hi @Joey3848
As others have mentioned, the delete command only stops the data from being returned by searches; the physical data does still exist. I'm not sure whether that resolves your compliance issue or not. It sounds like aging out the data by setting frozenTimePeriodInSecs also won't help you, as you want to keep the other data.
The only other thing I can think of is to collect the "good" data into a new index, although this could take a lot of time and resources if the data is large, and it comes with its own heap of caveats, risks, and issues!
How much data are we talking here?
Hey @livehybrid
Unfortunately sensitive data can occasionally be found in multiple indexes so moving around entire indexes isn't great. I've toyed around with the collect command moving the sensitive data into a designated low retention index, but from my tests it just creates a copy of the event rather than moving it over completely (unless I was using it incorrectly?).
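For reference, the pattern I tried was roughly the following (index names are illustrative, and the quarantine index has to exist already):
index=prod_index "sensitive_pattern" | collect index=quarantine_short_retention
followed by a second search for the same events with | delete to hide the originals.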
The data needing to be deleted / hidden ranges pretty widely, from a handful of events to millions, depending on how quickly the sensitive log messages are caught. Efficiency is more of an issue with large numbers of events over a long period of time due to the size of the indexes (which can be multiple terabytes depending on the day, another issue in and of itself).
With how large the indexes are, I fully recognize that I'm likely pushing Splunk to its limits and long search times are inevitable.
Unfortunately sensitive data can occasionally be found in multiple indexes
To be fully honest, that means the data onboarding wasn't done properly. You can try to fix that now, or suffer through workarounds now and still have to do it properly later anyway. And, more importantly, it might mean that you don't manage your data properly before ingesting it into Splunk. Depending on your business and compliance requirements, this might bite you in the posterior severely.
You are correct in that the collect command duplicates events. There is no way to move an event from one index to another.
The recommended (but painful) way to delete events is:
1. Use the collect command to copy the events you want to keep into a new index.
2. Clean the original index, which removes all events from it.
3. Use collect to copy the kept events from the new index back into the original index.
4. Clean the new index.
You can avoid steps 3 and 4 if you can change your KOs to use the new index name in place of the old one. This should be easy if you have a macro for the index name.
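A sketch of steps 1 and 2, with illustrative index names and search strings (the target index must already exist, and note that collect writes events with sourcetype=stash by default, which matters for sourcetype-based KOs):
index=noisy_index NOT "sensitive_pattern" | collect index=tmp_good
Then, on the indexer, clean the original index; splunkd must be stopped first, and this permanently destroys the index contents, so verify the copy before running it:
$SPLUNK_HOME/bin/splunk stop
$SPLUNK_HOME/bin/splunk clean eventdata -index noisy_index
$SPLUNK_HOME/bin/splunk start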