Splunk Search

Deleting historical logs

Joey3848
Observer

Is there a commonly accepted, most efficient method of deleting logs? Occasionally I'll have a use case for deleting logs dating back a year from some pretty noisy indexes, which leaves me stuck running deletion queries for hours on end.

Those queries also have a tendency to time out, forcing me to restart the search entirely (I'm already running the commands as an admin user). The searches follow the general format below:

<query>
| delete


Is there a more efficient method to delete events or is this the best we have currently?

Labels (1)
0 Karma

PickleRick
SplunkTrust
SplunkTrust

The delete command doesn't actually delete anything from the indexes. Oversimplifying a bit, it only marks some data as unsearchable. But the data is still there, so for compliance purposes, for example, it's usually not acceptable. It might incur a performance hit as well, since Splunk has to decide whether to return each result to you after searching the data. So you're searching across much more data than you should, only to discard a (possibly significant) part of the results.

The proper way of handling your data is to onboard it properly and not index unwanted data. Yes, I know it requires work beforehand but that's how it should be done. Essentially, whatever is indexed, stays in the index until it's rolled out to frozen.

0 Karma

kiran_panchavat
Champion

@Joey3848 If you're removing "everything older than X", set time-based retention on the index and let Splunk expire those buckets automatically.

 

frozenTimePeriodInSecs = <nonnegative integer>
* The number of seconds after which indexed data rolls to frozen.
* If you do not specify a 'coldToFrozenScript', data is deleted when rolled to
  frozen.
* NOTE: Every event in a bucket must be older than 'frozenTimePeriodInSecs'
  seconds before the bucket rolls to frozen.
* The highest legal value is 4294967295.
* Default: 188697600 (6 years)

[noisy_index]
# keep ~365 days
frozenTimePeriodInSecs = 31536000

NOTE: The delete command does not reclaim disk space.

Did this help? If yes, please consider giving kudos, marking it as the solution, or commenting for clarification — your feedback keeps the community going!
0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

Rather than "deleting" logs (which only marks the events as unsearchable), you might be better off setting the retention time on your indexes so that the data is "naturally" expired.

0 Karma

Joey3848
Observer

Unfortunately it's not all logs older than X date; it's more a situation of data being logged that, per company policy, shouldn't have been.

For example, a specific log message might be found to contain sensitive data, so we write our search to find exactly those events and remove them with the delete command. It's understood that the delete command doesn't actually fully delete the events from Splunk, but for our use case it's sufficient to mark them as unsearchable.

0 Karma

isoutamo
SplunkTrust
SplunkTrust

As already said, those "deleted" events are still on disk, and if you have access to the server CLI you could actually make them readable again. Basically this means your delete solution is not 100% governance compliant.

The only way to truly get rid of them is to delete all data from those indexes and then reindex the other events.

The normal way to do data onboarding is to set everything up first in a test/dev environment and ensure there are no PII or other unwanted events. After that, deploy those configurations into production.

But if you still want to use delete, my suggestion is to use small enough time spans and run it several times instead of running it all in one search.
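One way to script that chunking is sketched below. This is not Splunk-specific tooling: the index name, the marker string, and the one-day window size are all placeholders to adapt, and the generated search strings would be fed to whatever mechanism you normally use to run searches (e.g. the REST API or the splunk CLI).

```python
from datetime import datetime, timedelta

def delete_windows(start, end, chunk=timedelta(days=1)):
    """Split [start, end) into fixed-size windows and emit one
    time-bounded delete search per window, so each run scans only a
    small slice of buckets. Index name and filter are placeholders."""
    searches = []
    cur = start
    while cur < end:
        nxt = min(cur + chunk, end)
        searches.append(
            'search index=noisy_index "SENSITIVE_MARKER" '
            f'earliest={int(cur.timestamp())} latest={int(nxt.timestamp())} '
            '| delete'
        )
        cur = nxt
    return searches

# A week's cleanup becomes 7 one-day delete searches instead of one big one
windows = delete_windows(datetime(2024, 1, 1), datetime(2024, 1, 8))
print(len(windows))  # 7
```

Smaller windows also mean that a timeout only costs you one chunk rather than the whole run.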

 

0 Karma

kiran_panchavat
Champion

@Joey3848  You can send unwanted incoming events to nullQueue to discard them during data routing and filtering. 

https://docs.splunk.com/Documentation/Splunk/latest/Forwarding/Routeandfilterdatad 
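For reference, the usual shape of that configuration on the indexer or heavy forwarder; the stanza name and the regex below are placeholders for your own sourcetype and pattern:

```
# props.conf
[your_sourcetype]
TRANSFORMS-drop_sensitive = drop_sensitive

# transforms.conf
[drop_sensitive]
REGEX = SENSITIVE_PATTERN
DEST_KEY = queue
FORMAT = nullQueue
```

Events matching the regex are discarded before indexing, so they never hit disk in the first place.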

Did this help? If yes, please consider giving kudos, marking it as the solution, or commenting for clarification — your feedback keeps the community going!
0 Karma

Joey3848
Observer

That's good to know; however, I'm more interested in retroactive deletion rather than proactive. We currently have pipelines to remove the sensitive logging upstream of Splunk, so I'm mostly trying to make our response process more efficient.

As I mentioned, the delete command can take quite a while to run given our large infrastructure, so I'm hoping there's some alternative I haven't found yet to delete specific events more efficiently.

0 Karma

livehybrid
SplunkTrust
SplunkTrust

Hi @Joey3848 

As others have mentioned, the delete command only stops the data being rendered by searches; the physical data does still exist. I'm not sure whether that resolves your compliance issue or not. It sounds like aging out the data by setting frozenTimePeriodInSecs will also not help you, as you want to keep the other data.

The only other thing I can think of is to collect the "good" data into a new index, although this could take quite a lot of time and resources if the data is large, and it comes with its own heap of caveats, risks and issues!
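A minimal sketch of that approach (index names and the pattern are placeholders; note that collect writes the copies with sourcetype=stash by default, and supplying your own sourcetype may cause the copied data to count against your license):

```
index=noisy_index earliest=-365d NOT "SENSITIVE_PATTERN"
| collect index=noisy_index_clean
```

Once the clean copy is verified, the old index can be aged out or deleted.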

How much data are we talking here?

🌟 Did this answer help you? If so, please consider:

  • Adding karma to show it was useful
  • Marking it as the solution if it resolved your issue
  • Commenting if you need any clarification

Your feedback encourages the volunteers in this community to continue contributing

Joey3848
Observer

Hey @livehybrid 

Unfortunately sensitive data can occasionally be found in multiple indexes, so moving around entire indexes isn't great. I've toyed around with the collect command, moving the sensitive data into a designated low-retention index, but from my tests it just creates a copy of the event rather than moving it over completely (unless I was using it incorrectly?).

The data needing to be deleted/hidden has a pretty wide range, from a handful of events to millions, depending on how quickly sensitive log messages are caught. Efficiency is more of an issue with large numbers of events over a long period of time due to the size of the indexes (which can be multiple terabytes depending on the day, another issue in and of itself).

With how large the indexes are, I fully recognize that I'm likely pushing Splunk to its limits and long search times are inevitable.

0 Karma

PickleRick
SplunkTrust
SplunkTrust

Unfortunately sensitive data can occasionally be found in multiple indexes

To be fully honest, that means data onboarding wasn't done properly. You can either fix that now, or suffer through workarounds now and still have to do it properly later. More importantly, it might mean you don't manage your data properly before ingesting it into Splunk, and this might (depending on your business and compliance requirements) bite you in the posterior severely.

0 Karma

isoutamo
SplunkTrust
SplunkTrust
Exactly that way. You must do data onboarding in a test/dev environment and ensure that your Splunk shows only what you want!

Normally this means you should first have your own dev node, etc., where you can create inputs, props and transforms. Then install those in some kind of system-test or integration-test environment where you can validate your configurations.

If you don't have these kinds of environments, then you have other issues in audit and governance cases beyond just PII data in production Splunk.
0 Karma

richgalloway
SplunkTrust
SplunkTrust

You are correct in that the collect command duplicates events.  There is no way to move an event from one index to another.

The recommended (but painful) way to delete events is:

  1. Use collect to copy the desired events to a temporary index
  2. Delete the old index or set frozenTimePeriodInSecs = 1 
  3. Use collect to copy the desired events from the temporary index back to the old one
  4. Delete the temporary index

You can avoid steps 3 and 4 if you can change your knowledge objects (KOs) to use the new index name in place of the old one.  This should be easy if you have a macro for the index name.
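The steps above could look roughly like this (index names and the pattern are placeholders, and the frozenTimePeriodInSecs = 1 setting must be reverted once the old buckets have frozen):

```
# 1. Copy the events to keep into a temporary index
index=old_index NOT "SENSITIVE_PATTERN" | collect index=temp_index

# 2. In indexes.conf, age out the old index, then wait for its buckets to freeze
#    [old_index]
#    frozenTimePeriodInSecs = 1

# 3. Copy the kept events back
index=temp_index | collect index=old_index

# 4. Delete or age out temp_index the same way
```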

---
If this reply helps you, Karma would be appreciated.
0 Karma