After importing several TB of logs, we discovered that events were duplicated across 8 days' worth of our dataset, at a volume of around 2,000,000 events per hour. I'm trying to figure out the fastest way to delete this data so that we can import it again properly. One of the forwarders involved during this timeframe was improperly configured, so the events have to be deleted, not simply deduped at search time.
The events are spread across 4 indexers (distributed environment), are in 2 indexes, and come from about 60 different sources. I opened up SSH on each indexer and am trying to run this query:
./splunk search "index=index_name1 source=/opt/splunk/var/spool/splunk/x* earliest=08/18/2010:20:16:18 latest=08/19/2010:00:00:00 | delete"
This takes forever! It's been running for 1.5 hours now (on each box) to delete just 4 hours of data. Is there any faster way to do this or can I optimize my query for better performance?
(I understand that | delete only masks events and that they won't actually be removed until we clean the indexes; that's fine. The events just have to be hidden from search results for now and, as I mentioned earlier, can't simply be deduped at search time.)
Technically, the | delete command doesn't remove events or free space on your system; it only hides them from search. It can take a long time when there are many events to mask. In my own experience it sometimes stalls even when deleting as few as 100 events, due to parallel concurrent operations; in that case, restart Splunk and run the | delete command again.
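Once a delete finishes, a quick way to confirm the events are actually masked is to re-run the same search terms with a count (the index, source, and time range below are taken from the question):

    ./splunk search "index=index_name1 source=/opt/splunk/var/spool/splunk/x* earliest=08/18/2010:20:16:18 latest=08/19/2010:00:00:00 | stats count"

A count of 0 means the masking took effect. Also note that | delete requires a role with the delete_by_keyword capability (the built-in can_delete role); no user, not even admin, has it out of the box.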
If you want to actually delete the events from disk and reclaim the space, then the clean command from the command line interface is the best option. It's simple and wipes an entire index in seconds, but it can't delete individual events or a single sourcetype.
To clean event data:
Go to the Splunk installation's bin directory, then run:

    splunk stop
    splunk clean eventdata -index yourindexname -f
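As a concrete sketch for the Linux layout from the question (/opt/splunk, index index_name1; adjust paths and names to your environment), the full sequence on each indexer would be:

    cd /opt/splunk/bin
    ./splunk stop
    ./splunk clean eventdata -index index_name1 -f
    ./splunk start

Keep in mind this wipes everything in that index on that indexer, not just the bad 8 days.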
What Genti says. If you need to delete everything in an index over just some span of time, the fastest way is probably a combination of deleting entire buckets, plus | delete. First identify the buckets involved using the | dbinspect command, and find those that are wholly contained within the time range in question. Stop Splunk and delete those bucket folders at the OS level (or move them somewhere out of the way). Then clean up the remaining data (events that share a bucket with data that should not be deleted) using the | delete command.
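A sketch of the dbinspect step, using the time range from the question (startEpoch, endEpoch, and path are fields dbinspect emits for each bucket):

    | dbinspect index=index_name1
    | eval rangeStart=strptime("08/18/2010:20:16:18", "%m/%d/%Y:%H:%M:%S"),
           rangeEnd=strptime("08/19/2010:00:00:00", "%m/%d/%Y:%H:%M:%S")
    | where startEpoch >= rangeStart AND endEpoch <= rangeEnd
    | table path startEpoch endEpoch

Any bucket this lists is safe to remove whole (with Splunk stopped). Run it on each indexer, since the bucket paths are local to each one.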
If the indexes were created only for the purpose of indexing this data (and hence there is no other useful data in them), then I think a clean command on those indexes is what is needed.
If you actually need the rest of the data in the index (minus what you're trying to delete), then I think you are out of luck; these two are the only methods for deleting: clean or | delete.
As a best practice, if you are going to bring a whole lot of data into Splunk, make sure you create new indexes for it, so that if something goes bad you can always clean those indexes without having to worry about the good data elsewhere.
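For illustration, a minimal indexes.conf stanza on the indexers for such a throwaway import index might look like this (the index name bulk_import is made up for the example):

    # indexes.conf -- hypothetical staging index for a one-off bulk import
    [bulk_import]
    homePath   = $SPLUNK_DB/bulk_import/db
    coldPath   = $SPLUNK_DB/bulk_import/colddb
    thawedPath = $SPLUNK_DB/bulk_import/thaweddb

If the import goes bad, splunk clean eventdata -index bulk_import -f throws it all away without touching any other index.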
Can anyone shed light on why |delete is so slow and whether there is any way to increase performance?
Good question. I don't think there is a faster way than | delete, and your search probably can't be optimized any more than it is. The other option I know of works at the bucket level: a bucket can be dumped, unwanted events filtered out, and the bucket rebuilt... but that process would probably take significantly longer. And if you've already started a delete operation, I wouldn't suggest killing it. I have no idea what that would do, but it doesn't seem like a good idea.
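For reference, the dump/filter/rebuild route mentioned above goes through the exporttool utility, roughly like this (the bucket path is a placeholder, and the re-import step varies by version, so treat this as a sketch rather than a recipe):

    # dump one bucket's events to CSV (run with Splunk stopped)
    ./splunk cmd exporttool /opt/splunk/var/lib/splunk/defaultdb/db/db_<latest>_<earliest>_<id> /tmp/bucket.csv -csv
    # filter the unwanted events out of /tmp/bucket.csv, re-index the
    # remainder, and remove the original bucket directory

Multiply that by every bucket touching the bad time range on all 4 indexers and it's easy to see why | delete, slow as it is, usually wins.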