Let's say we have forwarded events that are exactly the same and show up in Splunk as duplicates. Running | dedup _raw
would remove the duplicate events from the results at search time. Would it make sense to run index=main | dedup _raw | delete
so that we don't have to run a dedup every single time on that time range of events?
I wouldn't advise scheduling a delete. First, delete is expensive to run. Second, it's potentially dangerous, since you may wind up deleting something by accident. Third, it's better to fix the cause of the duplicate events instead.
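To track down the cause, it can help to see which events are duplicated and where they come from. Here is a minimal read-only sketch, assuming the events live in index=main as in the question and that the default host and source fields are enough to point at the forwarder or input responsible. The field names copies, hosts and sources are only illustrative.

```
index=main
| stats count AS copies, values(host) AS hosts, values(source) AS sources BY _raw
| where copies > 1
| sort - copies
```

This changes nothing in the index. It only lists each distinct raw event that appears more than once, together with the hosts and sources it arrived from. Restrict the time range before running it, since grouping by _raw over a large index can be slow.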
The reason I ask is that | delete
removes the events returned by the preceding search. I would assume it would "delete" the duplicate AND the original events. Does anyone know how it behaves in this scenario?
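One way to reason about this without deleting anything: dedup keeps the first event it encounters for each distinct _raw value and drops the rest, so the results of | dedup _raw are the retained copies, not the extras. Piping those results to delete would therefore target one copy of every distinct event, including events that were never duplicated. A hedged sketch for previewing only the extra copies, assuming the same index=main and using an illustrative field name copy_number:

```
index=main
| streamstats count AS copy_number BY _raw
| where copy_number > 1
```

Here streamstats numbers each occurrence of the same raw text in the order the search returns it, so the filter leaves only the second and later copies. This is a preview only. Whether appending delete to a search like this is accepted in a given Splunk version is an assumption to check against the delete documentation on a test index first.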