The best way to deal with duplicate records is to prevent them from occurring in the first place. Duplicate events in Splunk consume license quota and storage, so even though there are ways to ignore duplicates at search time, they still carry a cost. Adjust your log collection process to avoid duplicate data as much as possible.
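As a search-time workaround only (it does not recover the license or storage cost already paid at ingest), you can suppress duplicates in results with dedup. A minimal sketch, assuming a hypothetical index and sourcetype:

index=main sourcetype=myapp:log | dedup _raw

Because dedup _raw compares the full raw event text, this only hides exact duplicates; two copies of an event that differ even in a timestamp will both still appear.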
Hello Rich,
Thank you very much for the advice. Is there a way I could make this log-collection adjustment on the Universal Forwarder itself? I was wondering if I could make it ignore the duplicates before they are sent to Splunk Cloud.
Thank you.
The UF has no way of knowing what is a duplicate and what is not, especially if the duplication occurs across instances of an input file.
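No setting makes the UF compare event contents, but the file-tracking settings in inputs.conf do control whether it re-reads files it has already seen, which is a common source of duplicates. A sketch under assumptions: initCrcLength and crcSalt are real inputs.conf settings, while the monitored path and the value shown are illustrative only:

[monitor:///var/log/myapp]
# initCrcLength: number of bytes the UF hashes to recognize a file it has
# already read (default 256). Raising it helps when many files share an
# identical header, so they are not mistaken for one another.
initCrcLength = 1024
# Avoid crcSalt = <SOURCE> on logs that get rotated or renamed: it mixes
# the file path into the checksum, so a renamed copy of an already-indexed
# file looks new to the UF and gets indexed again.

Beyond tuning file tracking like this, deduplication has to happen upstream in the logging pipeline, or be tolerated and filtered at search time as described above.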