Hi, while waiting for a better solution, let met tell you that you can do it after indexing:
1- after identifying the duplicated event or file.
2-build a query that fetch what you want to remove and pipe it with delete.
3- you can scheduled that search to run periodically.
Now to export event you can use the command dump:
1- you build the query that map the event you want to export.
2- then you pipe like this .....l dump basefilename=MyExport
Note: see all the options for the dump command in the splunk search reference manual.
You can also do an outputcsv.
After that scheduled the search to run each 15min.