I have a source configured as ///application.log in my inputs.conf. On the servers, application.log is rolled when it reaches 10 MB: the file is renamed to something like application.12-13-2014.log, and a new application.log is created after the roll. Because of this rollover we were missing some events in Splunk, so we changed the source to application* (a wildcard) in inputs.conf. Now all the logs are getting indexed and events show up in search, but we are getting duplicate events with the same timestamp: one copy appears with source application.log and the other with source application.12-12-2014.log. Can anyone help with this issue? Thanks in advance!
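For reference, the wildcard input described above would look something like the following in inputs.conf (the path, sourcetype, and index names here are hypothetical; substitute your own). One common cause of this kind of duplication, for whatever it's worth: if the stanza sets crcSalt = &lt;SOURCE&gt;, a rolled file is treated as brand new under its renamed path and gets re-indexed.

[monitor:///var/log/myapp/application*]
sourcetype = prod_application
index = application

With the default CRC-based file tracking (no crcSalt), Splunk normally recognizes a renamed rolled file as already seen and does not re-index it.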
Is there any way to write props or transforms for this source to drop the duplicate events? I am not sure about the whitelist you are referring to; can you give an example?
No, but you can schedule a search like this to run hourly, which removes the duplicates from search results:
index=YourIndexHere sourcetype=YourSourcetypeHere earliest=-5h latest=now
| streamstats count AS deleteme BY _raw
| search deleteme>1
| delete
My index name is Application and the sourcetype is prod_application. Can you write the search for just 1 hour? One more thing: are the duplicates deleted at the search level or the indexer level?
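Plugging those names into the search above, a 1-hour version might look like this (a sketch; adjust the time range to match how often you schedule it):

index=Application sourcetype=prod_application earliest=-1h latest=now
| streamstats count AS deleteme BY _raw
| search deleteme>1
| delete

On the second question: | delete requires the can_delete role, and it operates at the index level by marking the matched events unsearchable; it does not free disk space or physically remove the raw data from the index.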