Solved: Re: Is it possible to dedup data during indexing?

tylr · ‎02-19-2011

I'm feeding Splunk a large quantity of historical gzipped syslog files from many different machines through a single TCP listener input. These archived files almost certainly contain overlapping data. Furthermore, new data may arrive that overlaps with the old data. I can filter my search results to hide the duplicated data, but is it possible to strip duplicate lines at index time?

Stephen_Sorkin · ‎02-19-2011

No, that is not possible.
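
Since the duplicates cannot be dropped at index time, one workaround is to remove them before the data ever reaches the TCP listener. The following is only a minimal sketch of that idea, not a Splunk feature: the helper names `unique_lines` and `dedup_gzip_files` are hypothetical, and it assumes that two byte-identical lines really are duplicates, which may not hold for syslog with one-second timestamps.

```python
import gzip
import hashlib
from typing import Iterable, Iterator, Set


def unique_lines(lines: Iterable[bytes], seen: Set[bytes]) -> Iterator[bytes]:
    """Yield each line whose exact content has not been seen before.

    Hypothetical helper: `seen` holds SHA-1 digests of earlier lines, so
    memory grows with the number of distinct lines, not with their length.
    """
    for line in lines:
        digest = hashlib.sha1(line).digest()
        if digest not in seen:
            seen.add(digest)
            yield line


def dedup_gzip_files(paths: Iterable[str], seen: Set[bytes]) -> Iterator[bytes]:
    """Stream unique lines from several gzipped files.

    All files share one `seen` set, so overlap between archives is removed too.
    """
    for path in paths:
        with gzip.open(path, "rb") as handle:
            yield from unique_lines(handle, seen)
```

Each line yielded by `dedup_gzip_files` could then be written to the socket that feeds the listener. Keeping the `seen` set around (or persisting it) between runs would also catch new data that overlaps with what was already sent, at the cost of memory proportional to the number of distinct lines.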


ncsantucci · ‎05-23-2014

A similar scenario arises when logrotate compresses and rotates logs; see http://answers.splunk.com/answers/121267/how-does-splunk-handle-nix-logrotate-based-log-rotation
