I want to set up two heavy forwarders that receive the same syslog data at the same time, so that if one fails, the other carries the load. Both will stream data into a single index, with the load spread across multiple indexer hosts.
What methods can be used to de-duplicate the identical data arriving from the two forwarders? Or can you suggest a better option?
You'd be better off solving the failover on the syslog side: have the syslog daemons write to a log file, and index that log file. Here's a starting point for rsyslog: http://wiki.rsyslog.com/index.php/FailoverSyslogServer
That way there is no duplication of data in the first place, so no need to deduplicate down the line.
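As a rough sketch of what that could look like on the sending hosts, here is an rsyslog failover rule in the legacy directive syntax. The hostnames are placeholders I made up, and you should check the directives against the rsyslog version you run. Note that failover only works over TCP (`@@`), since rsyslog cannot detect a dead receiver over UDP.

```
# Sender-side rsyslog sketch; both hostnames are hypothetical.
# Forward everything to the primary syslog server over TCP.
*.* @@syslog-primary.example.com:514
# Run the next action only while the previous one is suspended (unreachable).
$ActionExecOnlyWhenPreviousIsSuspended on
& @@syslog-secondary.example.com:514
# Reset so that later rules are not treated as failover targets.
$ActionExecOnlyWhenPreviousIsSuspended off
```

Each syslog server then writes what it receives to a file, and a forwarder on that host monitors the file. Assuming a path and index name of your choosing (the ones below are only examples), the inputs.conf stanza might be:

```
# inputs.conf sketch; path and index name are assumptions.
[monitor:///var/log/remote/syslog.log]
sourcetype = syslog
index = syslog_data
```

Since only one syslog server receives a given event at any time, each event lands in exactly one file and gets indexed once.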