How are identical files from multiple (clustered) systems handled?

afx — Mon, 15 Jul 2019 09:49:47 GMT

Hi,
I have an application that logs to a shared clustered file system.
What happens when I install the forwarder (via deployment server, with identical configuration) on each of the nodes to monitor the logs on this file system?
Do I get duplicates for each of the hosts, or can Splunk identify that they are dupes even though they come from different hosts?
Would crcSalt help here?
thx
afx

Re: How are identical files from multiple (clustered) systems handled?

richgalloway — Mon, 15 Jul 2019 12:39:02 GMT

The tracking of duplicate input files is done by the individual forwarders. Since each forwarder does not know what other forwarders have processed, you will get duplicates.
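
To make that concrete, here is a toy model in Python. It is only a sketch, not Splunk's actual implementation: the `Forwarder` class, its `seen` set, and the 256-byte checksum window are assumptions made for illustration. The point is that each node's "already read" record is private, so the same shared file is sent once per node.

```python
import zlib


class Forwarder:
    """Toy model of a forwarder with its own private 'already read' record."""

    def __init__(self, host):
        self.host = host
        self.seen = set()  # per-forwarder state, never shared with other nodes

    def monitor(self, path, content):
        # Identify the file by a CRC of its first bytes
        # (the 256-byte window is an assumption of this sketch).
        crc = zlib.crc32(content[:256])
        if crc in self.seen:
            return []  # this forwarder already read the file: nothing is sent
        self.seen.add(crc)
        return [(self.host, path, line) for line in content.decode().splitlines()]


shared_log = b"2019-07-15 09:00:01 app started\n2019-07-15 09:00:02 ready\n"
node_a, node_b = Forwarder("node-a"), Forwarder("node-b")

indexed = []
for node in (node_a, node_b):
    indexed += node.monitor("/shared/app.log", shared_log)

# A second pass on the same node is suppressed by that node's own record.
indexed += node_a.monitor("/shared/app.log", shared_log)

print(len(indexed))  # 4: each of the two lines arrives once per host
```

Nothing in node-b's record knows what node-a has already sent, which is why the duplicates appear.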

Re: How are identical files from multiple (clustered) systems handled?

afx — Mon, 15 Jul 2019 12:49:01 GMT

Drat...
Two ideas:
1: Forcing an identical hostname. Would that help the indexer identify incoming dupes?
2: Using a heavy forwarder in between to filter out dupes.
I really want to avoid #2, that would mean I either add additional burden to a box or need a new box.
thx
afx

Re: How are identical files from multiple (clustered) systems handled?

richgalloway — Mon, 15 Jul 2019 15:24:07 GMT

Indexers do not identify dupes. You can do that at search time, however.
An intermediate HF could probably do that, but it would be a bottleneck and would impair performance. Splunk advises against intermediate forwarders unless absolutely necessary.

What you really should do is avoid having more than one forwarder read a given file.
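
If you do end up with duplicates, the search-time idea is to treat events as duplicates when their source and raw text match, ignoring the host. Here is a minimal Python sketch of that logic on made-up events. The field names mirror Splunk's `host`, `source` and `_raw`, but `dedup_events` is a hypothetical helper; in Splunk itself the `dedup` search command plays this role.

```python
def dedup_events(events):
    """Keep the first event for each (source, _raw) pair, ignoring host."""
    seen = set()
    unique = []
    for event in events:
        key = (event["source"], event["_raw"])
        if key not in seen:
            seen.add(key)
            unique.append(event)
    return unique


# Made-up sample data: the same two log lines reported by two cluster nodes.
events = [
    {"host": "node-a", "source": "/shared/app.log", "_raw": "09:00:01 app started"},
    {"host": "node-b", "source": "/shared/app.log", "_raw": "09:00:01 app started"},
    {"host": "node-a", "source": "/shared/app.log", "_raw": "09:00:02 ready"},
    {"host": "node-b", "source": "/shared/app.log", "_raw": "09:00:02 ready"},
]

unique = dedup_events(events)
for event in unique:
    print(event["host"], event["_raw"])
```

Note that this also collapses log lines that are genuinely repeated with identical text, so it is a workaround rather than a fix.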

Re: How are identical files from multiple (clustered) systems handled?

afx — Mon, 15 Jul 2019 15:37:17 GMT

Yup, avoiding that would be best. I am currently trying to figure out whether the forwarder can be started / stopped together with the application, so there might be some minimal overlap, but overall only one of them is active.