Solved: Avoiding double ingestion

Jtorge · ‎02-18-2026

I have a RHEL admin who is building two syslog servers to ingest data from one RHEL node redundantly. These two syslog servers will write their data to two Network File System (NFS) mounts. To meet security requirements, we need to forward the data from both of those NFS mounts to our indexers. I am worried about indexing duplicate data and using twice as much license as needed for that data. Is there a way to index the data from one NFS mount at a time and, if it crashes or fails, have a forwarder ready to cut over automatically and continue sending data to be indexed? Please advise.

PickleRick · ‎02-18-2026

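Splunk on its own does not have any index-time deduplication functionality (the dedup search command only works at search time, after the data has already been indexed and counted against the license). And two different sources are... well, just two independent sources, so there is no built-in kind of input which would treat them as one.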

You could try implementing your own modular/scripted input which would keep track of the state of each of the NFS-mounted files, but that would mean reimplementing the monitor input with extra steps.
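To give a rough idea of what such a scripted input involves, here is a minimal Python sketch. It is only an illustration under several assumptions, not a tested Splunk add-on: the names SeenLines, read_complete_lines and poll are hypothetical, state is kept in memory only (a real input would have to persist offsets and hashes across restarts), and log rotation or truncation is not handled. It reads both copies on every poll and emits a line only the first time its content is seen, so if one mount disappears the other copy keeps feeding the output.

```python
import hashlib
import sys
from collections import OrderedDict


class SeenLines:
    """Bounded memory of recently emitted lines, keyed by content hash."""

    def __init__(self, max_keys=100_000):
        self.max_keys = max_keys
        self._keys = OrderedDict()

    def is_new(self, line):
        key = hashlib.sha256(line.encode("utf-8")).hexdigest()
        if key in self._keys:
            return False
        self._keys[key] = None
        if len(self._keys) > self.max_keys:
            self._keys.popitem(last=False)  # forget the oldest key
        return True


def read_complete_lines(path, offset):
    """Return (lines, new_offset), consuming only newline-terminated lines."""
    with open(path, "rb") as fh:
        fh.seek(offset)
        data = fh.read()
    end = data.rfind(b"\n") + 1  # 0 when no complete line is available yet
    text = data[:end].decode("utf-8", errors="replace")
    return text.splitlines(), offset + end


def poll(paths, offsets, seen, out=sys.stdout):
    """Read every reachable copy once and write unseen lines to out."""
    emitted = 0
    for path in paths:
        try:
            lines, offsets[path] = read_complete_lines(path, offsets.get(path, 0))
        except OSError:
            continue  # mount or file unavailable; rely on the other copy
        for line in lines:
            if seen.is_new(line):
                out.write(line + "\n")
                emitted += 1
    return emitted
```

A scripted input would call poll() in a loop with the two NFS paths and let Splunk index whatever is written to stdout. Note the trade-off: because lines are compared by content, two genuinely separate events with byte-identical text (same timestamp, host and message) would be collapsed into one, and the bounded window means very old duplicates can slip through.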

I'm also not sure what you mean by "two syslog servers to ingest data from RHEL node".

What is it you're trying to achieve here that cannot be achieved with, for example, useACK and a sufficiently big persistent queue?
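For reference, a sketch of what those two settings look like. This assumes a forwarder that receives the syslog traffic directly on a network input (as far as I know, persistent queues are available for network and scripted inputs but not for monitor inputs); the hostnames, port, group name and sizes are placeholders, not recommendations.

```
# outputs.conf on the forwarder: wait for indexer acknowledgement
[tcpout]
defaultGroup = primary_indexers

[tcpout:primary_indexers]
server = idx1.example.com:9997, idx2.example.com:9997
useACK = true

# inputs.conf on the forwarder: buffer to disk when the output is blocked
[udp://514]
sourcetype = syslog
queueSize = 1MB
persistentQueueSize = 10GB
```

With useACK the forwarder resends data that the indexer has not acknowledged, and the persistent queue lets the input keep accepting events to disk while the indexers are unreachable.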

Jtorge · ‎02-23-2026

Rick, as of right now, I don't have any more information about the mysterious two syslog servers. An admin reached out with the scenario, and I couldn't think of any way to prevent duplication in a situation like this. My mind went straight to a custom scripted input as you described, but I wanted to know if there was a simpler solution. Once the admin sets up what they've described, and I have a more realistic view of the project, I can update this with better specifics. Thank you for the advice.

richgalloway · ‎02-18-2026

If the goal is redundancy, then the typical solution is to put a load balancer in front of two syslog servers and let each syslog server forward data as it is received.

Jtorge · ‎02-23-2026

Thank you, I'll certainly consider doing that. I'll have to get a little more familiar with load balancers.

PickleRick · ‎02-19-2026

Unfortunately, that introduces another single point of failure (SPOF) in the form of said LB.

syslog is a very simple solution (I won't use the word "protocol" because "syslog" can mean many things) and was never meant to be very robust. To paraphrase a well-known saying: the "R" in "syslog" stands for reliability.
