We have two standalone syslog servers (both active; one is named primary and the other contingency), each with a Universal Forwarder installed that forwards data to Splunk. Different indexes are configured for the two servers.
Now the issue is that the same log is being indexed on both servers, which results in duplicate logs in Splunk.
Syslog 1 --- index = sony_a == Same log
Syslog 2 --- index = sony_b == Same log
When we search with index=sony* we get the same logs from both indexes, i.e. duplicates.
How can we prevent the two syslog servers from indexing the same log twice?
@gcusello can deploying a load balancer in front of the syslog servers help us get rid of the same log being ingested by both syslog servers?
Hi @splunklearner ,
no, a load balancer guarantees that you don't lose any logs even if one receiver is down, which is the first requirement for HA, but it doesn't offer any feature for deduplicating logs.
The only solution is the one I described.
Ciao.
Giuseppe
It sounds like your duplication is happening before the data reaches Splunk. It's not easy to deduplicate it on the way through; instead, you might want to look at how the data is sent to syslog.
Is the data being sent from the origin to both syslog servers at the same time? Is it possible to control this behaviour so it sends only to the primary, or to the standby if it fails?
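Conceptually, the fix is for the origin to deliver each event exactly once: to the primary while it is healthy, and to the contingency server only if the primary fails. As a rough, hypothetical Python sketch of that logic (hostnames and ports are made up, and the real mechanism depends on what your sending device supports):

```python
import socket

# Assumed hostnames/ports for the two receivers (hypothetical).
PRIMARY = ("syslog-primary.example.com", 514)
CONTINGENCY = ("syslog-contingency.example.com", 514)

def send_once(message: bytes) -> bool:
    """Deliver one syslog line to exactly one server: primary first,
    contingency only as a fallback. Never send to both."""
    for host in (PRIMARY, CONTINGENCY):
        try:
            with socket.create_connection(host, timeout=2) as sock:
                sock.sendall(message + b"\n")
            return True   # delivered once, stop here
        except OSError:
            continue      # receiver unreachable, try the next one
    return False          # both receivers down, caller must queue/retry
```

In practice most senders expose this as a primary/backup or failover destination setting rather than custom code.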
Please let me know how you get on, and consider accepting this answer or adding karma to it if it has helped.
Regards
Will
Is the data being sent from the origin to both syslog servers at the same time? -- Yes, both syslog servers are picking up the same log and ingesting it at the same time.
Is it possible to control this behaviour so it sends only to the primary, or to the standby if it fails? --- How do we achieve this?
Hi, I think ultimately this might depend on the source of the data. What are you sending to the syslog server?
F5 WAF logs
Hi @livehybrid ,
using only Splunk, the only way is to index all the logs and use dedup in your searches (e.g. something like index=sony* | dedup _raw), but this way you pay the license twice, because it isn't possible in Splunk to filter out duplicates before indexing.
The only solution is to receive the logs with rsyslog and write them to files, then pre-parse the logs with a script, but it's very heavy on the system.
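Just to illustrate the idea, here is a rough sketch of such a pre-parse step (all paths and names are hypothetical; it assumes both copies of the stream land in one raw file the script can read, and that the Universal Forwarder monitors only the deduplicated output file instead of the raw one):

```python
#!/usr/bin/env python3
"""Rough sketch of a pre-parse deduplication step (hypothetical paths)."""
import hashlib
import time

RAW_FILE = "/var/log/remote/f5_waf.log"        # assumed rsyslog output file
DEDUP_OUT = "/var/log/remote/f5_waf_dedup.log"  # file the UF should monitor
WINDOW_SECS = 300                               # forget hashes after 5 minutes

seen = {}  # sha256 of the message -> time first seen

def is_duplicate(line: str) -> bool:
    digest = hashlib.sha256(line.encode("utf-8", "replace")).hexdigest()
    now = time.time()
    # drop expired hashes so memory stays bounded
    for h in [h for h, t in seen.items() if now - t > WINDOW_SECS]:
        del seen[h]
    if digest in seen:
        return True
    seen[digest] = now
    return False

def main():
    # follow the raw file like `tail -f` and emit only first occurrences
    # (log rotation is not handled in this sketch)
    with open(RAW_FILE, "r", errors="replace") as src, \
         open(DEDUP_OUT, "a") as dst:
        src.seek(0, 2)  # start at the end of the raw file
        while True:
            line = src.readline()
            if not line:
                time.sleep(0.2)
                continue
            if not is_duplicate(line.rstrip("\n")):
                dst.write(line)
                dst.flush()

if __name__ == "__main__":
    main()
```

Note that the two servers may stamp the same event with different syslog headers (timestamp, relay hostname), so in practice you would hash only the original message payload; this constant reading, hashing and rewriting of every line is exactly why the approach is heavy on the system.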
Ciao.
Giuseppe
The only solution is to receive the logs with rsyslog and write them to files, then pre-parse the logs with a script, but it's very heavy on the system. --> Can you please describe this in more detail, including the script I would need to use?