Splunk Search

Why are clustered indexers duplicating non-load-balanced data?

Contributor

When I run a simple search, "index=syslog update sourcetype=fgtevent devname=xxxxx", it returns every event twice; the only difference between the two copies is the splunk_server field. The device is sending syslog data to only one of the indexers. I am using the standard UDP:514 data input to receive this data.

Splunk setup
2 server indexing cluster
2 non-clustered search heads.

Question 1 - Is this affecting my license quota? Syslog data is my largest source.
Question 2 - How do I clean this up as it is affecting reporting?

Thank you in advance for any help provided.


Re: Why are clustered indexers duplicating non-load-balanced data?

Motivator

Can you post the two duplicate events?


Re: Why are clustered indexers duplicating non-load-balanced data?

Contributor

Every search returns multiple events, so which ones would you like me to post?

When the search completes and you look at the Events tab, all fields are identical except splunk_server, which holds the name of one of the two indexers.

I tried to verify that the data is being indexed only once by using the following search:

sourcetype=fgtevent | eval dupfield=_raw | transaction dupfield maxspan=1s keepevicted=true | where mvcount(sourcetype) > 1

It returned no duplicated _raw values.
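
The same kind of check can be sketched outside Splunk. This is only an illustration, assuming the search results have been exported as a list of dictionaries carrying the `_time`, `_raw`, and `splunk_server` fields; the function name, the sample events, and the indexer names `idx1`/`idx2` are all hypothetical. It groups events by timestamp and raw text and reports any event that appears under more than one indexer.

```python
from collections import defaultdict


def find_cross_indexer_duplicates(events):
    """Map (_time, _raw) to the sorted indexer names for events seen on more than one indexer."""
    servers_by_event = defaultdict(set)
    for event in events:
        # Two copies of the same event share a timestamp and raw text.
        key = (event["_time"], event["_raw"])
        servers_by_event[key].add(event["splunk_server"])
    return {
        key: sorted(servers)
        for key, servers in servers_by_event.items()
        if len(servers) > 1
    }


# Hypothetical exported events; real exports would come from the search above.
sample_events = [
    {"_time": "2016-08-01T10:00:00", "_raw": "devname=fw01 action=update", "splunk_server": "idx1"},
    {"_time": "2016-08-01T10:00:00", "_raw": "devname=fw01 action=update", "splunk_server": "idx2"},
    {"_time": "2016-08-01T10:00:05", "_raw": "devname=fw01 action=login", "splunk_server": "idx1"},
]

duplicates = find_cross_indexer_duplicates(sample_events)
for (timestamp, raw), servers in duplicates.items():
    print(timestamp, raw, servers)
```

If an event shows up under both indexers here, that suggests it was received and indexed on each of them, which would be worth checking against the sending device's syslog configuration.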


Re: Why are clustered indexers duplicating non-load-balanced data?

Contributor

It turns out that the firewall, which should have been sending data to only one indexer, was actually configured to send data to both indexers in the cluster. Splunk was performing as expected.


Re: Why are clustered indexers duplicating non-load-balanced data?

Splunk Employee

Can you double check the values for source and sourcetype? If they are truly duplicated events, I would think that your search head is not configured correctly.

http://docs.splunk.com/Documentation/Splunk/6.4.2/Indexer/Aboutclusters

More specifically - http://docs.splunk.com/Documentation/Splunk/6.4.2/Indexer/Configurethesearchhead
