Getting Data In

Why do duplicate events keep appearing in my indexer cluster for no apparent reason?

xsstest
Communicator

Hello, I have a strange question, and my description of it is a bit rough.
I have a single-site cluster that contains 5 indexers, 4 search heads, a deployer, a cluster master, some deployment servers, some heavy forwarders, and some universal forwarders. The deployment server also acts as a heavy forwarder.
The indexer cluster has a search factor of 2 and a replication factor of 3. The universal forwarders monitor log files and forward them to the HFs, which then forward them to the indexer cluster.
Strange things keep happening for no apparent reason. After the cluster has been running for a while, events of some sourcetypes are duplicated; sometimes each event is repeated 5 times. If I restart the heavy forwarders, the duplication disappears and the whole cluster returns to normal, but sometimes I also need to restart the universal forwarders for this to work.
Later, events of some sourcetypes are duplicated again, and I need to restart the HF or UF to return to a normal state.
I tried to find the reason in the indexers' splunkd.log, but I didn't find any clues.
I think index replication has a problem, but I couldn't find any error logs. Why does it return to normal when I restart the HF or UF?

0 Karma

richgalloway
SplunkTrust

In addition to @woodcock's great answer, you should avoid the intermediate HF if you don't need it for a specific purpose. UFs distribute events among indexers better than an HF. Also, the HF can actually make the indexers work harder to process events.

If you eliminate the HF, be sure to set useACK=true on the UFs.

---
If this reply helps you, Karma would be appreciated.
0 Karma

mdsnmss
SplunkTrust

I've seen something similar before, and for us it seemed to be due to a misbehaving indexer. When you search the events and see duplicates, what does the splunk_server field show? The splunk_server field shows you which indexer the search is pulling the event from. In our case, each set of duplicates had one server in common, while the other indexers were distributed across the duplicates. We identified the problem indexer and took it out of the cluster, and the issue was resolved. Since it is a single-site cluster, you shouldn't have an issue with search affinity.
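
The check described above can also be done offline on exported search results. The following is a minimal sketch, assuming the duplicated events have been exported as a list of records carrying the `_time`, `_raw`, and `splunk_server` fields; the function name `find_common_indexers` and the sample data are hypothetical and are not part of Splunk.

```python
from collections import defaultdict


def find_common_indexers(events):
    """Return the indexers that appear in every group of duplicated events.

    `events` is assumed to be an iterable of dicts with the keys
    "_time", "_raw", and "splunk_server" (a hypothetical export format).
    """
    # Group the reporting indexers by event identity (timestamp + raw text).
    groups = defaultdict(list)
    for event in events:
        key = (event.get("_time"), event["_raw"])
        groups[key].append(event["splunk_server"])

    # Keep only the events that were returned more than once.
    duplicate_groups = [set(servers) for servers in groups.values() if len(servers) > 1]
    if not duplicate_groups:
        return set()

    # An indexer present in every duplicate group is the prime suspect.
    return set.intersection(*duplicate_groups)


# Hypothetical sample: idx3 shows up in every duplicated event.
sample_events = [
    {"_time": "1", "_raw": "event A", "splunk_server": "idx1"},
    {"_time": "1", "_raw": "event A", "splunk_server": "idx3"},
    {"_time": "2", "_raw": "event B", "splunk_server": "idx2"},
    {"_time": "2", "_raw": "event B", "splunk_server": "idx3"},
    {"_time": "3", "_raw": "event C", "splunk_server": "idx4"},
]
print(find_common_indexers(sample_events))  # {'idx3'}
```

An empty result means no single indexer is common to all duplicates, which would point away from one misbehaving indexer and toward the forwarding path instead.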

0 Karma

felipesewaybric
Contributor

Can you post your outputs.conf and inputs.conf?

0 Karma

woodcock
Esteemed Legend

For UF->HF, set useACK to false; for HF/UF->IDX, set useACK to true. Also be sure to use EVENT_BREAKER everywhere.
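
As a rough sketch of what these settings could look like: the host names, the receiving port 9997, the output group names, and the sourcetype `my_sourcetype` below are all placeholders that do not come from this thread. In outputs.conf the setting is spelled `useACK`. To my understanding, `EVENT_BREAKER` in props.conf takes effect on universal forwarders, so that is where this sketch places it.

```ini
# outputs.conf on each universal forwarder (UF -> HF hop)
[tcpout]
defaultGroup = heavy_forwarders

[tcpout:heavy_forwarders]
server = hf1.example.com:9997, hf2.example.com:9997
useACK = false

# outputs.conf on each heavy forwarder (HF -> indexer hop)
[tcpout]
defaultGroup = indexers

[tcpout:indexers]
server = idx1.example.com:9997, idx2.example.com:9997
useACK = true

# props.conf on each universal forwarder, one stanza per sourcetype
[my_sourcetype]
EVENT_BREAKER_ENABLE = true
EVENT_BREAKER = ([\r\n]+)
```

With acknowledgment enabled, a forwarder that does not receive an acknowledgment in time resends the data, which is one way duplicates can arise on a slow or blocked hop. The `EVENT_BREAKER` regex shown assumes single-line events; multi-line sourcetypes would need a pattern that matches their actual event boundaries.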

hurricanelabs
Path Finder

What version of Splunk are you running?

What search heads are listed in the Cluster Master? (It should just be the search heads and the cluster master, not any of the other stuff)

What does your outputs.conf look like on the HF?

0 Karma

vidhyaArumalla
Path Finder

I am facing a similar issue on a cloud architecture, whereas our on-prem architecture has not shown this issue so far.
I wonder whether it has something to do with time zones.

0 Karma