Installation

What is "BatchAdding" status when upgrading?

Contributor

Hello All

While upgrading to version 6.6.2 (Indexer Cluster), I noticed a new status showing as "BatchAdding".

It doesn't seem to be impacting anything, and the Splunk upgrade itself was successful.

Any idea what this means?


1 Solution

Contributor

Okay, from my experience dealing with this: it is basically triggered when communication between the CM and the IDXs drops off or becomes unstable.

What actually happens is that the CM decides the IDX (the one showing BatchAdding) cannot perform indexing operations, so it starts to drop that IDX from the cluster.

When an IDX drops off the CM, its primary buckets get redistributed to the other IDXs on the site. By the time this process initiates/completes, the IDX has re-established healthy communication with the CM, and that is when BatchAdding triggers.

What I did after this scenario was:

  1. Raised the specs of the CM, as we were under-spec
  2. Increased the settings below in server.conf

On the Cluster Master (server.conf):

[clustering]
service_interval = 10
heartbeat_timeout = 1800
cxn_timeout = 300
send_timeout = 300
rcv_timeout = 300
max_peer_build_load = 5

(When deploying via cluster-bundle, no restart is required.)

On the Indexers (server.conf):

[clustering]
cxn_timeout = 600
send_timeout = 600
rcv_timeout = 600
heartbeat_period = 10

Doing this reduced how often the issue occurred, and it has since stopped completely.
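If you want to keep an eye on peer statuses between upgrades, the same information the CM dashboard shows can be read from the REST endpoint that appears later in this thread (/services/cluster/master/peers?output_mode=json). The sketch below is a minimal, hedged example: the host name and credentials are placeholders, and the entry/content/status shape is the standard Splunk JSON response layout, not something guaranteed by this thread.

```python
# Hedged sketch: flag peers stuck in "BatchAdding" by polling the cluster
# master's peers endpoint. Host and credentials below are placeholders.
import json
import urllib.request


def peers_by_status(peers_json: dict) -> dict:
    """Group peer names by their reported cluster status (e.g. Up, BatchAdding)."""
    groups: dict = {}
    for entry in peers_json.get("entry", []):
        status = entry.get("content", {}).get("status", "Unknown")
        groups.setdefault(status, []).append(entry.get("name"))
    return groups


def fetch_peers(cm_host: str, auth_header: str) -> dict:
    """Query the cluster master's peers endpoint (placeholder host/auth)."""
    url = f"https://{cm_host}:8089/services/cluster/master/peers?output_mode=json"
    req = urllib.request.Request(url, headers={"Authorization": auth_header})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


if __name__ == "__main__":
    peers = fetch_peers("cm.example.com", "Basic <base64-credentials>")
    for status, names in peers_by_status(peers).items():
        print(f"{status}: {len(names)} peer(s): {', '.join(names)}")
```

Running this periodically (e.g. from cron) makes it easy to see whether BatchAdding is a brief transition or a recurring loop.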



Splunk Employee

Thanks for this! Helped us out in our very large environment


SplunkTrust

Do you also have a very large indexer environment? How many indexers are working with the one cluster master?
I'm assuming it's also a lot of GB/day of data?


New Member

This fixed our issue. We had 10 indexers that started a loop of BatchAdding over and over.
Also, see slide 32 out of 34 here:
https://conf.splunk.com/files/2017/slides/scaling-indexer-clustering-5-million-unique-buckets-and-be...


Contributor

Yes, we have around 28 IDXs with 2.5 TB of data coming in every day @garethatiag

SplunkTrust

I believe it was mentioned in one of the 2017 .conf presentations, in particular https://conf.splunk.com/files/2017/slides/scaling-indexer-clustering-5-million-unique-buckets-and-be... however I cannot find the word "batch" in there.

BatchAdding is a state where the indexer is being added to the cluster. It exists to improve the performance of the cluster master (note: this is from memory, as I cannot find the details) and to prevent the cluster master from hanging while adding an indexing peer with many buckets.

The slide "Peer adding - configurable amount of buckets" and the next few slides talk about batch adding, but I cannot find documentation on this, which also surprised me...


Path Finder

I am seeing this as well:

Failed to register with cluster master reason: failed method=POST path=/services/cluster/master/peers/?output_mode=json master=:8089 rv=0 gotConnectionError=1 gotUnexpectedStatusCode=0 actual_response_code=502 expected_response_code=2xx status_line="Read Timeout" socket_error="Read Timeout" remote_error= [ event=addPeer status=retrying AddPeerRequest: { _id= active_bundle_id=3A3876F4FD9A2DB7BBEDB12F45BDF49A add_type=ReAdd-As-Is base_generation_id=289956 batch_serialno=1 batch_size=17 forwarderdata_rcv_port=9997 forwarderdata_use_ssl=0 last_complete_generation_id=0 latest_bundle_id=3A3876F4FD9A2DB7BBEDB12F45BDF49A mgmt_port=8089 name=15A5C813-2915-4902-92D2-65C7095A9027 register_forwarder_address=10.18.193.51 register_replication_address=10.18.196.51 register_search_address=10.18.193.51 replication_port=9887 replication_use_ssl=0 replications= server_name=snx2splidxa23 site=site2 splunk_version=7.0.3 splunkd_build_number=fa31da744b51 status=Up } ].
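The batch_serialno=1 and batch_size=17 fields in the log above suggest the cluster master re-adds peers in fixed-size batches rather than all at once, which matches the "BatchAdding" name. The toy function below only illustrates that chunking idea; it is not Splunk's actual implementation, and the field names are borrowed from the log for readability.

```python
# Illustrative only: chunk a peer list into fixed-size batches, numbering
# each batch from 1, the way batch_serialno/batch_size appear in the log.
from typing import Iterator, List, Tuple


def batches(peers: List[str], batch_size: int) -> Iterator[Tuple[int, List[str]]]:
    """Yield (serial_number, chunk) pairs; serial numbers start at 1."""
    for i in range(0, len(peers), batch_size):
        yield (i // batch_size + 1, peers[i:i + batch_size])


if __name__ == "__main__":
    idxs = [f"idx{n:03d}" for n in range(1, 121)]  # e.g. a 120-indexer cluster
    for serial, chunk in batches(idxs, 17):
        print(f"batch_serialno={serial} batch_size={len(chunk)}")
```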


Path Finder

We have around 120 indexers. Sometimes they all show BatchAdding, and sometimes they all show Up.
Because of this, search results are not coming back properly,
and on the search head I am seeing the above error.
Please help.


SplunkTrust

It almost sounds as if your cluster master is not keeping up with the load of handling that many indexers; however, that is a guess based on a single log entry.

However I'd suggest that you would be better served by either a new question or perhaps in this case a support ticket!


Path Finder

Hey, thanks for the help.
I have raised a Splunk case; let's see what they suggest.


Contributor

@splunk24

Please ensure that your CM is well specced. 120 indexers is a lot, and they need to be consistently managed by the CM. Make sure the CM is heavily specced and can sustain connections to/from the IDXs.


Ultra Champion

Nothing comes up on Google for BatchAdding - strange...


Path Finder

Did you get an answer? I am also seeing the same.
