why is the cluster master not able to fixup buckets (generation tab) "cannot fix up search factor as bucket is not serviceable"

Problem:
My cluster master is reporting fixup tasks under Bucket Status > Generation tab with the status "cannot fix up search factor as bucket is not serviceable", but these buckets never get fixed.

Re: why is the cluster master not able to fixup buckets (generation tab) "cannot fix up search factor as bucket is not serviceable"

When the | delete command is issued in a search, the data isn't actually deleted from disk. Instead, Splunk creates a "deletes" directory inside the bucket and stops returning those events in searches.
For example, on an indexer:
$SPLUNK_HOME/var/lib/splunk/defaultdb/db/db_1546932848_1546891149_15_4720CDA9-5F9B-4CE1-BB0D-10A6F555A1E4/rawdata/deletes
[root@indexer01 deletes]# zcat 38602ccf63e998fa1823f9f664055448.csv.gz
timestamp,event_address,type_id,host_id,source_id,sourcetype_id
1546932848,1846,0,2,1,1
1546932848,1844,0,2,1,1
1546932848,1842,0,2,1,1
1546932848,1840,0,2,1,1
1546932848,1838,0,2,1,1
1546932848,1836,0,2,1,1
1546932848,1834,0,2,1,1
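
If you want to check how many entries a deletes file holds without reading the zcat output by eye, a small script can do it. This is only a sketch: the function name read_delete_entries is made up for illustration, and it assumes nothing about the columns beyond what the sample above shows (comma-separated rows of integers). Any line that is not entirely numeric is treated as a header and skipped.

```python
import csv
import gzip


def read_delete_entries(path):
    """Return the all-numeric rows of a gzip-compressed deletes CSV as tuples of ints.

    Assumption: data rows contain only integer fields, as in the sample
    output above. Header lines contain column names and are skipped.
    """
    entries = []
    with gzip.open(path, mode="rt", newline="") as handle:
        for row in csv.reader(handle):
            # Keep only rows where every field is a decimal integer.
            if row and all(field.strip().isdecimal() for field in row):
                entries.append(tuple(int(field) for field in row))
    return entries
```

For example, calling read_delete_entries("38602ccf63e998fa1823f9f664055448.csv.gz") from inside the deletes directory and taking len() of the result gives the number of data rows. Comparing that count for the same bucket on two peers is one rough way to see whether the "deletes" directories are in sync.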

First, the primary copy of the bucket gets the "deletes" directory.

All peers that hold a copy of this bucket need to have the "deletes" directory in sync.

The peer holding the primary copy updates the bucket's checksum and reports it to the cluster master.

That peer then initiates a peer-to-peer sync request to update the other peers holding this bucket. This sync happens over port 8089 between peers.

If port 8089 is not open between indexers, the sync request fails and the affected buckets end up in a fixup loop that never completes.

This shows up on the CM under the fixup tasks in the Generation tab as "cannot fix up search factor as bucket is not serviceable".

If you see a message like the one below in splunkd.log on an indexer, port 8089 (the default Splunk management port) is most likely not open between indexers, and it needs to be:

01-08-2019 16:15:57.292 -0800 ERROR CMRepJob - job=CMSyncP2PJob bid= myguid= myrawport=9887 myusessl=0 otguid= othp=10.10.10.1:8089 otrawport=9887 otusessl=0 relativepath= custact=p2p_syncup getHttpReply failed; err: Connect Timeout
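
To find out which peers are affected, you can scan splunkd.log for lines of this shape. The sketch below is based only on the single sample line above: the helper name unreachable_peers is hypothetical, and treating the othp= value as the address of the peer being contacted is an assumption drawn from that sample, not from documentation.

```python
import re

# Matches the CMSyncP2PJob error shown above and captures the othp= value.
# Assumption: othp= holds the host:port of the peer being contacted.
SYNC_FAILURE = re.compile(
    r"ERROR\s+CMRepJob\s+-\s+job=CMSyncP2PJob\b.*?\bothp=(?P<peer>\S+)"
)


def unreachable_peers(log_lines):
    """Count CMSyncP2PJob connect timeouts per target peer."""
    counts = {}
    for line in log_lines:
        match = SYNC_FAILURE.search(line)
        # Only count the timeout case described in this thread.
        if match and "Connect Timeout" in line:
            peer = match.group("peer")
            counts[peer] = counts.get(peer, 0) + 1
    return counts
```

An open file object can be passed directly, for example `with open(path_to_splunkd_log) as f: print(unreachable_peers(f))`. The log is normally at $SPLUNK_HOME/var/log/splunk/splunkd.log.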

Once that port is opened, the fixup tasks should complete and be removed from the CM fixup activities.
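
A quick way to confirm the port is reachable is to attempt a plain TCP connection from each indexer to every other peer. The following is a minimal sketch with a hypothetical helper name, port_is_open. It only checks that a TCP connection can be established. It does not check TLS settings or authentication, so a True result means the firewall path is open, not that the sync itself will succeed.

```python
import socket


def port_is_open(host, port=8089, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, port_is_open("10.10.10.1") run on one indexer tests the peer address that appears in the sample error above. Substitute your own peer addresses, and the actual management port if you changed it from the default.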

Re: why is the cluster master not able to fixup buckets (generation tab) "cannot fix up search factor as bucket is not serviceable"

Our docs explain that the management port (default 8089) must be open between cluster peers. This port has always been required.
https://docs.splunk.com/Documentation/Splunk/latest/Indexer/Systemrequirements#Ports_that_the_cluste...

But who reads the docs all the time? I wish Splunk checked connectivity on the required ports and showed a warning message on the Indexer Clustering page.

Re: why is the cluster master not able to fixup buckets (generation tab) "cannot fix up search factor as bucket is not serviceable"

"I wish Splunk checked connectivity on the required ports and showed a warning message on the Indexer Clustering page."

@Masa enhancement SPL-164805 has been filed 🙂

Re: why is the cluster master not able to fixup buckets (generation tab) "cannot fix up search factor as bucket is not serviceable"

you're awesome, @rphillips_splunk

Re: why is the cluster master not able to fixup buckets (generation tab) "cannot fix up search factor as bucket is not serviceable"

I’ve seen this before when frozen buckets were restored to just one of two indexers in their cluster.

Buckets in the thaweddb path are “not serviceable” because by placing them in thawed you’re telling Splunk you don’t want them to be deleted. Splunk also won’t replicate thawed buckets because that would be a mess, so thawed buckets will also show as unserviceable.

I mention this because the solution for not-serviceable thawed buckets would be different from the solution that worked above, in case someone arrives with a very similar issue but a different situation.
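
To tell the two situations apart, it can help to list what is sitting in the thawed paths on each indexer. This is a rough sketch with a hypothetical helper name, find_thawed_buckets. It assumes the default layout where each index keeps thawed buckets under <index>/thaweddb inside the Splunk data directory. If thawedPath is set to something else in indexes.conf, the pattern needs adjusting.

```python
from pathlib import Path


def find_thawed_buckets(splunk_db):
    """List bucket directories found under <splunk_db>/<index>/thaweddb/.

    Assumption: indexes use the default thaweddb directory name.
    """
    root = Path(splunk_db)
    return sorted(
        str(path)
        for path in root.glob("*/thaweddb/*")
        if path.is_dir()  # buckets are directories; skip stray files
    )
```

For example, find_thawed_buckets("/opt/splunk/var/lib/splunk") would return the thawed bucket directories on a host where Splunk is installed in /opt/splunk (that install path is itself an assumption). Running it on every peer shows whether the thawed copies exist on only one of them.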

Re: why is the cluster master not able to fixup buckets (generation tab) "cannot fix up search factor as bucket is not serviceable"

splunkd.log shows: ERROR CMRepJob - job=CMSyncP2PJob
