We have a search head cluster (SHC) environment and are seeing the following error:
"Gave up waiting for the captain to establish a common bundle version across all search peers; using most recent bundles on all peers instead"
After some research and looking through the Answers site, it seems this could be caused by an inconsistent distsearch.conf on some of the search heads in the cluster. So I removed all the values from the servers key in distsearch.conf on every search head in the cluster and restarted Splunk. Immediately after the restart, however, the changes are overwritten and the old distsearch.conf is restored. We are not deploying this file, or these changes, with the deployer.
The following was done (multiple times) on each search head in the cluster (IPs masked for security purposes):
cat /opt/splunk/etc/system/local/distsearch.conf
[distributedSearch]
servers = https://10.xxx.36.000:8089,https://10.xxx.46.00:8089,https://10.xxx.46.00:8089,https://10.xxx.46.00:8089,https://eo1vmsk011.lema:8089
Changed distsearch.conf to:
[distributedSearch]
servers =
Restarted Splunk.
Checked distsearch.conf and found the original contents restored.
We even tried deleting distsearch.conf on all the search heads in the cluster and then restarting all the members, but the file gets recreated.
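To pin down whether the value really changes across a restart, the check in the steps above can be scripted rather than eyeballed with cat. This is only a minimal sketch, assuming a POSIX shell with awk; the helper name `servers_of` is hypothetical (not a Splunk command), and the path in the usage comment is the one from this post.

```shell
#!/bin/sh
# Hypothetical helper (not part of Splunk): print the value of "servers"
# from the [distributedSearch] stanza of a distsearch.conf file.
servers_of() {
    awk '
        # A line starting with "[" opens a new stanza; remember whether it is ours.
        /^\[/ { in_stanza = ($0 == "[distributedSearch]"); next }
        # Inside our stanza, print whatever follows "servers =".
        in_stanza && /^[ \t]*servers[ \t]*=/ {
            sub(/^[^=]*=[ \t]*/, "")
            print
        }
    ' "$1"
}

# Usage sketch (path taken from the post):
#   before=$(servers_of /opt/splunk/etc/system/local/distsearch.conf)
#   ... restart Splunk ...
#   after=$(servers_of /opt/splunk/etc/system/local/distsearch.conf)
#   [ "$before" = "$after" ] || echo "servers value changed across restart"
```

Running the same helper on every member also gives a quick way to compare the servers list across the cluster.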
Below is the output of the btool command for distsearch from one of the affected search heads in the cluster. I have also checked for any monitoring or configuration-management tool, but we don't have one managing the Splunk process.
[spnksvc@ep3vmnspk199 bin]$ ./splunk cmd btool distsearch list --debug
/opt/splunk/etc/system/default/distsearch.conf [bundleEnforcerBlacklist]
/opt/splunk/etc/system/default/distsearch.conf [bundleEnforcerWhitelist]
/opt/splunk/etc/apps/splunk_dist_conf/default/distsearch.conf [distributedSearch]
/opt/splunk/etc/system/default/distsearch.conf authTokenConnectionTimeout = 5
/opt/splunk/etc/system/default/distsearch.conf authTokenReceiveTimeout = 10
/opt/splunk/etc/system/default/distsearch.conf authTokenSendTimeout = 10
/opt/splunk/etc/system/default/distsearch.conf bestEffortSearch = false
/opt/splunk/etc/system/default/distsearch.conf connectionTimeout = 10
/opt/splunk/etc/system/default/distsearch.conf defaultUriScheme = https
/opt/splunk/etc/apps/splunk_dist_conf/default/distsearch.conf disabled = 0
/opt/splunk/etc/system/default/distsearch.conf receiveTimeout = 600
/opt/splunk/etc/system/default/distsearch.conf sendTimeout = 30
/opt/splunk/etc/apps/splunk_dist_conf/default/distsearch.conf serverTimeout = 900
/opt/splunk/etc/system/local/distsearch.conf servers = https://10.xxx.36.000:8089,https://10.xxx.46.00:8089,https://10.xxx.46.00:8089,https://10.xxx.46.00:8089,https://eo1vmsk011.lema:8089
/opt/splunk/etc/system/default/distsearch.conf shareBundles = true
/opt/splunk/etc/apps/splunk_dist_conf/default/distsearch.conf statusTimeout = 900
/opt/splunk/etc/system/default/distsearch.conf useSHPBundleReplication = true
/opt/splunk/etc/apps/Splunk_TA_windows/default/distsearch.conf [replicationBlacklist]
/opt/splunk/etc/apps/splunk_app_windows_infrastructure/default/distsearch.conf MSAD_lookups = .../splunk_app_windows_infrastructure/lookups/(tHostInfo|tSessions).csv$
/opt/splunk/etc/system/default/distsearch.conf conf = (system|(apps/))/(default|local)/server.conf
/opt/splunk/etc/system/default/distsearch.conf framework = apps/framework/...
/opt/splunk/etc/system/default/distsearch.conf lookupindexfiles = (system|apps/|users(/reserved)?//)/lookups/.(tmp$|index($|/...))
/opt/splunk/etc/apps/splunk_dist_conf/default/distsearch.conf noBinDir = (.../bin/)
/opt/splunk/etc/apps/Splunk_TA_windows/default/distsearch.conf nontsyslogmappings = ...ntsyslog_mappings.csv
/opt/splunk/etc/system/default/distsearch.conf sampleapp = apps/sample_app/...
/opt/splunk/etc/system/default/distsearch.conf user_specific_meta = users(/_reserved)?///metadata/local.meta
/opt/splunk/etc/apps/splunk_dist_conf/default/distsearch.conf [replicationSettings]
/opt/splunk/etc/system/default/distsearch.conf allowDeltaUpload = true
/opt/splunk/etc/system/default/distsearch.conf allowSkipEncoding = true
/opt/splunk/etc/system/default/distsearch.conf allowStreamUpload = auto
/opt/splunk/etc/system/default/distsearch.conf concerningReplicatedFileSize = 500
/opt/splunk/etc/system/default/distsearch.conf connectionTimeout = 60
/opt/splunk/etc/system/default/distsearch.conf excludeReplicatedLookupSize = 0
/opt/splunk/etc/apps/splunk_dist_conf/default/distsearch.conf maxBundleSize = 14438892420
/opt/splunk/etc/system/default/distsearch.conf maxMemoryBundleSize = 10
/opt/splunk/etc/apps/splunk_dist_conf/default/distsearch.conf replicationThreads = 8
/opt/splunk/etc/system/default/distsearch.conf sanitizeMetaFiles = true
/opt/splunk/etc/system/default/distsearch.conf sendRcvTimeout = 60
/opt/splunk/etc/system/default/distsearch.conf [replicationSettings:refineConf]
/opt/splunk/etc/system/default/distsearch.conf replicate.app = true
/opt/splunk/etc/system/default/distsearch.conf replicate.authorize = true
/opt/splunk/etc/system/default/distsearch.conf replicate.collections = true
/opt/splunk/etc/system/default/distsearch.conf replicate.commands = true
/opt/splunk/etc/system/default/distsearch.conf replicate.eventtypes = true
/opt/splunk/etc/system/default/distsearch.conf replicate.fields = true
/opt/splunk/etc/system/default/distsearch.conf replicate.literals = true
/opt/splunk/etc/system/default/distsearch.conf replicate.lookups = true
/opt/splunk/etc/system/default/distsearch.conf replicate.multikv = true
/opt/splunk/etc/system/default/distsearch.conf replicate.props = true
/opt/splunk/etc/system/default/distsearch.conf replicate.segmenters = true
/opt/splunk/etc/system/default/distsearch.conf replicate.tags = true
/opt/splunk/etc/system/default/distsearch.conf replicate.transactiontypes = true
/opt/splunk/etc/system/default/distsearch.conf replicate.transforms = true
/opt/splunk/etc/system/default/distsearch.conf [replicationWhitelist]
/opt/splunk/etc/system/default/distsearch.conf kvstore = kvstore/...
/opt/splunk/etc/system/default/distsearch.conf other = (system|(apps/(?!pdfserver))|users(/_reserved)?//)/(bin|lookups)/...
/opt/splunk/etc/system/default/distsearch.conf refine.conf = (system|(apps/)|users(/_reserved)?//)/(default|local)/.conf
/opt/splunk/etc/system/default/distsearch.conf refine.metadata = (system|(apps/)|users(/_reserved)?//)/metadata/.meta
/opt/splunk/etc/system/default/distsearch.conf searchscripts = searchscripts/...
/opt/splunk/etc/system/default/distsearch.conf [tokenExchKeys]
/opt/splunk/etc/system/default/distsearch.conf certDir = $SPLUNK_HOME/etc/auth/distServerKeys
/opt/splunk/etc/system/default/distsearch.conf genKeyScript = $SPLUNK_HOME/bin/splunk, createssl, audit-keys
/opt/splunk/etc/system/default/distsearch.conf privateKey = private.pem
/opt/splunk/etc/system/default/distsearch.conf publicKey = trusted.pem
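Since the btool listing is long, a small filter can show which file supplies a given setting. The following is a rough sketch, assuming the `--debug` layout shown above (file path, then either a stanza header or `key = value`); `btool_source` is a hypothetical helper name, not a Splunk command.

```shell
#!/bin/sh
# Hypothetical helper: read `btool ... list --debug` output on stdin and
# print the file that supplies setting $2 inside stanza $1.
btool_source() {
    awk -v stanza="[$1]" -v key="$2" '
        # Stanza header lines look like: "<file> [stanzaName]"
        $2 ~ /^\[/ { current = $2; next }
        # Setting lines look like: "<file> key = value"
        current == stanza && $2 == key && $3 == "=" { print $1 }
    '
}

# Usage sketch, run from $SPLUNK_HOME/bin as in the post:
#   ./splunk cmd btool distsearch list --debug | btool_source distributedSearch servers
```

The stanza argument matters because some keys, such as connectionTimeout, appear in more than one stanza of the listing.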