distsearch.conf is overridden after updating through GUI, upon restarting Splunk

Contributor

We have a search head cluster (SHC) environment and are seeing the following error:

"Gave up waiting for the captain to establish a common bundle version across all search peers; using most recent bundles on all peers instead"

After some research and looking through the Answers site, it seems this could be due to an inconsistent distsearch.conf on some of the search heads in the cluster. So I removed all the values from the servers key in distsearch.conf on every search head in the cluster and restarted Splunk, but immediately after the restart the changes were overwritten and the old distsearch.conf was restored. We are not using the deployer to push this file.

The following was done (multiple times) on each search head in the cluster (IPs masked for security purposes):

  1. cat /opt/splunk/etc/system/local/distsearch.conf
    [distributedSearch]
    servers = https://10.xxx.36.000:8089,https://10.xxx.46.00:8089,https://10.xxx.46.00:8089,https://10.xxx.46.00:...

  2. Changed distsearch.conf to

    [distributedSearch]
    servers =

  3. Restarted Splunk
  4. Checked the distsearch.conf file and found the old contents restored

We even tried deleting distsearch.conf on all the search heads in the cluster and then restarting all the members, but the file gets recreated.

Below is the output of btool for distsearch from one of the affected search heads in the cluster. I have also checked for any monitoring or configuration-management tool, but we don't have one managing the Splunk process.

[spnksvc@ep3vmnspk199 bin]$ ./splunk cmd btool distsearch list --debug
/opt/splunk/etc/system/default/distsearch.conf [bundleEnforcerBlacklist]
/opt/splunk/etc/system/default/distsearch.conf [bundleEnforcerWhitelist]
/opt/splunk/etc/apps/splunkdistconf/default/distsearch.conf [distributedSearch]
/opt/splunk/etc/system/default/distsearch.conf authTokenConnectionTimeout = 5
/opt/splunk/etc/system/default/distsearch.conf authTokenReceiveTimeout = 10
/opt/splunk/etc/system/default/distsearch.conf authTokenSendTimeout = 10
/opt/splunk/etc/system/default/distsearch.conf bestEffortSearch = false
/opt/splunk/etc/system/default/distsearch.conf connectionTimeout = 10
/opt/splunk/etc/system/default/distsearch.conf defaultUriScheme = https
/opt/splunk/etc/apps/splunkdistconf/default/distsearch.conf disabled = 0
/opt/splunk/etc/system/default/distsearch.conf receiveTimeout = 600
/opt/splunk/etc/system/default/distsearch.conf sendTimeout = 30
/opt/splunk/etc/apps/splunkdistconf/default/distsearch.conf serverTimeout = 900
/opt/splunk/etc/system/local/distsearch.conf servers = https://10.xxx.36.000:8089,https://10.xxx.46.00:8089,https://10.xxx.46.00:8089,https://10.xxx.46.00:...
/opt/splunk/etc/system/default/distsearch.conf shareBundles = true
/opt/splunk/etc/apps/splunkdistconf/default/distsearch.conf statusTimeout = 900
/opt/splunk/etc/system/default/distsearch.conf useSHPBundleReplication = true
/opt/splunk/etc/apps/SplunkTAwindows/default/distsearch.conf [replicationBlacklist]
/opt/splunk/etc/apps/splunkappwindowsinfrastructure/default/distsearch.conf MSADlookups = .../splunkappwindowsinfrastructure/lookups/(tHostInfo|tSessions).csv$
/opt/splunk/etc/system/default/distsearch.conf conf = (system|(apps/))/(default|local)/server.conf
/opt/splunk/etc/system/default/distsearch.conf framework = apps/framework/...
/opt/splunk/etc/system/default/distsearch.conf lookupindexfiles = (system|apps/|users(/reserved)?//)/lookups/.(tmp$|index($|/...))
/opt/splunk/etc/apps/splunkdistconf/default/distsearch.conf noBinDir = (.../bin/)
/opt/splunk/etc/apps/SplunkTAwindows/default/distsearch.conf nontsyslogmappings = ...ntsyslogmappings.csv
/opt/splunk/etc/system/default/distsearch.conf sampleapp = apps/sample_app/...
/opt/splunk/etc/system/default/distsearch.conf userspecificmeta = users(/reserved)?///metadata/local.meta
/opt/splunk/etc/apps/splunkdistconf/default/distsearch.conf [replicationSettings]
/opt/splunk/etc/system/default/distsearch.conf allowDeltaUpload = true
/opt/splunk/etc/system/default/distsearch.conf allowSkipEncoding = true
/opt/splunk/etc/system/default/distsearch.conf allowStreamUpload = auto
/opt/splunk/etc/system/default/distsearch.conf concerningReplicatedFileSize = 500
/opt/splunk/etc/system/default/distsearch.conf connectionTimeout = 60
/opt/splunk/etc/system/default/distsearch.conf excludeReplicatedLookupSize = 0
/opt/splunk/etc/apps/splunkdistconf/default/distsearch.conf maxBundleSize = 14438892420
/opt/splunk/etc/system/default/distsearch.conf maxMemoryBundleSize = 10
/opt/splunk/etc/apps/splunkdistconf/default/distsearch.conf replicationThreads = 8
/opt/splunk/etc/system/default/distsearch.conf sanitizeMetaFiles = true
/opt/splunk/etc/system/default/distsearch.conf sendRcvTimeout = 60
/opt/splunk/etc/system/default/distsearch.conf [replicationSettings:refineConf]
/opt/splunk/etc/system/default/distsearch.conf replicate.app = true
/opt/splunk/etc/system/default/distsearch.conf replicate.authorize = true
/opt/splunk/etc/system/default/distsearch.conf replicate.collections = true
/opt/splunk/etc/system/default/distsearch.conf replicate.commands = true
/opt/splunk/etc/system/default/distsearch.conf replicate.eventtypes = true
/opt/splunk/etc/system/default/distsearch.conf replicate.fields = true
/opt/splunk/etc/system/default/distsearch.conf replicate.literals = true
/opt/splunk/etc/system/default/distsearch.conf replicate.lookups = true
/opt/splunk/etc/system/default/distsearch.conf replicate.multikv = true
/opt/splunk/etc/system/default/distsearch.conf replicate.props = true
/opt/splunk/etc/system/default/distsearch.conf replicate.segmenters = true
/opt/splunk/etc/system/default/distsearch.conf replicate.tags = true
/opt/splunk/etc/system/default/distsearch.conf replicate.transactiontypes = true
/opt/splunk/etc/system/default/distsearch.conf replicate.transforms = true
/opt/splunk/etc/system/default/distsearch.conf [replicationWhitelist]
/opt/splunk/etc/system/default/distsearch.conf kvstore = kvstore/...
/opt/splunk/etc/system/default/distsearch.conf other = (system|(apps/(?!pdfserver))|users(/reserved)?//)/(bin|lookups)/...
/opt/splunk/etc/system/default/distsearch.conf refine.conf = (system|(apps/*)|users(/reserved)?//)/(default|local)/.conf
/opt/splunk/etc/system/default/distsearch.conf refine.metadata = (system|(apps/)|users(/reserved)?//)/metadata/*.meta
/opt/splunk/etc/system/default/distsearch.conf searchscripts = searchscripts/...
/opt/splunk/etc/system/default/distsearch.conf [tokenExchKeys]
/opt/splunk/etc/system/default/distsearch.conf certDir = $SPLUNK_HOME/etc/auth/distServerKeys
/opt/splunk/etc/system/default/distsearch.conf genKeyScript = $SPLUNK_HOME/bin/splunk, createssl, audit-keys
/opt/splunk/etc/system/default/distsearch.conf privateKey = private.pem
/opt/splunk/etc/system/default/distsearch.conf publicKey = trusted.pem
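
Since the --debug output above is long, a small filter can help pick out which file supplies a single setting. The following is only a sketch: find_setting_source is a hypothetical helper name, and it assumes each btool line has the form "<file> <key> = <value>" with no spaces in the file path.

```shell
# Reads `btool ... list --debug` output on stdin and prints, for one setting,
# the stanza it appears under and the file that supplies the effective value.
# find_setting_source is an illustrative name, not part of Splunk.
find_setting_source() {
    awk -v key="$1" '
        $2 ~ /^\[/ { stanza = $2; next }          # remember the current stanza
        $2 == key && $3 == "=" { print stanza, $1 }
    '
}

# Example (run on a search head):
#   "$SPLUNK_HOME/bin/splunk" cmd btool distsearch list --debug | find_setting_source servers
```

With the output posted above, this would point at /opt/splunk/etc/system/local/distsearch.conf for the servers key, which is the file that keeps getting restored.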

Motivator

Try the approach below.

  1. Modify the config file on the captain.
  2. Run the following command on the search head members: $SPLUNK_HOME/bin/splunk resync shcluster-replicated-config
  3. Then run the following command on the captain to restart all search head members (including the captain): $SPLUNK_HOME/bin/splunk rolling-restart shcluster-members
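
The steps above can be sketched as a small script so the sequence can be reviewed before anything runs. This is a sketch under assumptions: SPLUNK_BIN, DRY_RUN, run, resync_member and rolling_restart_from_captain are illustrative names, not Splunk settings or commands, and /opt/splunk is only a fallback taken from the paths in the question. Only the two splunk CLI commands themselves come from the steps above.

```shell
#!/bin/sh
# With DRY_RUN=1 (the default here) each command is printed instead of run.
SPLUNK_BIN="${SPLUNK_HOME:-/opt/splunk}/bin/splunk"
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Step 2: run on each non-captain search head member.
resync_member() {
    run "$SPLUNK_BIN" resync shcluster-replicated-config
}

# Step 3: run once, on the captain.
rolling_restart_from_captain() {
    run "$SPLUNK_BIN" rolling-restart shcluster-members
}
```

Calling resync_member on a member with DRY_RUN left at 1 just echoes the command; set DRY_RUN=0 once the order of operations looks right.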

Contributor

Thanks @jawaharas

I don't see the file on the captain now. Should I create the file with the contents on the captain and then run steps 2 and 3?

Motivator

Yep. Go ahead.

Contributor

Just tried the approach.

  1. Created a distsearch.conf file with the following contents on the captain:

[distributedSearch]
servers =

  2. Ran $SPLUNK_HOME/bin/splunk resync shcluster-replicated-config

  3. Did a rolling restart of the SH members

I checked a couple of members where the restart had completed and found that distsearch.conf was overwritten again with the old contents.

[distributedSearch]
servers = https://10.xxx.36.000:8089,https://10.xxx.46.00:8089,https://10.xxx.46.00:8089,https://10.xxx.46.00:...

Update:

Found that the older search heads in the cluster (including the captain) got the old distsearch.conf back (overwritten); the 4 new search heads we added this week seem to be okay.
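
To compare what the old and new members actually end up with, the servers key can be reduced to a sorted peer list per member and then diffed. A minimal sketch, assuming the key sits on a single line as in the file shown above; list_search_peers is a hypothetical helper name.

```shell
# Prints the peers from the `servers` key of a distsearch.conf, one per line,
# trimmed, de-duplicated and sorted, so output from two members can be diffed.
list_search_peers() {
    sed -n 's/^[[:space:]]*servers[[:space:]]*=[[:space:]]*//p' "$1" |
        tr ',' '\n' |
        sed 's/^[[:space:]]*//; s/[[:space:]]*$//' |
        sed '/^$/d' |
        sort -u
}

# Example (run on each member, then compare the resulting files):
#   list_search_peers /opt/splunk/etc/system/local/distsearch.conf > "peers.$(hostname).txt"
```

An empty servers key produces no output at all, which makes it easy to spot the members where the cleared value survived the restart.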

Motivator

Did you run the command below on the search head members (not on the captain) and verify the config file contents before the restart?

$SPLUNK_HOME/bin/splunk resync shcluster-replicated-config

Contributor

Yes. I ran it on all SH members except the captain, then verified the config file contents on all the members before the restart, but we are still seeing the issue.

SplunkTrust

Hi. What version of Splunk is this happening on?

Contributor

@burwell - it's 7.1.1

Motivator

I hope you are using clustered indexers.

Can you check whether the shclustering stanza in the '$SPLUNK_HOME/etc/system/local/server.conf' file is consistent across all search head members?

Also, can you share 'shclustering' stanza content from your search-head's 'server.conf' (after masking sensitive data)?
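
One way to pull out just that stanza with the secret masked before posting it is sketched below. print_stanza_masked is a hypothetical helper name, and it only masks pass4SymmKey; anything else you consider sensitive (hostnames, IPs) would still need to be masked by hand.

```shell
# Prints one stanza from a .conf file, replacing the value of pass4SymmKey
# with a placeholder. Usage: print_stanza_masked <file> <stanza-name>
print_stanza_masked() {
    awk -v want="[$2]" '
        /^\[/ {
            header = $0
            sub(/[[:space:]]+$/, "", header)       # ignore trailing whitespace
            in_stanza = (header == want)
        }
        in_stanza {
            if ($0 ~ /^[[:space:]]*pass4SymmKey[[:space:]]*=/) {
                sub(/=.*/, "= <masked>")
            }
            print
        }
    ' "$1"
}

# Example (run on each search head member and compare the output):
#   print_stanza_masked "$SPLUNK_HOME/etc/system/local/server.conf" shclustering
```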
