I am running a cluster of 3 indexer nodes, with one of the three acting as master. When I log on to the Clustering dashboard on either of the two peer nodes, I cannot see the peer dashboard as described in the document below. What I see instead is the "Configure Clustering" page with the settings that I have already configured. When I click Save again, it asks me to restart the Splunk instance. I did that, but when I log back in, I end up on the same "Configure Clustering" page.
The master node always shows this message: "Received an empty peer list from the master. Waiting for peers to join the cluster."
I can also see the following entries repeated in the slave's splunkd.log. Any ideas?
01-21-2013 13:56:14.059 +1100 INFO CMSlave - event=addPeer status=success shutdown=false pi: replication_address= forwarder_address= search_address= mgmtPort=8089 rawPort=8088 useSSL=false forwarderPort=0 forwarderPortUseSSL=true serverName=splunk10 activeBundleId=2e6a2af2087a2f7e6d70800c61a54537 status=Up type=Initial-Add baseGen=1
01-21-2013 13:56:14.063 +1100 INFO CMSlave - indexing ready loop status=failure shutdown=false proxyconnected=false
01-21-2013 13:56:14.063 +1100 INFO CMSlave - event=addPeer retrying because master went down just after add peer success
01-21-2013 13:56:14.110 +1100 INFO CMConfig - A splunktcp forwarder port is not configured in inputs.conf
Can anyone help me, please?
Thanks in advance.
It looks like the 2 peers did register with the master. When you configured the master, did you change its replication factor to 2? (Check etc/system/local/server.conf on the master, under the [clustering] stanza.) By default the master uses a replication factor of 3, and the first time it comes up it waits until replication_factor peers have joined before it commits the first generation. With only 2 peers, that threshold is never reached.
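For reference, here is a minimal sketch of what the [clustering] stanza on the master might look like for a cluster with two peers. It assumes the master's settings live in etc/system/local/server.conf as mentioned above. The search_factor line is my own addition: its default is 2, and it must not exceed replication_factor, so it is shown only to make that explicit. Treat the values as an illustration, not as your actual configuration.

```ini
# etc/system/local/server.conf on the master (illustrative values)
[clustering]
mode = master
# Default is 3; with only 2 peers the master keeps waiting for a third.
replication_factor = 2
# Assumed here for completeness; must be <= replication_factor.
search_factor = 2
```

After changing the file, restart the master so the new replication factor takes effect.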
Also, from the messages, it looks like you might be on the 5.0 build. 5.0.2 is the latest and you might want to just start with that.
I am having the same issue. After the cluster has been up for about 2 minutes, the Splunk instance goes nuts and you basically have to shut it down and restart it. That leaves you about two minutes to troubleshoot.