I'm having a very hard time connecting my search head cluster to my search peer. I have stepped through the search head documentation very carefully located here: http://docs.splunk.com/Documentation/Splunk/6.5.0/DistSearch/SHCdeploymentoverview
I have successfully installed my deployer and added the
[shclustering] stanza to the /opt/splunk/etc/system/local/server.conf file and added the pass4SymmKey and shcluster_label.
I then ran
splunk init shcluster-config on each of my search head members and restarted Splunk. Each one ran successfully without any reported errors. I'm also able to run
splunk bootstrap shcluster-captain without any issues and
splunk show shcluster-status doesn't report any problems:
[splunk@lelsplunksh02 ~]$ splunk show shcluster-status
 Captain:
     dynamic_captain : 1
     elected_captain : Thu Oct 13 15:48:05 2016
     id : C2403815-55A2-413E-AF26-4998CFD9508F
     initialized_flag : 1
     label : lelsplunksh03
     maintenance_mode : 0
     mgmt_uri : https://splunkserver:8089
     min_peers_joined_flag : 1
     rolling_restart_flag : 0
     service_ready_flag : 1
 Members:
     lelsplunksh02
         label : lelsplunksh02
         mgmt_uri : https://splunkserver:8089
         mgmt_uri_alias : https://xx.xxx.xx.xxx:8089
         status : Up
     lelsplunksh04
         label : lelsplunksh04
         mgmt_uri : https://splunkserver:8089
         mgmt_uri_alias : https://xx.xxx.xx.xxx:8089
         status : Up
     lelsplunksh03
         label : lelsplunksh03
         mgmt_uri : https://splunkserver:8089
         mgmt_uri_alias : https://xx.xxx.xx.xxx:8089
         status : Up
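For anyone following along, the full init syntax I mean is roughly the following (a sketch only: the hostnames, ports, secret, and label here are placeholders, not my real values):

```shell
# Run on each prospective search head member before bootstrapping the captain
# (all values below are placeholders):
splunk init shcluster-config -auth admin:pswd \
    -mgmt_uri https://sh-member:8089 \
    -replication_port 34567 \
    -conf_deploy_fetch_url https://deployer:8089 \
    -secret mysharedkey \
    -shcluster_label leshcluster01
splunk restart
```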
My problem starts when I try to add my search peer. I only have one indexer and I'm following this doc: http://docs.splunk.com/Documentation/Splunk/6.5.0/DistSearch/Connectclustersearchheadstosearchpeers
splunk add search-server https://splunkserver:8089 -auth admin:pswd -remoteUsername admin -remotePassword pswd
This also runs successfully, but I'm just not getting any results when I connect to my search head and run a search. I can run the exact same search on the indexer itself and it returns results. I can't see any errors in logs on either the indexer or the search head members.
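One quick sanity check, assuming the standard CLI (credentials here are placeholders), is to list the peers each member actually knows about:

```shell
# Run on each search head member; the added indexer should appear in the output
splunk list search-server -auth admin:pswd
```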
Any help would be appreciated to point me in the right direction.
Hi there zipmaster,
first off, I had my own problems with the search head <-> indexer connection. It's easy to make a mistake here.
After the execution of the command there should be a distsearch.conf in $SPLUNK_HOME/etc/system/local.
Could you tell me if there is one?
If yes, could you maybe post its contents here, too?
(PS: in approx. 1.5 h I'm at work, so I will post you parts of my own guide where I tried to setup distributed search)
If you would like to delete the existing cluster config on a search head (to begin from scratch), do the following:
On every SH do the following commands:
splunk remove shcluster-member
(wait approx. 1 minute)
splunk clean all
Now you should have clean SHs without any cluster config.
Initiate SH-Cluster config:
Go into server.conf on every SH and add the following (adjust the config for each SH);
restart splunk afterwards
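For completeness, a hedged sketch of what that per-member server.conf addition typically looks like (hostnames, label, and key below are placeholders; use your own values on each member):

```ini
[shclustering]
conf_deploy_fetch_url = https://deployer:8089
mgmt_uri = https://sh1:8089
pass4SymmKey = mysharedkey
shcluster_label = shcluster1
disabled = 0
```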
Initialize Cluster-Captain with this command:
splunk bootstrap shcluster-captain -servers_list "https://sh1:8089,https://sh2:8089,https://sh3:8089"
(it takes a while)
splunk show shcluster-status
Go on every search head and create a stanza called [clustering] in server.conf:

[clustering]
mode = searchhead
pass4SymmKey = yourSharedKey (e.g. splunkisawesome)
Try it out! Sometimes this alone does the trick.
If not (and I don't know why this only happens occasionally), do these steps as well:
Now you need to setup authentication for the Indexers:
Copy via scp (or similar) every "trusted.pem" from every SH:
to the indexers, into the corresponding file:
(if those directories aren't there, create them)
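As a sketch of what I mean, assuming a default install path and the standard distributed-search key layout (the hostnames sh1/indexer are placeholders): each SH's key lives at $SPLUNK_HOME/etc/auth/distServerKeys/trusted.pem, and on the indexer it goes into a subdirectory named after that search head's serverName:

```shell
# On each search head (sh1/indexer are placeholder hostnames):
scp /opt/splunk/etc/auth/distServerKeys/trusted.pem \
    splunk@indexer:/opt/splunk/etc/auth/distServerKeys/sh1/trusted.pem

# On the indexer, create the per-SH directory first if it doesn't exist:
#   mkdir -p /opt/splunk/etc/auth/distServerKeys/sh1
# then restart Splunk on the indexer so it picks up the new key.
```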
HOPE THIS HELPS 😉
Just ask, if you have any further questions.
Yes, there is a distsearch.conf in the $SPLUNK_HOME/etc/system/local/ directory on each search head:
[splunk@lelsplunksh02 ~]$ cat ~/etc/system/local/distsearch.conf
 [distributedSearch]
 servers = https://lelsplunkix01.lehi.micron.com:8089
They are exactly the same on each search head. I'm going to try your next steps to see if that fixes the problem.
I think I'm getting closer. I've gone through the above steps and now I'm getting an error when I connect to the web interface of my search head cluster:
The searchhead is unable to update the peer information. Error = 'Unable to reach master'.
And I can see this in the log on the search head captain:
10-14-2016 09:24:21.121 -0600 ERROR ClusteringMgr - VerifyMultisiteConfig failed Error=failed method=GET path=/services/cluster/master/info/?output_mode=json master=? rv=0 actual_response_code=502 expected_response_code=200 status_line="Error resolving: Name or service not known" socket_error="Cannot resolve hostname"
I've switched to using IP addresses everywhere, so I'm not sure what hostname it cannot resolve. Here is what one of my server.conf files currently looks like:
[splunk@lelsplunksh03 ~]$ cat ~/etc/system/local/server.conf
 [general]
 serverName = lelsplunksh03
 pass4SymmKey = $1$lwa1+e7fvdG8

 [sslConfig]
 sslKeysfilePassword = $1$wbrhsiub9Bgw

 [lmpool:auto_generated_pool_download-trial]
 description = auto_generated_pool_download-trial
 quota = MAX
 slaves = *
 stack_id = download-trial

 [lmpool:auto_generated_pool_forwarder]
 description = auto_generated_pool_forwarder
 quota = MAX
 slaves = *
 stack_id = forwarder

 [lmpool:auto_generated_pool_free]
 description = auto_generated_pool_free
 quota = MAX
 slaves = *
 stack_id = free

 [replication_port://34567]

 [shclustering]
 conf_deploy_fetch_url = https://10.192.88.27:8089
 disabled = 0
 mgmt_uri = https://lelsplunksh03.lehi.micron.com:8089
 pass4SymmKey = $1$m5u8/t4toaFEoGHz
 shcluster_label = leshcluster01
 id = 3C3740AF-9647-442D-BB08-9AE318070A85

 [clustering]
 search_server = https://10.192.88.157:8089
 mode = searchhead
 pass4SymmKey = $1$mfu4/tS1GkFIot0z
And I've changed my distsearch.conf file too:
[splunk@lelsplunksh03 ~]$ cat ~/etc/system/local/distsearch.conf
 [distributedSearch]
 servers = https://10.192.88.157:8089
I think I know why my instructions aren't 100% correct in your case.
I had a deployment of 3 SH's (clustered) joining 2 Indexers (clustered).
Besides that I had a server with "Master" functionality for the indexer-cluster
and a "Deployer" for the search-head cluster.
Hope you don't get me wrong on this, but you have configured a deployer for your search head cluster, right?
conf_deploy_fetch_url = https://10.192.88.27:8089 (this should be the deployer)
I think you have, so let's continue:
[clustering]
search_server = https://indexer:8089
mode = searchhead
pass4SymmKey = yourSharedKey (e.g. splunkisawesome)
I'm not sure whether this part of my answer is applicable to your case.
Maybe try again the CLI command:
splunk add search-server <scheme>://<host>:<port> -auth <user>:<password> -remoteUsername <user> -remotePassword <passremote>
Yes, I have a dedicated deployer for my search head cluster.
I did already remove the [clustering] stanza and that error went away. I'm not doing any clustering on the indexer; I actually only have one at the moment. I did just run that splunk add search-server command on all three search head members and now I'm back to the original problem: I get no results when I search.
I'm spinning up a test indexer right now and I'm going to switch my search head cluster over to that to see if I get the same problem. If I don't then I know it has something to do with my original indexer.
Hmmmm, "sh#t", this sounds really weird 😞
I was looking into the config files today, trying to figure out why it's more difficult to connect a non-clustered environment to a cluster than two clustered environments, and I find it kind of sad that the Splunk docs are very uninformative in that regard.
Hope the test-indexer brings in any new insights! I would love to hear about your solution, pls keep me (us) updated!
Well this is embarrassing, looks like the search head members were connected to the indexer and were able to get data back, but the query I was using to test was not working.
When I run the search below from the web GUI, in the Search & Reporting app, I get nothing on the search head cluster:
But, if I run this then I do get results back:
Why would my cluster refuse to return data when I have one parameter in the search? On top of that, when I run the same search (host=lelsyslog*) directly on the indexer, it does return data.
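In case it helps anyone debugging the same symptom, one hedged way to compare what host values each side actually sees (standard SPL, nothing environment-specific assumed beyond an index with data) is:

```spl
| metadata type=hosts index=*
```

or, equivalently:

```spl
index=* | stats count by host
```

Running these on both the search head and the indexer should show whether the host values themselves differ between the two views.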
What tipped me off to this was when I set up the test indexer: I was getting the exact same problem. I set up two quick VMs, one as an indexer, the other to put a forwarder on. After I set up the forwarder and created a simple app to just grab /var/log/messages, I wasn't seeing any data. I thought it might have been because I didn't set something up right, but on a whim I searched on index=* and got results back. I then searched on host and got nothing back, even though that exact host was in the results when I searched on index.
Can someone tell me how to fix this? It almost seems like a configuration issue.