I'm having a very hard time connecting my search head cluster to my search peer. I have stepped through the search head documentation very carefully located here: http://docs.splunk.com/Documentation/Splunk/6.5.0/DistSearch/SHCdeploymentoverview
I have successfully installed my deployer and added the [shclustering] stanza to /opt/splunk/etc/system/local/server.conf, including the pass4SymmKey and shcluster_label settings. I then ran splunk init shcluster-config on each of my search head members and restarted Splunk. Each one ran successfully without any reported errors. I'm also able to run splunk bootstrap shcluster-captain without any issues, and splunk show shcluster-status doesn't report any problems:
[splunk@lelsplunksh02 ~]$ splunk show shcluster-status
Captain:
dynamic_captain : 1
elected_captain : Thu Oct 13 15:48:05 2016
id : C2403815-55A2-413E-AF26-4998CFD9508F
initialized_flag : 1
label : lelsplunksh03
maintenance_mode : 0
mgmt_uri : https://splunkserver:8089
min_peers_joined_flag : 1
rolling_restart_flag : 0
service_ready_flag : 1
Members:
lelsplunksh02
label : lelsplunksh02
mgmt_uri : https://splunkserver:8089
mgmt_uri_alias : https://xx.xxx.xx.xxx:8089
status : Up
lelsplunksh04
label : lelsplunksh04
mgmt_uri : https://splunkserver:8089
mgmt_uri_alias : https://xx.xxx.xx.xxx:8089
status : Up
lelsplunksh03
label : lelsplunksh03
mgmt_uri : https://splunkserver:8089
mgmt_uri_alias : https://xx.xxx.xx.xxx:8089
status : Up
My problem starts when I try to add my search peer. I only have one indexer and I'm following this doc: http://docs.splunk.com/Documentation/Splunk/6.5.0/DistSearch/Connectclustersearchheadstosearchpeers
I'm running:
splunk add search-server https://splunkserver:8089 -auth admin:pswd -remoteUsername admin -remotePassword pswd
This also runs successfully, but I'm just not getting any results when I connect to my search head and run a search. I can run the exact same search on the indexer itself and it returns results. I can't see any errors in logs on either the indexer or the search head members.
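For reference, my understanding (an assumption on my part, based on reading the docs rather than anything verified) is that this command simply records the peer in distsearch.conf under $SPLUNK_HOME/etc/system/local, roughly:

```ini
# distsearch.conf (sketch): the peer recorded by "splunk add search-server"
[distributedSearch]
servers = https://splunkserver:8089
```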
Any help would be appreciated to point me in the right direction.
Hi there zipmaster,
First off, I had my own problems with the search head <-> indexer connection. It's easy to make a mistake here.
After running that command there should be a distsearch.conf in $SPLUNK_HOME/etc/system/local.
Could you tell me if there is one?
If yes, could you maybe post its content here, too?
Thanks!
pyro_wood
(PS: in approx. 1.5 h I'm at work, so I will post you parts of my own guide from when I tried to set up distributed search)
EDIT:
If you would like to delete the existing cluster config on a search head (to begin from scratch), do the following:
On every SH do the following commands:
splunk remove shcluster-member
(wait approx. 1 minute)
splunk stop
splunk clean all
splunk start
Now you should have clean SHs without cluster config.
Initiate the SH cluster config:
On every SH, add the following to server.conf (adjust mgmt_uri for each SH):
[shclustering]
conf_deploy_fetch_url = https://deployer:8089
disabled = 0
mgmt_uri = https://sh1:8089
pass4SymmKey = splunkisawesome (example value, use your own)
shcluster_label = SH-Cluster_1 (example value, use your own)
Restart Splunk afterwards:
splunk restart
Initialize Cluster-Captain with this command:
splunk bootstrap shcluster-captain -servers_list "https://sh1:8089,https://sh2:8089,https://sh3:8089"
(it takes a while)
then do:
splunk show shcluster-status
Next steps:
Go on every Search-Head and create a Stanza called [clustering] in server.conf:
[clustering]
search_server= https://indexer:8089
mode = searchhead
pass4SymmKey = e.g.:splunkisawesome
Then execute:
splunk restart
Try it out! Sometimes this already does the trick.
If not (and I don't know why this only happens occasionally), do these steps as well:
Now you need to setup authentication for the Indexers:
Copy (via scp or similar) every "trusted.pem" from every SH:
/opt/splunk/etc/auth/distServerKeys/trusted.pem
to the indexers, into a directory named after each search head:
$SPLUNK_HOME/etc/auth/distServerKeys/<searchhead_serverName>/trusted.pem
(if those directories aren't there, create them)
Restart indexer
splunk restart
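The copy steps above can be sketched as a dry run (plain shell; hostnames are placeholders for your own servers, and echo only prints the commands instead of running them):

```shell
#!/bin/sh
# Dry run: print the commands to push each search head's trusted.pem
# to the indexer. Replace INDEXER and the SH list with your hostnames.
INDEXER="indexer01"
SPLUNK_HOME="/opt/splunk"
for SH in sh1 sh2 sh3; do
    # on the indexer, each key lives in a directory named after the SH
    DEST="$SPLUNK_HOME/etc/auth/distServerKeys/$SH"
    echo ssh "$INDEXER" mkdir -p "$DEST"
    echo scp "$SH:$SPLUNK_HOME/etc/auth/distServerKeys/trusted.pem" "$INDEXER:$DEST/trusted.pem"
done
```

Remove the echo once the printed commands look right for your environment.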
HOPE THIS HELPS 😉
Just ask, if you have any further questions.
I downvoted this post because the problem the user had was not with establishing connectivity between search heads and search peers. Rather, the problem was with the query used to test connectivity. The accepted answer provided a procedure that, if followed by others, would cause them to completely rebuild their search head cluster, among other issues.
This answer should be removed, as it could easily mislead anyone who does have search peer connectivity issues.
You might want to double check the management port on your indexer, to make sure that it's 8089.
See http://docs.splunk.com/Documentation/Splunk/6.5.0/Admin/Changedefaultvalues#Use_Splunk_Web_2
Yes, the indexer is listening on 8089. This is my production indexer:
[splunk@lelsplunkix01 ~]$ netstat -tupan | grep 8089
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp 0 0 0.0.0.0:8089 0.0.0.0:* LISTEN 4015/splunkd
tcp 0 0 10.192.88.157:8089 10.192.88.149:9393 ESTABLISHED 4015/splunkd
tcp 0 0 10.192.88.157:8089 10.192.88.105:13056 ESTABLISHED 4015/splunkd
tcp 0 0 10.192.88.157:46334 10.192.88.156:8089 ESTABLISHED 4015/splunkd
I can telnet to that server on that port as well from my search head members.
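A quick scripted alternative to the telnet check, in case it's useful (plain Python; the hostname in the comment is just my indexer, substitute your own):

```python
import socket

def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# e.g. run from each search head member:
# port_open("lelsplunkix01.lehi.micron.com", 8089)
```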
Maybe also have a look here
http://docs.splunk.com/Documentation/Splunk/6.5.0/DistSearch/Configuredistributedsearch#Edit_distsea...
I think I'm getting closer. I've gone through the above steps and I'm now getting an error when I connect to the web interface of my search head cluster:
The searchhead is unable to update the peer information. Error = 'Unable to reach master'.
And I can see this in the log on the search head captain:
10-14-2016 09:24:21.121 -0600 ERROR ClusteringMgr - VerifyMultisiteConfig failed Error=failed method=GET path=/services/cluster/master/info/?output_mode=json master=? rv=0 actual_response_code=502 expected_response_code=200 status_line="Error resolving: Name or service not known" socket_error="Cannot resolve hostname"
I've switched to using all IP addresses, so I'm not sure what hostname it cannot resolve. Here is what one of my server.conf files looks like at the moment:
[splunk@lelsplunksh03 ~]$ cat ~/etc/system/local/server.conf
[general]
serverName = lelsplunksh03
pass4SymmKey = $1$lwa1+e7fvdG8
[sslConfig]
sslKeysfilePassword = $1$wbrhsiub9Bgw
[lmpool:auto_generated_pool_download-trial]
description = auto_generated_pool_download-trial
quota = MAX
slaves = *
stack_id = download-trial
[lmpool:auto_generated_pool_forwarder]
description = auto_generated_pool_forwarder
quota = MAX
slaves = *
stack_id = forwarder
[lmpool:auto_generated_pool_free]
description = auto_generated_pool_free
quota = MAX
slaves = *
stack_id = free
[replication_port://34567]
[shclustering]
conf_deploy_fetch_url = https://10.192.88.27:8089
disabled = 0
mgmt_uri = https://lelsplunksh03.lehi.micron.com:8089
pass4SymmKey = $1$m5u8/t4toaFEoGHz
shcluster_label = leshcluster01
id = 3C3740AF-9647-442D-BB08-9AE318070A85
[clustering]
search_server = https://10.192.88.157:8089
mode = searchhead
pass4SymmKey = $1$mfu4/tS1GkFIot0z
And I've changed my distsearch.conf file too:
[splunk@lelsplunksh03 ~]$ cat ~/etc/system/local/distsearch.conf
[distributedSearch]
servers = https://10.192.88.157:8089
Hi,
I think I know why my instructions aren't 100% correct in your case.
I had a deployment of 3 SHs (clustered) joining 2 indexers (clustered). Besides that, I had a server with "master" functionality for the indexer cluster and a "deployer" for the search head cluster.
Hope you don't get me wrong on this, but you do have a deployer configured for your search head cluster, right?
conf_deploy_fetch_url = https://10.192.88.27:8089 (this should be the deployer)
I think you have, so let's continue:
[clustering]
search_server= https://indexer:8089
mode = searchhead
pass4SymmKey = e.g.:splunkisawesome
I'm not sure whether this part of my answer is applicable to your case.
Maybe try again the CLI command:
splunk add search-server <scheme>://<host>:<port> -auth <user>:<password> -remoteUsername <user> -remotePassword <passremote>
Yes, I have a dedicated deployer for my search head cluster.
I did already remove the [clustering] stanza and that error went away. I'm not doing any clustering on the indexer; I actually only have one at the moment. I did just run the splunk add search-server command on all three search head members and now I'm back to the original problem: I get no results when I search.
I'm spinning up a test indexer right now and I'm going to switch my search head cluster over to that to see if I get the same problem. If I don't then I know it has something to do with my original indexer.
Hmmmm, "sh#t", this sounds really weird 😞
I was looking into the config files today, trying to figure out why it's more difficult to connect a non-clustered indexer to a clustered search head tier than two clustered environments, and I find it kind of sad that the Splunk docs are very uninformative in that regard.
Hope the test indexer brings some new insights! I would love to hear about your solution, please keep me (us) updated!
Well, this is embarrassing. It looks like the search head members were connected to the indexer and were able to get data back, but the query I was using to test was not working.
When I run the search below from the web GUI, in the Search & Reporting app, I get nothing on the search head cluster:
host=lelsyslog*
But, if I run this then I do get results back:
index=* host=lelsyslog*
Why would my cluster refuse to return data when I have only one parameter in the search? On top of that, when I run the same search (host=lelsyslog*) directly on the indexer, it does return data.
What tipped me off to this was setting up the test indexer: I was getting the exact same problem there. I set up two quick VMs, one as an indexer and the other with a forwarder on it. After I set up the forwarder and created a simple app to just grab /var/log/messages, I wasn't seeing any data. I thought it might have been because I didn't set something up right, but on a whim I searched with index=* and got results back. I then searched on host and got nothing back, even though that exact host was in the results when I searched on index.
Can someone tell me how to fix this? It almost seems like a configuration issue.
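If I understand the docs correctly (an assumption on my part, not something I've confirmed for this deployment), a search that doesn't name an index only scans the indexes marked as default for your role. So host=lelsyslog* would return nothing if the data lives in a non-default index, while index=* host=lelsyslog* searches everything. The defaults are set per role in authorize.conf, roughly like this (the role and index names are placeholders):

```ini
# authorize.conf (sketch): add the index holding the syslog data to the
# role's default search indexes so "host=..." finds it without "index=*"
[role_admin]
srchIndexesDefault = main;syslog_index
```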
Yes, there is a distsearch.conf in $SPLUNK_HOME/etc/system/local/ on each search head:
[splunk@lelsplunksh02 ~]$ cat ~/etc/system/local/distsearch.conf
[distributedSearch]
servers = https://lelsplunkix01.lehi.micron.com:8089
They are exactly the same on each search head. I'm going to try your next steps to see if that fixes the problem.