Solved: How to properly connect a search head cluster to a search peer?

zipmaster07 · ‎10-13-2016

I'm having a very hard time connecting my search head cluster to my search peer. I have carefully stepped through the search head clustering documentation here: http://docs.splunk.com/Documentation/Splunk/6.5.0/DistSearch/SHCdeploymentoverview
I have successfully installed my deployer and added the [shclustering] stanza to the /opt/splunk/etc/system/local/server.conf file and added the pass4SymmKey and shcluster_label.

I then ran splunk init shcluster-config on each of my search head members and restarted Splunk. Each one ran successfully without any reported errors. I'm also able to run splunk bootstrap shcluster-captain without any issues and splunk show shcluster-status doesn't report any problems:

[splunk@lelsplunksh02 ~]$ splunk show shcluster-status

 Captain:
                          dynamic_captain : 1
                          elected_captain : Thu Oct 13 15:48:05 2016
                                       id : C2403815-55A2-413E-AF26-4998CFD9508F
                         initialized_flag : 1
                                    label : lelsplunksh03
                         maintenance_mode : 0
                                 mgmt_uri : https://splunkserver:8089
                    min_peers_joined_flag : 1
                     rolling_restart_flag : 0
                       service_ready_flag : 1

 Members:
        lelsplunksh02
                                    label : lelsplunksh02
                                 mgmt_uri : https://splunkserver:8089
                           mgmt_uri_alias : https://xx.xxx.xx.xxx:8089
                                   status : Up
        lelsplunksh04
                                    label : lelsplunksh04
                                 mgmt_uri : https://splunkserver:8089
                           mgmt_uri_alias : https://xx.xxx.xx.xxx:8089
                                   status : Up
        lelsplunksh03
                                    label : lelsplunksh03
                                 mgmt_uri : https://splunkserver:8089
                           mgmt_uri_alias : https://xx.xxx.xx.xxx:8089
                                   status : Up
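
For anyone who wants to check this output without reading it by eye, here is a minimal sketch that lists members whose status is not "Up". It assumes the text layout shown above (a "Members:" header followed by "label : ..." and "status : ..." lines); the function name members_not_up is made up for illustration and is not part of Splunk.

```shell
# Hypothetical helper: reads `splunk show shcluster-status` text on stdin
# and prints the label of every member whose status is not "Up".
members_not_up() {
  awk -F' : ' '
    { sub(/[[:space:]]+$/, "") }                      # drop trailing whitespace
    /^[[:space:]]*Members:/ { in_members = 1; next }  # ignore the Captain section
    in_members && $1 ~ /label$/  { label = $2 }
    in_members && $1 ~ /status$/ { if ($2 != "Up") print label }
  '
}
```

Usage would be something like `splunk show shcluster-status | members_not_up`; no output means every member reported Up.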

My problem starts when I try to add my search peer. I only have one indexer and I'm following this doc: http://docs.splunk.com/Documentation/Splunk/6.5.0/DistSearch/Connectclustersearchheadstosearchpeers

I'm running:

splunk add search-server https://splunkserver:8089 -auth admin:pswd -remoteUsername admin -remotePassword pswd

This also runs successfully, but I'm just not getting any results when I connect to my search head and run a search. I can run the exact same search on the indexer itself and it returns results. I can't see any errors in logs on either the indexer or the search head members.

Any help would be appreciated to point me in the right direction.

horsefez · ‎10-13-2016

Hi there zipmaster,

First off, I had my own problems with the search head <-> indexer connection. It's easy to make a mistake here.

After you run the command, there should be a distsearch.conf in $SplunkHome/etc/system/local.

Could you tell me if there is one?
If yes, could you post its contents here, too?

Thanks!
pyro_wood

(PS: I'll be at work in approx. 1.5 h, so I will post parts of my own guide from when I tried to set up distributed search)

EDIT:

If you would like to delete the existing cluster config on a search head (to start over), do the following:

On every SH, run these commands:
splunk remove shcluster-member
(wait approx. 1 minute)
splunk stop
splunk clean all
splunk start

Now you should have clean SHs without any cluster config.

Initiate the SH cluster config:
On every SH, add the following to server.conf (adjust the values for each SH):

[shclustering]
conf_deploy_fetch_url = https://deployer:8089
disabled = 0
mgmt_uri = https://sh1:8089
# example values, choose your own
pass4SymmKey = splunkisawesome
shcluster_label = SH-Cluster_1

Restart Splunk afterwards:
splunk restart

Initialize Cluster-Captain with this command:
splunk bootstrap shcluster-captain -servers_list "https://sh1:8089,https://sh2:8089,https://sh3:8089"

(it takes a while)

then do:
splunk show shcluster-status
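
The -servers_list value above is easy to mistype by hand. As a small sketch, the string can be built from member hostnames; the function name build_servers_list is hypothetical, and it assumes every member uses https and management port 8089, as in the example.

```shell
# Hypothetical helper: joins hostnames into the comma-separated URI list
# expected by -servers_list. Assumes https and management port 8089.
build_servers_list() (
  port=8089
  list=""
  for host in "$@"; do
    list="${list}${list:+,}https://${host}:${port}"
  done
  printf '%s\n' "$list"
)
```

It could then be used as: splunk bootstrap shcluster-captain -servers_list "$(build_servers_list sh1 sh2 sh3)"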

Next steps:
On every search head, create a stanza called [clustering] in server.conf:

[clustering]
search_server = https://indexer:8089
mode = searchhead
# example value, choose your own
pass4SymmKey = splunkisawesome

Then restart Splunk:
splunk restart

Try it out! Sometimes this does the trick already.

If not (and I don't know why this only happens occasionally), do these steps as well:

Now you need to set up authentication for the indexers:

Copy via scp (or other) every "trusted.pem" from every SH:
/opt/splunk/etc/auth/distServerKeys/trusted.pem

to the indexers into the corresponding file:
$SplunkHome/etc/auth/distServerKeys//
$SplunkHome/etc/auth/distServerKeys//
$SplunkHome/etc/auth/distServerKeys//

(if those directories aren't there, create them)

Restart the indexer:
splunk restart

HOPE THIS HELPS 😉
Just ask, if you have any further questions.

Steve_G_ · ‎10-17-2016

I downvoted this post because the user's problem was not with establishing connectivity between search heads and search peers. Rather, the problem was with the query used to test connectivity. The accepted answer provides a procedure that, if followed by others, would cause them to completely rebuild their search head cluster, among other issues.

This answer should be removed, as it could easily mislead anyone who does have search peer connectivity issues.

Steve_G_ · ‎10-14-2016

You might want to double check the management port on your indexer, to make sure that it's 8089.

See http://docs.splunk.com/Documentation/Splunk/6.5.0/Admin/Changedefaultvalues#Use_Splunk_Web_2

zipmaster07 · ‎10-14-2016

Yes, the indexer is listening on 8089. This is my production indexer:

[splunk@lelsplunkix01 ~]$ netstat -tupan | grep 8089
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
tcp        0      0 0.0.0.0:8089            0.0.0.0:*               LISTEN      4015/splunkd
tcp        0      0 10.192.88.157:8089      10.192.88.149:9393      ESTABLISHED 4015/splunkd
tcp        0      0 10.192.88.157:8089      10.192.88.105:13056     ESTABLISHED 4015/splunkd
tcp        0      0 10.192.88.157:46334     10.192.88.156:8089      ESTABLISHED 4015/splunkd

I can telnet to that server on that port as well from my search head members.
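
The telnet check above can also be scripted from each search head member. This is only a sketch: port_open is a made-up name, and it relies on bash's /dev/tcp feature and the coreutils timeout command being available on the host.

```shell
# Hypothetical helper: returns 0 if a TCP connection to host:port
# succeeds within 3 seconds, non-zero otherwise.
# Inside `bash -c`, the first extra argument becomes $0 and the second $1.
port_open() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Example call (not run here):
#   port_open 10.192.88.157 8089 && echo reachable || echo unreachable
```

This only proves that the TCP port accepts connections, which is the same thing telnet shows; it says nothing about authentication or about whether searches return data.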

horsefez · ‎10-13-2016

Maybe also have a look here
http://docs.splunk.com/Documentation/Splunk/6.5.0/DistSearch/Configuredistributedsearch#Edit_distsea...

zipmaster07 · ‎10-14-2016

I think I'm getting closer. I've gone through the above steps and I'm now getting an error when I connect to the web interface of my search head cluster:

The searchhead is unable to update the peer information. Error = 'Unable to reach master'.

And I can see this in the log on the search head captain:

10-14-2016 09:24:21.121 -0600 ERROR ClusteringMgr - VerifyMultisiteConfig failed Error=failed method=GET path=/services/cluster/master/info/?output_mode=json master=? rv=0 actual_response_code=502 expected_response_code=200 status_line="Error resolving: Name or service not known" socket_error="Cannot resolve hostname"
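
A long splunkd line like the one above is easier to read field by field. Here is a minimal sketch that pulls a single key=value field out of such a line; log_field is a made-up name, and it assumes fields are separated by whitespace and that values are either quoted or contain no spaces, as in this line. Note that master comes back as "?", which may mean the search head had no master address to resolve; that reading is an assumption, not something confirmed in this thread.

```shell
# Hypothetical helper: reads log lines on stdin and prints the value of
# the key given as $1 (quoted values are printed with their quotes).
log_field() {
  sed -nE "s/.*[[:space:]]$1=(\"[^\"]*\"|[^[:space:]]*).*/\1/p"
}

# Example call (not run here):
#   grep ClusteringMgr splunkd.log | log_field socket_error
```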

I've switched to using IP addresses everywhere, so I'm not sure what hostname it cannot resolve. Here is what one of my server.conf files currently looks like:

[splunk@lelsplunksh03 ~]$ cat ~/etc/system/local/server.conf
[general]
serverName = lelsplunksh03
pass4SymmKey = $1$lwa1+e7fvdG8

[sslConfig]
sslKeysfilePassword = $1$wbrhsiub9Bgw

[lmpool:auto_generated_pool_download-trial]
description = auto_generated_pool_download-trial
quota = MAX
slaves = *
stack_id = download-trial

[lmpool:auto_generated_pool_forwarder]
description = auto_generated_pool_forwarder
quota = MAX
slaves = *
stack_id = forwarder

[lmpool:auto_generated_pool_free]
description = auto_generated_pool_free
quota = MAX
slaves = *
stack_id = free

[replication_port://34567]

[shclustering]
conf_deploy_fetch_url = https://10.192.88.27:8089
disabled = 0
mgmt_uri = https://lelsplunksh03.lehi.micron.com:8089
pass4SymmKey = $1$m5u8/t4toaFEoGHz
shcluster_label = leshcluster01
id = 3C3740AF-9647-442D-BB08-9AE318070A85

[clustering]
search_server = https://10.192.88.157:8089
mode = searchhead
pass4SymmKey = $1$mfu4/tS1GkFIot0z

And I've changed my distsearch.conf file too:

[splunk@lelsplunksh03 ~]$ cat ~/etc/system/local/distsearch.conf
[distributedSearch]
servers = https://10.192.88.157:8089

horsefez · ‎10-14-2016

Hi,

I think I know why my instructions aren't 100% correct in your case.

I had a deployment of 3 SH's (clustered) joining 2 Indexers (clustered).
Besides that I had a server with "Master" functionality for the indexer-cluster
and a "Deployer" for the search-head cluster.

Don't get me wrong on this, but you did configure a deployer for your search head cluster beforehand, right?
conf_deploy_fetch_url = https://10.192.88.27:8089 (this should be the deployer)

I think you have, so let's continue:

[clustering]
search_server = https://indexer:8089
mode = searchhead
# example value, choose your own
pass4SymmKey = splunkisawesome

I'm not sure whether this part of my answer applies to your case.
Maybe try the CLI command again:

splunk add search-server <scheme>://<host>:<port> -auth <user>:<password> -remoteUsername <user> -remotePassword <passremote>

zipmaster07 · ‎10-14-2016

Yes, I have a dedicated deployer for my search head cluster.

I already removed the [clustering] stanza and that error went away. I'm not doing any clustering on the indexer; I actually only have one at the moment. I just ran that splunk add search-server command on all three search head members and now I'm back to the original problem: I get no results when I search.

I'm spinning up a test indexer right now and I'm going to switch my search head cluster over to that to see if I get the same problem. If I don't then I know it has something to do with my original indexer.

horsefez · ‎10-14-2016

Hmmmm "sh#t" this sounds really weird 😞
I was looking into the config files today, trying to figure out why it's more difficult to connect non-clustered to clustered than two clustered environments, and I find it kind of sad that the Splunk docs are so uninformative about this task.

I hope the test indexer brings some new insights! I would love to hear about your solution, pls keep me (us) updated!

zipmaster07 · ‎10-14-2016

Well, this is embarrassing. It looks like the search head members were connected to the indexer and were able to get data back, but the query I was using to test was not working.

When I run the search below in the Search & Reporting app in the web GUI, I get nothing on the search head cluster:

host=lelsyslog*

But, if I run this then I do get results back:

index=* host=lelsyslog*

Why would my cluster refuse to return data when I have one parameter in the search? On top of that, when I run the same search (host=lelsyslog*) directly on the indexer, it does return data.

What tipped me off to this was setting up the test indexer. I was getting the exact same problem. I set up two quick VMs: one as an indexer, the other to put a forwarder on. After I set up the forwarder and created a simple app to just grab /var/log/messages, I wasn't seeing any data. I thought it might have been because I didn't set something up right, but on a whim I searched on index=* and got results back. I then searched on host and got nothing back, even though that exact host was in the results when I searched on index.

Can someone tell me how to fix this? It almost seems like a configuration issue.
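
One common cause of this pattern, not confirmed in this thread: a search without an index= term only looks at the indexes that the user's role searches by default, which is controlled by srchIndexesDefault in authorize.conf on the instance where the search is run. If the data lives in an index that is not in that list, host=lelsyslog* returns nothing while index=* host=lelsyslog* does. Here is a minimal sketch to print that setting for a role; default_indexes_for_role is a made-up name, and the file path you pass in is up to you.

```shell
# Hypothetical helper: prints the srchIndexesDefault value for a role.
#   $1 = role name without the "role_" prefix (e.g. admin)
#   $2 = path to an authorize.conf file
default_indexes_for_role() {
  awk -v stanza="[role_$1]" '
    { sub(/[[:space:]]+$/, "") }               # drop trailing whitespace
    /^\[/ { in_role = ($0 == stanza); next }   # track the current stanza
    in_role && /^[[:space:]]*srchIndexesDefault[[:space:]]*=/ {
      sub(/^[^=]*=[[:space:]]*/, "")
      print
    }
  ' "$2"
}
```

Comparing the value on a search head with the value on the indexer would show whether the two differ. A single file may not tell the whole story, since Splunk merges authorize.conf from several locations.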

zipmaster07 · ‎10-14-2016

Yes, there is a distsearch.conf in the $SplunkHome/etc/system/local/ directory on each search head:

[splunk@lelsplunksh02 ~]$ cat ~/etc/system/local/distsearch.conf
[distributedSearch]
servers = https://lelsplunkix01.lehi.micron.com:8089

They are exactly the same on each search head. I'm going to try your next steps to see if that fixes the problem.
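
To confirm that the peers really are identical on every search head, the servers line can be extracted and compared. A minimal sketch, assuming the [distributedSearch] stanza layout shown above; list_search_peers is a made-up name.

```shell
# Hypothetical helper: prints each search peer URI from the
# [distributedSearch] stanza of the given distsearch.conf, one per line.
list_search_peers() {
  awk '
    { sub(/[[:space:]]+$/, "") }                                # drop trailing whitespace
    /^\[/ { in_stanza = ($0 == "[distributedSearch]"); next }  # track the current stanza
    in_stanza && /^[[:space:]]*servers[[:space:]]*=/ {
      sub(/^[^=]*=[[:space:]]*/, "")
      n = split($0, peers, /[[:space:]]*,[[:space:]]*/)
      for (i = 1; i <= n; i++) if (peers[i] != "") print peers[i]
    }
  ' "$1"
}
```

Running list_search_peers ~/etc/system/local/distsearch.conf on each member and comparing the output would show any mismatch.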
