RHEL7, Splunk/forwarder v8.0.4
I'm setting up a distributed installation (1x head, 2x indexer). There's been quite a bit of back and forth troubleshooting.
When running 'splunk restart', 2 of the 3 manage to start up the web interface as desired, with the correct CA showing up in the browser.
For the remaining one it doesn't, even though the config file /opt/splunk/etc/system/local/web.conf looks identical on all of them.
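For context, the SSL-related part of web.conf is roughly along these lines on all three (placeholder paths, not my exact values):
[settings]
enableSplunkWebSSL = true
serverCert = /opt/splunk/etc/apps/ssl/mycert.pem
privKeyPath = /opt/splunk/etc/apps/ssl/mykey.key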
Another config file, ~/etc/system/local/server.conf, is similar across them, with only serverName and the hashed pass4SymmKey and sslPassword differing. This one is also using the .pem file as serverCert.
Rather than the decrypted .key file, server.conf is running off the encrypted one (in .pem format), with sslPassword supplied in the [sslConfig] section.
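To illustrate, the [sslConfig] stanza looks roughly like this (placeholder paths and values, not my exact ones):
[sslConfig]
serverCert = /opt/splunk/etc/apps/ssl/server.pem
sslRootCAPath = /opt/splunk/etc/apps/ssl/cacert.pem
sslPassword = <hashed password>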
My current question is: which configuration files affect the web interface?
Once the web interface is up (and the second indexer hopefully shows up in 'splunk show cluster-bundle-status'), replication and data integrity would be next, before, in the end, having all the forwarders show up. I have a feeling/hope that all the current issues are related to me messing up the SSL stuff.
If this is the wrong place to ask/post this, I do apologize.
The error experienced looks similar to this:
Waiting for web server at https://127.0.0.1:8000 to be available...(....)
WARNING: web interface does not seem to be available!
splunkd.log contains entries like this:
08-03-2020 13:04:49.883 +0200 WARN HttpPubSubConnection - Unable to parse message from PubSubSvr:
08-03-2020 13:04:49.884 +0200 INFO HttpPubSubConnection - Could not obtain connection, will retry after=68.922 seconds.
08-03-2020 13:04:59.474 +0200 INFO DC:DeploymentClient - channel=tenantService/handshake Will retry sending handshake message to DS; err=not_connected
All or most of these do, however, also appear on the servers where the web interface seems okay and 'splunk restart' runs as expected.
Hi
this:
08-03-2020 13:04:59.474 +0200 INFO DC:DeploymentClient - channel=tenantService/handshake Will retry sending handshake message to DS; err=not_connected
This indicates that your server cannot connect to the DS, and as mentioned in the next answer, it's actually an ERROR rather than an INFO-level message.
You said that you have one SH and two indexers. Do you also have a DS and a CM, or are all of those standalone servers?
r. Ismo
Hi!
First off, thanks for the reply!
The "head" server contains everything except the indexing part.
I checked the link you provided, but can't see any issues with the networking so far.
Index1
[splunk@index1 ~]# splunk btool --debug deploymentclient list
/opt/splunk/etc/system/local/deploymentclient.conf [target-broker:deploymentServer]
/opt/splunk/etc/system/local/deploymentclient.conf targetUri = head:9887
[splunk@index1 ~]# nc -vz head 9887
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connected to head_ip:9887.
Ncat: 0 bytes sent, 0 bytes received in 0.01 seconds.
Index2
[splunk@index2 ~]$ splunk btool --debug deploymentclient list
/opt/splunk/etc/system/local/deploymentclient.conf [target-broker:deploymentServer]
/opt/splunk/etc/system/local/deploymentclient.conf targetUri = head:9887
[splunk@index2 ~]$ nc -vz head 9887
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connected to index2_ip:9887.
Ncat: 0 bytes sent, 0 bytes received in 0.01 seconds.
If the same is attempted with OpenSSL, this is the result (or a small part of it, anyway):
Certificate chain
0 s:/CN=SplunkServerDefaultCert/O=SplunkUser
i:/C=US/ST=CA/L=San Francisco/O=Splunk/CN=SplunkCommonCA/emailAddress=support@splunk.com
Verify return code: 19 (self signed certificate in certificate chain)
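For reference, the OpenSSL check was along these lines (same host and port as the nc test above):
openssl s_client -connect head:9887 -showcerts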
I'm using certificates from another CA (defined in several .conf files).
In server.conf, I've got serverCert (.pem) and sslRootCAPath (.pem), along with sslPassword (hashed password). The Splunk CA isn't mentioned in either of these.
With some investigation, it turns out the certificate being used here comes from the files located in:
/opt/splunk/etc/auth/
Mine are located here: /opt/splunk/etc/apps/ssl/
As for where the default SSL is used:
I see the default ones listed in several files, although I'm not sure how many are of relevance.
/opt/splunk/etc/system/default/server.conf:caCertFile = $SPLUNK_HOME/etc/auth/cacert.pem
I believe anything under /local/ takes priority over this, so it's probably not relevant. The README dirs seem irrelevant too.
This one might be of importance, although I'm not sure when it's generated/updated, or whether it collects info from server.conf or another file.
/opt/splunk/var/run/splunk/merged/server.conf:caCertFile = $SPLUNK_HOME/etc/auth/cacert.pem
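If it's useful, my understanding is that btool can show which value actually wins after merging; something like this (the grep pattern is just illustrative):
splunk btool server list sslConfig --debug | grep -iE 'caCertFile|serverCert|sslRootCAPath'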
This got a bit long, mainly to (hopefully) avoid misunderstandings. I'm rather new to this and may be way off in my thinking here.
Hi
Can you try this:
targetUri= <uri>
* URI of the deployment server.
* An example of <uri>: <scheme>://<deploymentServer>:<mgmtPort>
targetUri = https://head:9887
Personally I prefer FQDN (host.dom.ain) instead of hostname.
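So on the indexers the whole stanza would look something like this:
[target-broker:deploymentServer]
targetUri = https://head.dom.ain:9887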
Are those indexers individual, or are they part of an indexer cluster?
I saw that you have configured that DS part directly in etc/system/local/..... Best practice is to use a separate TA for delivering those configurations to all servers.
r. Ismo
Forgot to mention that - FQDNs are used, but not included here.
They should be clustered - whether I've configured that correctly, I'm not sure.
I've used this on the indexers:
splunk edit cluster-config -mode slave -master_uri (...)
And this on head:
splunk edit cluster-config -mode searchhead -master_uri
[clustering]
master_uri = https://head_fqdn:8089
mode = slave
pass4SymmKey = <string>
~/etc/system/local/server.conf @ index1 and 2
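For completeness, the documented general form of those commands is roughly this (placeholders rather than my exact values):
splunk edit cluster-config -mode slave -master_uri https://<cm_fqdn>:8089 -replication_port <port> -secret <pass4SymmKey>
splunk edit cluster-config -mode searchhead -master_uri https://<cm_fqdn>:8089 -secret <pass4SymmKey>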
Running 'splunk list cluster-peers' on head tells me that index1 has status "Up", but index2 is not listed.
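To double-check index2, I believe I can run this on it to confirm the clustering settings actually took effect (assuming I'm reading the CLI docs right):
splunk list cluster-config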
The /local/ path is currently used because, well, it worked. I would like to follow best practices though - 'TA' - what does that stand for? I have a basic understanding of some of the components, but am far from fluent.
If/when those are clustered, never configure them with the DS! The only allowed way is to deploy them with the CM. The same goes for SHC members (but you don't have those).
Now I'm a little bit confused about your architecture 😞
Here, 'installation' means each server has its own binaries under /opt/splunk/xxx or somewhere else.
Are my assumptions correct, or what kind of setup do you have? And is this already in "production" use, or can it be done from scratch again if needed?
r. Ismo
If I understand this correctly, clustered indexers should not be configured with a deployment server (DS)? And they need to be managed with the cluster master/manager (CM) instead?
A total of three hosts/nodes: head, index1 and index2.
My theory was that having head as the "master" doing everything aside from indexing was a logical move.
I am not sure if there's some place in the web interface or elsewhere to see a list of all active roles a node has.
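My guess, based on the REST API docs, is that something like this would list a node's roles (I think the field to look for is server_roles, but I haven't verified that):
curl -k -u admin:<password> "https://localhost:8089/services/server/info?output_mode=json"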
Each node has its own installation under /opt/splunk/, but each is configured with different roles.
I am not sure if I'm getting all the abbreviations correct. The setup is not yet in production - I want it fully operational before I declare it ready.
Hi
There are some links which you should read through:
You can easily find a couple more instruction sets if needed.
But in short, what those say:
You cannot share servers between CM, peers, SH, or DS. This basically means the minimum you need is 4-5 servers, depending on how many clients you will have. If you have more than 50 clients, then the DS must also be a separate server. If you have anything other than Windows clients, you must run the DS on Linux; it can handle all clients, whereas a Windows DS can handle only Windows clients!
When you have a distributed environment, I also propose that you install the MC (monitoring console) to watch and alert on your Splunk environment. And don't install it on a normal SH.
Based on that, I would recommend at minimum the following:
Depending on your data volumes, all or some of those could be virtual machines or hosted in the cloud, e.g. on AWS.
One other reference for you is the Splunk Validated Architectures document, which presents Splunk's recommendations (https://www.splunk.com/en_us/blog/tips-and-tricks/splunk-validated-architectures.html). There are also some .conf presentations on it.
I hope these instructions and documentation help you get your Splunk environment up and running!
r. Ismo
Sorry about the delay here.
Found a "start to finish" guide for a slightly older version, and started over with that. One missing element was to enable "distributed" on the head-server. It's running 1 head and 2 indexers. While this might not be ideal, it seems sufficient for now.
In short, the guide used ./splunk <command>, rather than modifying all config files with a text editor.
Thanks for all your guidance; as I learn more, I'll look into whether a more distributed setup would be beneficial here.
This is considered solved, although beyond the GUI switch from "standalone" to "distributed" mode, I can't say exactly what fixed it. Several mistakes were cleared up regarding which configuration files were in use/prioritized and such.