Hello splunkers.
We have about 50 clients registered at our Forwarder Management.
However, for the past couple of days, only 18 of them have been listed in Forwarder Management.
The command $SPLUNK_HOME/bin/splunk list deploy-clients
returns the same 18 clients.
However, I can see data from the other 32 forwarders being indexed in Splunk.
As a test, I ran the following command $SPLUNK_HOME/bin/splunk set deploy-poll IP:8089 -auth user:passwd
on one of the clients that is not listed in Forwarder Management, received the "Configuration updated" message, and restarted it.
However, the client is still not listed in Forwarder Management, and I need to push new apps to it.
Any ideas?
Regards,
GMA
Fixed it.
The UF certificate was expired.
When upgrading the UF I received the following messages:
_It seems that the Splunk default certificates are being used. If certificate validation is turned on using the default certificates (not-recommended), this may result in loss of communication in mixed-version Splunk environments after upgrade.
"/opt/splunkforwarder/etc/auth/ca.pem": certificate renewed
"/opt/splunkforwarder/etc/auth/cacert.pem": certificate renewed
"/opt/splunkforwarder/etc/auth/server.pem": certificate renewed_
After that, the clients were listed again in the Forwarder Management.
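For anyone who wants to rule out the same cause before upgrading, here is a rough sketch that prints the expiry date of the forwarder certificates with openssl. The function name check_cert_expiry is made up for illustration, and the paths in the usage note are the ones from the upgrade messages above; adjust them if your install lives elsewhere.

```shell
#!/bin/sh
# Print the expiry date of each PEM certificate passed as an argument
# and flag the ones that have already expired.
# check_cert_expiry is a hypothetical helper name, not a Splunk command.
check_cert_expiry() {
    for cert in "$@"; do
        if [ ! -f "$cert" ]; then
            echo "$cert: not found"
            continue
        fi
        # -enddate prints a line of the form "notAfter=<date>"
        end=$(openssl x509 -enddate -noout -in "$cert" 2>/dev/null) || {
            echo "$cert: could not read certificate"
            continue
        }
        # -checkend 0 exits non-zero when the certificate is already expired
        if openssl x509 -checkend 0 -noout -in "$cert" >/dev/null 2>&1; then
            echo "$cert: valid ($end)"
        else
            echo "$cert: EXPIRED ($end)"
        fi
    done
}
```

Usage on a UF would look something like: check_cert_expiry /opt/splunkforwarder/etc/auth/cacert.pem /opt/splunkforwarder/etc/auth/server.pem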
I had the same problem, and I discovered that the same GUID was being sent by multiple deployment clients. This is because we use AWS AMIs and the ID file is part of the common configuration baked into the image.
I had to update our install scripts to remove the /opt/splunkforwarder/etc/instance.cfg file. When Splunk starts up, the file is recreated automatically with a new GUID.
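Roughly what that step in an install script could look like. This is only a sketch: reset_instance_guid is a name I made up, the default path is the one mentioned above, and I am assuming it runs while the forwarder is stopped (for example right before the image is created or on first boot before Splunk starts).

```shell
#!/bin/sh
# Remove the per-instance identity file so the forwarder generates a
# fresh GUID the next time it starts.
# reset_instance_guid is a hypothetical helper name.
# $1 = forwarder home directory (defaults to /opt/splunkforwarder)
reset_instance_guid() {
    splunk_home=${1:-/opt/splunkforwarder}
    cfg="$splunk_home/etc/instance.cfg"
    if [ -f "$cfg" ]; then
        rm -f "$cfg" && echo "removed $cfg"
    else
        echo "nothing to remove at $cfg"
    fi
}
```

Calling reset_instance_guid with no argument targets /opt/splunkforwarder; pass a different home directory if the UF is installed elsewhere.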
See also: https://answers.splunk.com/answers/542872/what-do-i-look-at-in-splunkdlog-to-troubleshoot-de.html
Hi guimilare!
The UF logs should steer you to the reason they are not contacting the DS.
First, double check that you can reach the DS from those forwarders using telnet on port 8089.
Then you can check from the UF itself by navigating to $SPLUNK_HOME/var/log/splunk
and running tail -f splunkd.log
or tail -f splunkd.log | grep HttpPubSubConnection
Here is a working UF calling DS:
06-29-2017 14:27:44.875 +0000 INFO HttpPubSubConnection - Running phone...
Or, from the search GUI, if you are receiving _internal
logs from these impacted UFs: index=_internal HttpPubSubConnection
Then you can check from the Deployment Server's perspective in Splunk: index=_internal source=*splunkd.log pubsubsvr OR deploymentserver
or again from the splunkd.log on the DS.
Then finally, btool is your friend! Double check your UF configs:
./splunk btool deploymentclient list --debug
For the hosts that are not listed in Forwarder Management, the search index=_internal HttpPubSubConnection
returns:
06-29-2017 18:01:00.075 +0000 WARN HttpPubSubConnection - Unable to parse message from PubSubSvr:
06-29-2017 18:01:00.075 +0000 INFO HttpPubSubConnection - Could not obtain connection, will retry after=79 seconds.
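If you want a quick tally of how these messages break down on a given UF, something like the following might help. It is a sketch that assumes the splunkd.log layout seen in this thread (date, time, timezone, level, component); count_pubsub_levels is a made-up name.

```shell
#!/bin/sh
# Count HttpPubSubConnection messages per log level.
# Reads the files given as arguments, or stdin if none are given.
# Assumes the field layout: date time tz LEVEL Component - message
# count_pubsub_levels is a hypothetical helper name.
count_pubsub_levels() {
    awk '$5 == "HttpPubSubConnection" { n[$4]++ }
         END { for (l in n) print l, n[l] }' "$@" | sort
}
```

On a forwarder you could run it as count_pubsub_levels $SPLUNK_HOME/var/log/splunk/splunkd.log. A UF that only logs WARN lines and retries, and never the "Running phone" INFO line, is probably not completing its phone home.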
hmm, can you telnet to 8089?
Yes, I can telnet from UF to DS on port 8089
Can you check btool output on the UF?
./splunk btool deploymentclient list --debug
Need to make sure the phone home URI is correct.
This is the result:
$ splunk btool deploymentclient list --debug
/opt/splunkforwarder/etc/system/local/deploymentclient.conf [target-broker:deploymentServer]
/opt/splunkforwarder/etc/system/local/deploymentclient.conf targetUri = 10.217.XX.XXX:8089
The IP is correct, this is the DS IP.
Mine looks like this, fwiw:
[splunker@n00b-splkufw-01 bin]$ ./splunk btool deploymentclient list --debug
/opt/splunkforwarder/etc/apps/n00blab_all_forwarder_deploymentclient/local/deploymentclient.conf [deployment-client]
/opt/splunkforwarder/etc/apps/n00blab_all_forwarder_deploymentclient/local/deploymentclient.conf [target-broker:deploymentServer]
/opt/splunkforwarder/etc/apps/n00blab_all_forwarder_deploymentclient/local/deploymentclient.conf targetUri = 10.10.x.x:8089
Not sure if you just omitted the [deployment-client]
stanza in your paste.
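To compare several forwarders without eyeballing each paste, the two things checked above (the targetUri value and whether a given stanza shows up) can be pulled out of the btool output. A minimal sketch, assuming the --debug format shown in this thread where each line starts with the config file path; both function names are made up.

```shell
#!/bin/sh
# Both helpers read `splunk btool deploymentclient list --debug` output on stdin.
# The function names are hypothetical, not part of the Splunk CLI.

# Print the value of every targetUri setting.
# With --debug, fields are: <conf file path> targetUri = <value>
btool_target_uri() {
    awk '$2 == "targetUri" && $3 == "=" { print $4 }'
}

# Succeed if the stanza named in $1 (without brackets) is present.
btool_has_stanza() {
    grep -F -q "[$1]"
}
```

For example: ./splunk btool deploymentclient list --debug | btool_target_uri, or pipe the same output to btool_has_stanza deployment-client and check the exit status.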
Can we compare the output to one of the UF that is properly calling the DS?
Also, what do the pubsubsvr or DeploymentServer internal logs show you?