Hi Splunkers,
I have a list of servers that have the Splunk UF running on them. These servers are not showing up in my Deployment server.
I have verified that the deployment server is enabled with:
/opt/splunk/bin/splunk display deploy-server
I have also verified that my UFs are pointing to the correct IP address and management port (8089) of my Deployment Server with:
/opt/splunkforwarder/bin/splunk show deploy-poll
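For reference, that command just reads deploymentclient.conf on the forwarder. A minimal sketch of what the relevant stanza should look like (the server name here is a placeholder):
[target-broker:deploymentServer]
targetUri = deployment.server.company.com:8089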
Finally, I have tested a telnet connection from one of these "problem" UFs to my Deployment Server over port 8089. The telnet connection to the Deployment Server was successful.
I have asked the UNIX team here to send me a copy of Splunkd.log from one of the "problem" servers. What should I be looking for in this file that would clearly show connection issues between the UF and the deployment server?
Are there any other troubleshooting steps I should try besides what I've already done? I'm trying to sort out whether this is a Splunk issue vs. something else on the network causing the issue.
Ps - Even though these UFs aren't showing up in my Deployment Server, they ARE successfully sending logs to my indexers over TCP 9997.
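In the meantime, my plan is to grep the UF's splunkd.log for deployment-client messages once I get it; a rough sketch, assuming the default install path (the component names are my best guess at what to look for):
# search the UF's splunkd.log for deployment-client activity
grep -iE "DeploymentClient|HttpPubSubConnection" /opt/splunkforwarder/var/log/splunk/splunkd.log | tail -50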
So I've been poking around a bit further and I think this might be a duplicate GUID issue.
I believe I have a bunch of servers in my environment with the same GUID set in instance.cfg. This probably resulted from cloning VMs when the servers were built.
By default, the "client name" that shows up when a forwarder checks in with the Deployment Server is set to that deployment client's GUID. Since we have a whole bunch of servers with the same GUID, it is likely that only one of them shows up at a time and the rest don't (the Deployment Server tracks only one client per GUID).
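For reference, the GUID lives in a [general] stanza in instance.cfg; a sketch of what the file looks like (the value below is made up):
[general]
guid = AAAAAAAA-BBBB-CCCC-DDDD-EEEEEEEEEEEE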
I'm seeing a bunch of events in splunkd.log on my deployment server that show the properties changing every time systems with the duplicate GUID phone home. Here's an example:
05-30-2017 13:56:02.465 -0400 WARN ClientSessionsManager - Client with Id has changed some of its properties on the latest phone home.Old properties are: ip= dns= hostname= build=67571ef4b87d uts=linux-x86_64 name=. New properties are: ip= dns= build=67571ef4b87d uts=linux-x86_64 name=.
I'm having one of our admins delete instance.cfg from /opt/splunkforwarder/etc/ and restart the splunk forwarder on one of the servers with the duplicate GUID to see if this fixes it by spawning a new, unique GUID.
Confirmed: This issue was definitely caused by a duplicate GUID in /opt/splunkforwarder/etc/instance.cfg on multiple Universal Forwarders in my environment.
Deleting instance.cfg from /opt/splunkforwarder/etc/ and restarting the Splunk forwarder generates a new instance.cfg with a unique GUID - after that, the forwarder shows up as a deployment client on my deployment server.
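For anyone who wants the exact steps, this is roughly what we ran on each affected forwarder (assuming the default install path):
/opt/splunkforwarder/bin/splunk stop
rm /opt/splunkforwarder/etc/instance.cfg    # a fresh one with a new GUID is generated on start
/opt/splunkforwarder/bin/splunk start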
Do you know what causes all the systems to have the same GUID? I think I have the same issue. Testing now.
Cheers, I had the same issue 🙂
Great! I learned about that, thank you 🙂
If you can receive logs from the UF, is it possible to search the UF's splunkd.log from the Splunk instance?
Or can you only look at that data on the client itself?
And is the best practice when cloning a VM with a UF to delete instance.cfg manually first?
Not sure about the best practice with cloning a VM, but my hunch is that generally you'd want to clean out any config files that uniquely identify an endpoint, so that you avoid duplication and/or conflicts.
It is possible to search a Universal Forwarder's splunkd.log data from your Splunk Enterprise instance. I think by default, forwarders are allowed to forward events from splunkd.log; that data lands in the "_internal" index.
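For example, a search along these lines should surface a given forwarder's internal logs (the host value is a placeholder):
index=_internal host=my-problem-uf sourcetype=splunkd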
I found more information re:
"And the best practice to clone a VM with UF is to delete the instance.cfg manual at first?"
Please check out this article on making a universal forwarder part of a system image:
http://docs.splunk.com/Documentation/Splunk/6.3.1/Forwarding/Makeadfpartofasystemimage
The basic summary is:
After installing the universal forwarder on your Gold image source machine, stop the forwarder and run the CLI command splunk clone-prep-clear-config before capturing the image.
This clears instance-specific information, such as the server name and GUID, from the forwarder. This information will then be configured on each cloned forwarder at initial start-up.
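In other words, the gold-image prep looks roughly like this (a sketch based on that doc, assuming the default install path):
/opt/splunkforwarder/bin/splunk stop
/opt/splunkforwarder/bin/splunk clone-prep-clear-config
# capture the VM image now; each clone generates its own GUID on first start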
Hope this will help:)
Does anyone know what causes this? I had never seen this before now. The Unix team deployed the UF to around 20 systems and they all ended up with the same GUID.
From your description, it seems like you are doing everything OK.
Did you restart the forwarder after setting deploymentclient.conf? /opt/splunkforwarder/bin/splunk set deploy-poll <deployment.server.company.com:8089>
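i.e., something like this (server name is a placeholder):
/opt/splunkforwarder/bin/splunk set deploy-poll deployment.server.company.com:8089
/opt/splunkforwarder/bin/splunk restart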
also read here:
http://docs.splunk.com/Documentation/Splunk/latest/Updating/Configuredeploymentclients
and here:
https://wiki.splunk.com/Deploy:DeploymentServer
I did restart the forwarder after setting deploymentclient.conf. No luck unfortunately.
I also turned off iptables and restarted splunkd on my deployment server...
Still nothing...
Do you see other forwarders on your DS forwarder management screen?
I see other forwarders on my DS in forwarder management (431 clients currently). I believe this is an issue with just a handful of servers.