What do I look at in splunkd.log to troubleshoot deployment client issues?

vanderaj2
Path Finder

Hi Splunkers,

I have a list of servers that have the Splunk UF running on them. These servers are not showing up in my Deployment server.

I have verified that the deployment server is enabled with:

/opt/splunk/bin/splunk display deploy-server

I have also verified that my UFs are pointing to the correct IP address and port (8089) of my deployment server with:

/opt/splunkforwarder/bin/splunk show deploy-poll

Finally, I have tested a telnet connection from one of these "problem" UFs to my Deployment Server over port 8089. The telnet connection to the Deployment Server was successful.

I have asked the UNIX team here to send me a copy of splunkd.log from one of the "problem" servers. What should I be looking for in this file that would clearly show connection issues between the UF and the deployment server?

Are there any other troubleshooting steps I should try besides what I've already done? I'm trying to sort out whether this is a Splunk issue vs. something else on the network causing the issue.

P.S. Even though these UFs aren't showing up in my deployment server, they ARE successfully sending logs to my indexers over TCP 9997.
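
For anyone starting from the same question, here is a minimal sketch of how a copy of a forwarder's splunkd.log could be narrowed down to warnings and errors that look deployment-related. The line layout is taken from the splunkd.log sample quoted later in this thread. The keyword list and the function name `deployment_problems` are assumptions made for illustration, not an official list of Splunk messages, so adjust them to what your forwarder version actually logs.

```python
import re

# Layout based on the splunkd.log line quoted in this thread:
# "MM-DD-YYYY HH:MM:SS.mmm +ZZZZ LEVEL Component - message"
LINE_RE = re.compile(
    r"^(?P<ts>\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2}\.\d{3} [+-]\d{4}) "
    r"(?P<level>[A-Z]+)\s+(?P<component>\S+) - (?P<message>.*)$"
)

# Assumption: substrings that tend to show up in deployment-client messages.
DEFAULT_KEYWORDS = ("deploy", "phonehome", "handshake")


def deployment_problems(lines, keywords=DEFAULT_KEYWORDS, levels=("WARN", "ERROR")):
    """Return (timestamp, level, component, message) for matching log lines."""
    hits = []
    for line in lines:
        m = LINE_RE.match(line.rstrip("\n"))
        if not m or m["level"] not in levels:
            continue  # not a log line, or not a level we care about
        haystack = (m["component"] + " " + m["message"]).lower()
        if any(k in haystack for k in keywords):
            hits.append((m["ts"], m["level"], m["component"], m["message"]))
    return hits
```

Typical use would be `deployment_problems(open("splunkd.log", encoding="utf-8", errors="replace"))`, then reading the returned messages by hand.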

1 Solution

vanderaj2
Path Finder

So I've been poking around a bit further and I think this might be a duplicate GUID issue.

I believe I have a bunch of servers in my environment with the same GUID set in instance.cfg. This probably resulted from cloning VMs when the servers were built.

By default, the "client name" that shows up when a forwarder appears in the deployment server is set to that deployment client's GUID. Since we have a whole bunch of servers with the same GUID, it is likely that only one of them shows up and the rest don't, since only one client can hold that specific GUID at any one time.

I'm seeing a bunch of events in splunkd.log on my deployment server that show the properties changing every time systems with the duplicate GUID phone home. Here's an example:

05-30-2017 13:56:02.465 -0400 WARN ClientSessionsManager - Client with Id has changed some of its properties on the latest phone home.Old properties are: ip= dns= hostname= build=67571ef4b87d uts=linux-x86_64 name=. New properties are: ip= dns= build=67571ef4b87d uts=linux-x86_64 name=.

I'm having one of our admins delete instance.cfg from /opt/splunkforwarder/etc/ and restart the splunk forwarder on one of the servers with the duplicate GUID to see if this fixes it by spawning a new, unique GUID.
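
To get a feel for how often this happens, here is a rough sketch that counts those ClientSessionsManager warnings per minute in a copy of the deployment server's splunkd.log. The pattern is built only from the sample line above; the function name `property_change_counts` is made up for illustration. A steady stream of hits would be consistent with several clients sharing one GUID, but it is a hint, not proof.

```python
import re
from collections import Counter

# Matches the WARN line quoted above and captures the timestamp up to the minute.
PATTERN = re.compile(
    r"^(?P<minute>\d{2}-\d{2}-\d{4} \d{2}:\d{2}):\d{2}\.\d{3} [+-]\d{4} "
    r"WARN\s+ClientSessionsManager - .*has changed some of its properties"
)


def property_change_counts(lines):
    """Count ClientSessionsManager property-change warnings per minute."""
    counts = Counter()
    for line in lines:
        m = PATTERN.match(line)
        if m:
            counts[m["minute"]] += 1
    return counts
```

Calling it on the open log file and printing `counts.most_common(10)` shows the busiest minutes.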

vanderaj2
Path Finder

Confirmed: This issue was definitely caused by a duplicate GUID in /opt/splunkforwarder/etc/instance.cfg on multiple Universal Forwarders in my environment.

Deleting instance.cfg from /opt/splunkforwarder/etc/ and restarting the splunk forwarder spawns a new instance.cfg with a unique GUID - I then see the forwarder as a deployment client on my deployment server.
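
If you can collect instance.cfg from a set of hosts, a small script can show which ones share a GUID before you start deleting files. This is a sketch under the assumption that instance.cfg holds the GUID as a `guid` setting in a `[general]` stanza. The helper names and the host-to-text mapping are hypothetical, and how you gather the files is up to you.

```python
import configparser
from collections import defaultdict


def read_guid(instance_cfg_text):
    """Return the guid from the [general] stanza, or None if it is missing."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(instance_cfg_text)
    return parser.get("general", "guid", fallback=None)


def duplicate_guids(cfg_by_host):
    """Map each GUID seen on more than one host to the sorted list of hosts."""
    hosts_by_guid = defaultdict(list)
    for host, text in cfg_by_host.items():
        guid = read_guid(text)
        if guid:
            hosts_by_guid[guid.lower()].append(host)
    return {g: sorted(h) for g, h in hosts_by_guid.items() if len(h) > 1}
```

Any host listed in the result is a candidate for the delete-and-restart fix described above.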

dacosta123
Explorer

Do you know what causes all the systems to have the same GUID? I think I have the same issue. Testing now.

robertlynch2020
Motivator

Cheers, I had the same issue 🙂

ggssa2000
Explorer

Great! I learned about that, thank you 🙂

If you can receive logs from the UF, is it possible to search the UF's splunkd.log from a Splunk instance?
Or can you only look at the data on the client?

And is the best practice when cloning a VM with a UF to delete instance.cfg manually first?

vanderaj2
Path Finder

Not sure about the best practice with cloning a VM, but my hunch is that generally, you'd want to clean out any config files that uniquely identify an endpoint so that you would avoid duplication and/or conflicts.

It is possible to search splunkd.log data from a Universal Forwarder from your Splunk Enterprise instance. I think by default, the forwarders should be allowed to forward events from splunkd.log. The index that captures the splunkd.log data should be "_internal".

vanderaj2
Path Finder

I found more information re:

"And is the best practice when cloning a VM with a UF to delete instance.cfg manually first?"

Please check out this article on making a universal forwarder part of a system image:
http://docs.splunk.com/Documentation/Splunk/6.3.1/Forwarding/Makeadfpartofasystemimage

The basic summary is:
After installing the universal forwarder on your gold image source machine:

  1. Stop the universal forwarder.
  2. Run this CLI command on the forwarder: ./splunk clone-prep-clear-config
     This clears instance-specific information, such as the server name and GUID, from the forwarder. This information will then be configured on each cloned forwarder at initial start-up.
  3. Prep your image or virtual machine, as necessary, for cloning.
  4. Distribute system image or virtual machine clones to machines across your environment and start them.
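
The first two steps above could be scripted on the gold image roughly like this. It is only a sketch: the install path default is the one used elsewhere in this thread, the function names and the `dry_run` switch are made up for illustration, and it defaults to printing the commands rather than running them.

```python
import subprocess


def clone_prep_commands(splunk_home="/opt/splunkforwarder"):
    """Build the two CLI calls from the summary above: stop, then clear config."""
    splunk = splunk_home.rstrip("/") + "/bin/splunk"
    return [[splunk, "stop"], [splunk, "clone-prep-clear-config"]]


def run_clone_prep(splunk_home="/opt/splunkforwarder", dry_run=True):
    """Print the commands, or run them in order when dry_run is False."""
    commands = clone_prep_commands(splunk_home)
    for cmd in commands:
        if dry_run:
            print("would run:", " ".join(cmd))
        else:
            # check=True stops at the first command that fails
            subprocess.run(cmd, check=True)
    return commands
```

Run it with `dry_run=True` first to confirm the paths, then with `dry_run=False` on the image source machine before you capture the image.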

ggssa2000
Explorer

  1. First, run "| metadata type=hosts" in the search bar and check whether logs are being received from your host.
  2. If they are, but the host is not showing up in the deployment server, then the problem is in your UF's config. Run "/opt/splunkforwarder/bin/splunk set deploy-poll " (on Windows, from the client's directory at C:\Program Files\SplunkUniversalForwarder\bin), restart the UF, and check $SPLUNK_HOME/etc/system/local. If it worked, a deploymentclient.conf file exists there.
  3. If they are not, there may be no connection from your client to your Splunk server. Check the Splunk server's ports and iptables policy (you can disable iptables temporarily and re-enable it when you finish troubleshooting) and the routing table (sometimes ping or telnet works fine, but the connection is still affected by an incorrect routing table).
  4. As mentioned above, check that the client's port and host firewall policy are correct for the Splunk service.

Hope this will help:)
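
Where telnet is not installed on the client, the basic port check from the steps above can be approximated with a few lines of Python. This is a sketch: the function name is made up, and port 8089 is simply the management port mentioned in this thread. As the original poster found, a successful TCP connection only rules out basic network and firewall problems; it does not rule out something like a duplicate GUID.

```python
import socket


def can_connect(host, port=8089, timeout=3.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, `can_connect("deployment.server.company.com")` run from a problem forwarder, using your own deployment server's name.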

dacosta123
Explorer

Does anyone know what causes this? I had never seen this before now. The Unix team deployed the UF to around 20 systems and they all had the same GUID.

adonio
Ultra Champion

From your description, it seems like you are doing everything OK.
Did you restart the forwarder after setting deploymentclient.conf? /opt/splunkforwarder/bin/splunk set deploy-poll <deployment.server.company.com:8089>
Also read here:
http://docs.splunk.com/Documentation/Splunk/latest/Updating/Configuredeploymentclients
and here:
https://wiki.splunk.com/Deploy:DeploymentServer
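
To double-check what the forwarder actually has on disk, something like the sketch below can read the target out of a copy of deploymentclient.conf. It assumes the usual layout, where the deployment server is set as `targetUri` in the `[target-broker:deploymentServer]` stanza; the function name is made up, and Python's configparser is only an approximation of how Splunk itself reads .conf files.

```python
import configparser


def deployment_target(conf_text):
    """Return targetUri from [target-broker:deploymentServer], or None."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    # configparser lower-cases option names, and get() does the same for lookups.
    return parser.get("target-broker:deploymentServer", "targetUri", fallback=None)
```

If this returns None, or a host and port you did not expect, the forwarder is not pointing where you think it is.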

vanderaj2
Path Finder

I did restart the forwarder after setting deploymentclient.conf. No luck unfortunately.

I also turned off iptables, and restarted Splunkd on my deployment server...

Still nothing....

adonio
Ultra Champion

Do you see other forwarders on your DS forwarder management screen?

vanderaj2
Path Finder

I see other forwarders on my DS in forwarder management (431 clients currently). I believe this is an issue with a handful of servers.....
