Splunk getting the hostname wrong from ESXi hosts


I have Splunk crawling a /logs directory, which is where it receives most of its data (/logs is populated via syslog-ng). In inputs.conf, I set host_segment = 2 so that the hostname will be set to the second segment in the path. This has been working fine most of the time. Here is the inputs.conf stanza:

 disabled = false
 sourcetype = syslog
 host_segment = 2
 blacklist = \.(bz2|gz)$

But suddenly I'm noticing some strange hostnames on my indexer: "list_primary_nodes", "add_aam_node", "find_active_primary", "shut_down_vmap_proce", etc. I noticed that they're all coming from a series of new servers that have been sending logs to Splunk: ESXi hosts.

The log path is correct:

Here is an event:

09/14/11 19:31:56 [shut_down_vmap_proce] attempt to stop VMap_HOSTNAME failed.

So why is it not setting the hostname to HOSTNAME? Why is it instead using a value captured from inside this admittedly unusual-looking syslog line?



It is possible that props.conf and/or transforms.conf are resetting the host to something extracted from the event. Check the props.conf and transforms.conf files in $SPLUNK_HOME/etc/system/default/. The regexes on my install look like they pull the syslog host from inside the [] in your event, which can be a valid way for syslog to output the hostname. Check the RFC for standard syslog output (which ESXi apparently doesn't comply with).

I don't think your host_segment ends up having any effect here, especially since the sourcetype is syslog, which gets a forced host extracted from the event anyway.

You may want to have syslog-ng write the ESXi hosts' logs to "/logs/esxi" and add a props.conf stanza on the source (i.e. [source...esxi.]) that sets the host name with a regex.
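
As a rough sketch of that idea, the two fragments below assume the ESXi logs end up under /logs/esxi/<hostname>/ (a hypothetical layout, adjust to your real path) and use a made-up transform name, esxi_host_from_path. The source stanza reuses the unnamed TRANSFORMS key on the assumption that the default syslog host rewrite is attached under that same key, so that the source-level setting would take its place. That is worth confirming on your own install (for example with `splunk btool props list syslog --debug`) before relying on it.

```ini
# props.conf (sketch; the path layout /logs/esxi/<hostname>/... is an assumption)
[source::/logs/esxi/...]
# Assumption: the default [syslog] stanza sets its host rewrite via the
# unnamed TRANSFORMS key, so a source:: stanza using the same key overrides it.
TRANSFORMS = esxi_host_from_path

# transforms.conf (esxi_host_from_path is a hypothetical name)
[esxi_host_from_path]
# Match against the source path instead of the raw event text.
SOURCE_KEY = MetaData:Source
# The source key value is prefixed with "source::"; capture the directory
# name that follows /logs/esxi/ as the host.
REGEX = ^source::/logs/esxi/([^/]+)/
DEST_KEY = MetaData:Host
FORMAT = host::$1
```

These are index-time settings, so they would need to live on the instance that parses the data, and they only affect events indexed after the change.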


If this has answered your question, please accept it. Thanks!

