
Regex transform on a dynamic default host?



I have Splunk on some Windows VMs that run on ESX hosts. For each guest, the ESX hosts generate customized performance data that I'd like to ingest into Splunk as well, in order to correlate the hosts' data with the data reported from Splunk within the guest OS.

The problem is that the custom performance data is saved under a different name than the guest OS, so I can't make a direct comparison if I just ingest the data. For instance, a guest OS might be named foo01456, but the ESX host knows it as bar01456, so Splunk would consider them two different machines. The machine name is not contained within the custom data, only in its file path (e.g. \\dir1\dir2\bar01456\data.log). What are my options? It looks like the default host settings in inputs.conf can only take a portion of the path verbatim as the host, whereas I'd want to transform it from bar01456 to foo01456. Transforms.conf has a similar option, but I think the machine name has to be somewhere within the data for that to work. Do I understand this correctly? Is the name correlation better done at search time? Do I have to resort to a custom script?


Splunk Employee

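You can remap the host at index time by using props.conf and transforms.conf like this:

props.conf, in the stanza that matches this input (a [source::...] or sourcetype stanza):

TRANSFORMS-hostname = esx_remap_host

transforms.conf:

[esx_remap_host]
# SOURCE_KEY makes the regex run against the full path to the logfile
# rather than the event text.
SOURCE_KEY = MetaData:Source
DEST_KEY = MetaData:Host
REGEX = /dir1/dir2/([^/]+)/data\.log
FORMAT = host::$1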

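This, however, assumes the new hostname appears verbatim in the file path. In your case, where only the prefix differs (bar01456 versus foo01456), you might have to try something like:

REGEX = /dir1/dir2/prefix(\d+)/data\.log
FORMAT = host::newprefix$1

Here prefix and newprefix stand in for bar and foo. If the source is recorded with Windows backslashes, as in your \\dir1\dir2\... example, each separator in the regex needs to be written as \\ instead of /.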

I haven't tried this myself.
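
Since the transform is untested, it may be worth checking the regex against a few sample paths before putting it on an indexer. Below is a minimal Python sketch of the same capture-and-rewrite idea. It is not Splunk code: the bar and foo prefixes come from the question, the remap_host helper is made up for illustration, and the pattern accepts either slash direction as an assumption about how the source path is recorded. Splunk uses PCRE, but a pattern this simple should behave the same way in Python's re module.

```python
import re

# Prefixes taken from the question: the ESX host knows the guest as
# bar01456, while the guest OS reports itself as foo01456.
OLD_PREFIX = "bar"
NEW_PREFIX = "foo"

# Same shape as the REGEX in the transform, but accepting either slash
# direction (assumption) and escaping the dot in "data.log".
PATTERN = re.compile(
    r"[\\/]dir1[\\/]dir2[\\/]" + re.escape(OLD_PREFIX) + r"(\d+)[\\/]data\.log"
)


def remap_host(source):
    """Return the host value the transform would write, or None if no match."""
    match = PATTERN.search(source)
    if match is None:
        return None
    # Mirrors FORMAT = host::newprefix$1
    return "host::" + NEW_PREFIX + match.group(1)


if __name__ == "__main__":
    samples = (
        r"\\dir1\dir2\bar01456\data.log",
        "/dir1/dir2/bar01456/data.log",
        "/dir1/dir2/other/data.log",
    )
    for path in samples:
        print(path, "->", remap_host(path))
```

A path that does not match returns None here, which corresponds to Splunk leaving the default host untouched when the REGEX does not match.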
