Getting hostname from combined logfile

New Member

Our logs are combined on our logserver with scribe and they look like:

[web1] Time: 120807  0:08:21
[web1] Something something
[web1] Something else
[web4] Time: 120807  0:08:25

How can I strip the [web1] from each line and use that as the hostname in Splunk?


Because host is one of the few indexed fields, rather than search-time fields, you will have to do things a little differently. You should not use the Interactive Field Extractor or any other search-time method for creating the host field.

In props.conf, put

[yoursourcetype]
TIME_FORMAT = %y%m%d %H:%M:%S
TRANSFORMS-my-host = extract-my-host

In transforms.conf, put

[extract-my-host]
DEST_KEY = MetaData:Host
REGEX = ^\[(\S+?)]
FORMAT = host::$1

I don't know what sourcetype you gave this data, but you will need to substitute that for yoursourcetype in props.conf. I also threw in a few more settings that will speed up Splunk's parsing of the input stream, and make sure the timestamp is properly interpreted. I assume that this log contains only single-line events.
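
To see how the TIME_FORMAT string lines up with the sample "Time:" lines, here is a rough sketch in Python. It uses Python's own strptime, not Splunk's timestamp parser, so treat it only as a way to read the format string. The parse_time_line helper and the way it looks for "Time:" are assumptions made for this illustration.

```python
from datetime import datetime

# Same format string as the TIME_FORMAT setting in props.conf.
TIME_FORMAT = "%y%m%d %H:%M:%S"


def parse_time_line(line):
    """Hypothetical helper: parse the timestamp that follows 'Time:' on a line.

    Returns None when the line carries no 'Time:' marker.
    """
    marker = "Time:"
    index = line.find(marker)
    if index == -1:
        return None
    raw = line[index + len(marker):].strip()
    # %y = two-digit year, %m = month, %d = day, then hour:minute:second.
    return datetime.strptime(raw, TIME_FORMAT)


print(parse_time_line("[web1] Time: 120807  0:08:21"))
print(parse_time_line("[web1] Something something"))
```

Read this way, 120807 is 7 August 2012 and 0:08:21 is just after midnight. Lines without a "Time:" marker carry no timestamp of their own.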

Also, the REGEX assumes that the host name always appears at the beginning of each line, enclosed in square brackets.
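
If you want to sanity-check the pattern before touching the config, a quick sketch like the following runs the same regular expression over the sample lines from the question. It uses Python's re module rather than Splunk's regex engine, and the extract_host name is made up for this example.

```python
import re

# Same pattern as the REGEX setting in transforms.conf.
HOST_PATTERN = re.compile(r"^\[(\S+?)]")


def extract_host(line):
    """Return the bracketed prefix at the start of a line, or None if absent."""
    match = HOST_PATTERN.match(line)
    return match.group(1) if match else None


sample_lines = [
    "[web1] Time: 120807  0:08:21",
    "[web1] Something something",
    "[web1] Something else",
    "[web4] Time: 120807  0:08:25",
    "a line with no bracketed prefix",
]

for line in sample_lines:
    # The captured group is what FORMAT = host::$1 would write into the host key.
    print(repr(extract_host(line)), "<-", line)
```

A line that does not start with a bracketed name yields no match, so for such a line the transform would not apply.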

Let us know if that doesn't work.

Splunk Employee
You can tell Splunk to extract the field with the Interactive Field Extractor, which is described in the Splunk documentation.

To combine the events, you'll have to use the transaction command within Splunk during a search. A Splunk blog post describes how to achieve this. Let us know if this is not what you are looking for.

