
Field Extraction Mystery

Ant1D — Tue, 07 Sep 2010 23:35:58 GMT

Hey,

I would like to use field extraction at search time to do the following:

My source field in Splunk contains file paths. Each file path has a word that I want to extract from it and place into another field.

E.g. the source field contains the file path \helloworld\welcome\TheWord-uvwxyz.1234.log, and the word I want to extract is uvwxyz.

How can I achieve this? Is there a way of doing this using props.conf and/or transforms.conf?

N.B. I do not want to extract data from _raw but from the field named source.

Thanks in advance for your help

Re: Field Extraction Mystery

Brian_Osburn — Tue, 07 Sep 2010 23:42:14 GMT

During your search, you can do something like this:

.. | eval extracted=ltrim(source,"\helloworld\welcome\TheWord-") | eval extracted=rtrim(extracted,".1234.log")
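One thing worth checking before relying on this: as far as I know, ltrim and rtrim treat their second argument as a set of characters to strip, not as a literal prefix or suffix. Python's str.lstrip and str.rstrip behave the same way, so the sketch below uses them as a stand-in (an illustration under that assumption, not Splunk code). The second path, with "world" as the word, is a made-up example to show where the approach can break.

```python
# Stand-in for the eval above, assuming ltrim/rtrim strip any of the listed
# characters (like Python's lstrip/rstrip) rather than a literal prefix/suffix.

def trim_like_eval(source: str) -> str:
    extracted = source.lstrip(r"\helloworld\welcome\TheWord-")
    return extracted.rstrip(".1234.log")

# Path from the original question.
sample = r"\helloworld\welcome\TheWord-uvwxyz.1234.log"
# Hypothetical path whose word uses only characters that are in the trim set.
tricky = r"\helloworld\welcome\TheWord-world.1234.log"

print(trim_like_eval(sample))        # uvwxyz
print(repr(trim_like_eval(tricky)))  # '' (the whole word is stripped away)
```

So it works for the example path, but a regex-based extraction is probably safer if the word can contain any of the trimmed characters.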

Re: Field Extraction Mystery

Ant1D — Tue, 07 Sep 2010 23:58:27 GMT

Would this allow me to populate another field with the extracted words? If so, would I have to keep running this search each time I want to populate another field with this data?

Re: Field Extraction Mystery

hbazan — Wed, 08 Sep 2010 00:03:36 GMT

In my case I use this:

... | rex field=source "basefolder\\\\(?<path>(\w+\\\\)+)(?<filename>.*)\.log"

This gives me both the file path and the filename. For your example I'd do:

... | rex field=source "helloworld\\\\(?<path>(\w+\\\\)+)TheWord-(?<filename>.*)\.log"
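To see what that pattern captures for the path in the question, here is a rough Python sketch. It is a translation, not Splunk syntax: Python spells named groups as (?P<name>...), a raw string needs only a doubled backslash to match one literal backslash, and the dot before log is escaped here so it matches a literal dot.

```python
import re

# Path from the original question.
source = r"\helloworld\welcome\TheWord-uvwxyz.1234.log"

# Same shape as the rex pattern above, rewritten in Python regex syntax.
pattern = re.compile(
    r"helloworld\\(?P<path>(\w+\\)+)TheWord-(?P<filename>.*)\.log"
)

match = pattern.search(source)
print(match.group("path"))      # welcome\
print(match.group("filename"))  # uvwxyz.1234
```

Note that filename comes out as uvwxyz.1234 here rather than just uvwxyz, because .* runs up to the final .log.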

Re: Field Extraction Mystery

Brian_Osburn — Wed, 08 Sep 2010 00:24:24 GMT

It would populate the field "extracted" with whatever it matched.

Re: Field Extraction Mystery

southeringtonp — Wed, 20 Oct 2010 02:57:45 GMT

This will be easier to deal with if you define a permanent extraction.

In transforms.conf:

[extract-filename]
SOURCE_KEY = source
REGEX = TheWord-([^\.]+)
FORMAT = filename::$1

In props.conf:

[yoursourcetype]
REPORT-filename = extract-filename

Tweak the regex to your liking. Change the [yoursourcetype] heading to [host::yourhost] or [source::yoursource] as needed.
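If you want to try a regex against a few sample paths before putting it in transforms.conf, a quick Python check along these lines may help. It is only a sketch: extract_filename is a made-up helper name, the second path is hypothetical, and Python's re module is close to, but not identical to, the regex engine Splunk uses.

```python
import re

# REGEX value from the transforms.conf stanza above, unchanged.
REGEX = r"TheWord-([^\.]+)"

def extract_filename(source):
    """Return capture group 1 (the $1 in FORMAT = filename::$1), or None."""
    m = re.search(REGEX, source)
    return m.group(1) if m else None

# Path from the original question.
print(extract_filename(r"\helloworld\welcome\TheWord-uvwxyz.1234.log"))  # uvwxyz
# Hypothetical path without the TheWord- marker: no match, so no field value.
print(extract_filename(r"\helloworld\welcome\other.log"))                # None
```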

The fact that you are extracting from source is something of a special case, since you can be sure of having that field already populated in the index.

If the field you are extracting from is not host, source, or sourcetype, then you also need to make sure that your field extractions are called in the correct order, so naming becomes important. For example, REPORT-000-fullpath and REPORT-999-filename.
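The numeric prefixes work on the assumption that the REPORT-<class> entries are applied in sorted (ASCII) order of the class name. A tiny Python sketch of that ordering, using the two class names from the example above:

```python
# Class names taken from the example above. The assumption is that
# REPORT-<class> entries are applied in sorted (ASCII) order of <class>.
report_classes = ["999-filename", "000-fullpath"]

for name in sorted(report_classes):
    print(f"REPORT-{name}")
# REPORT-000-fullpath
# REPORT-999-filename
```

So the extraction that produces the intermediate field sorts first, and the one that reads it sorts last.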

Re: Field Extraction Mystery

Ant1D — Wed, 20 Oct 2010 15:41:39 GMT

Thanks for the info southeringtonp. I will give this a test and let you know the results.