dynamic sourcetype extraction problems


Hi all,
I am trying to set up dynamic sourcetype extraction, but with no luck.

The sample message contains JSON:

This is my config:


connection_host = none


TRANSFORMS-sourcetype = platform-st


SOURCE_KEY = source
DEST_KEY = MetaData:Sourcetype
REGEX = \"type\":\"([^\"]+)\"
FORMAT = sourcetype::$1

Thank you

Re: dynamic sourcetype extraction problems

Splunk Employee

You should not specify SOURCE_KEY = source. Presumably, you want to run the regex against the raw data, not the source field.


Re: dynamic sourcetype extraction problems

Splunk Employee

I believe that the problem lies with this configuration parameter:

"SOURCE_KEY = source".

From transforms.conf.spec:

SOURCE_KEY = <string>
* NOTE: This attribute is valid for both index-time and search-time field extractions.
* Optional. Defines the KEY that Splunk applies the REGEX to.
* For search time extractions, you can use this attribute to extract one or more values from
the values of another field. You can use any field that is available at the time of the
execution of this field extraction.
* For index-time extractions use the KEYs described at the bottom of this file.
* KEYs are case-sensitive, and should be used exactly as they appear in the KEYs list at
the bottom of this file. (For example, you would say SOURCE_KEY = MetaData:Host, not
SOURCE_KEY = metadata:host .)
* SOURCE_KEY is typically used in conjunction with REPEAT_MATCH in index-time field transforms.
* Defaults to _raw, which means it is applied to the raw, unprocessed text of all events.

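The string "source" is not a valid value for SOURCE_KEY in an index-time transform; the KEYs list uses names such as MetaData:Source, and they are case-sensitive. I am assuming that your goal is to extract the value to assign to the "sourcetype" from the body of your events.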

In that case, you should remove the "SOURCE_KEY = source" parameter altogether, which will result in Splunk applying your REGEX to the body of the event (the "_raw" field).
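
As a minimal sketch of what the corrected transforms.conf stanza might look like: the stanza name platform-st is inferred from the TRANSFORMS-sourcetype line in the question (the original stanza headers were not shown), and the REGEX is kept as posted, which assumes the events contain a compact "type":"..." pair with no whitespace around the colon.

```ini
[platform-st]
# SOURCE_KEY is omitted on purpose, so the REGEX is applied to _raw (the default).
DEST_KEY = MetaData:Sourcetype
REGEX = \"type\":\"([^\"]+)\"
FORMAT = sourcetype::$1
```

Because this is an index-time transform, it should take effect only on the instance that parses the data (an indexer or heavy forwarder), typically after a restart, and only for events indexed after the change.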

