dynamic sourcetype extraction problems

Ultracpp
Engager

Hi all,
I am trying to set up dynamic sourcetype extraction, but have had no luck.

A sample message contains JSON:
{"id":"someid","type":"action"}

This is my config:

inputs.conf:


[tcp://9001]
connection_host = none
source=platform

props.conf:


[source::platform]
TRANSFORMS-sourcetype = platform-st

transforms.conf:


[platform-st]
SOURCE_KEY = source
DEST_KEY = MetaData:Sourcetype
REGEX = \"type\":\"([^\"]+)\"
FORMAT = sourcetype::$1

Thank you


hexx
Splunk Employee

I believe that the problem lies with this configuration parameter:

"SOURCE_KEY = source"

From transforms.conf.spec:

SOURCE_KEY = <string>
* NOTE: This attribute is valid for both index-time and search-time field extractions.
* Optional. Defines the KEY that Splunk applies the REGEX to.
* For search time extractions, you can use this attribute to extract one or more values from
the values of another field. You can use any field that is available at the time of the
execution of this field extraction.
* For index-time extractions use the KEYs described at the bottom of this file.
* KEYs are case-sensitive, and should be used exactly as they appear in the KEYs list at
the bottom of this file. (For example, you would say SOURCE_KEY = MetaData:Host, not
SOURCE_KEY = metadata:host .)
* SOURCE_KEY is typically used in conjunction with REPEAT_MATCH in index-time field
transforms.
* Defaults to _raw, which means it is applied to the raw, unprocessed text of all events.

The string "source" is not a valid value for SOURCE_KEY in an index-time transform, because it is not one of the KEYs listed in the spec. I am assuming that your goal is to extract the value to assign to "sourcetype" from the body of your events.

In that case, you should remove the "SOURCE_KEY = source" parameter altogether, which will result in Splunk applying your REGEX to the body of the event (the "_raw" field).
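
As a rough sketch of that suggestion, the stanza from the question would look something like the following once the SOURCE_KEY line is removed. Everything else is copied unchanged from the original post; the comment lines are added here for illustration and are not part of the poster's config.

```ini
# transforms.conf
[platform-st]
# SOURCE_KEY is omitted, so REGEX runs against the default key, _raw
# (the raw text of the event).
DEST_KEY = MetaData:Sourcetype
REGEX = \"type\":\"([^\"]+)\"
FORMAT = sourcetype::$1
```

With the sample event {"id":"someid","type":"action"}, the capture group would hold "action", so the event's sourcetype should be set to action. Because this is an index-time transform, it only affects events indexed after the change is in place.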

gkanapathy
Splunk Employee

You should not specify SOURCE_KEY = source. Presumably, you want to run the regex against the raw data, not the source field.
