
Field extraction from source plus custom sourcetype

mikeklare
New Member

Hello,

I am using the free version of Splunk. I would like to use field extraction (at search time or run time; it does not matter) to extract fields from source and put them in other fields. The source is a tar.gz, and I am also assigning custom sourcetypes to it, although I have not settled on the final design. For example, a source looks like /tmp/data/foo-v6.1-diff.tar.gz:./cgi-bin/whatever/foo/my.php. It gets loaded via ./splunk add oneshot /tmp/data/foo-v6.1-diff.tar.gz, so there are no monitors.

I have tried the approaches from at least 5-7 posts on Splunk Answers and cannot get any of them to work, even though it seems like every one of them should. In each case, I start completely from scratch, so there is no precedence issue left over from an earlier attempt. The sourcetype is always set correctly; I just can't use the fields in any searches.

Attempt 1 using TRANSFORMS

props.conf

[source::....php$(.\d+)?]
TRANSFORM-setrnt = trans1, trans2

example of where there would be multiple sources here

[source::....phph$(.\d+)?]
sourcetype = phph

transforms.conf

[trans1]
SOURCE_KEY = MetaData:Source
REGEX = ^/([a-zA-Z0-9\-\/]*)\/(?<projsite>[a-zA-Z0-9]*)\-v(?<projver>[0-9].[0-9])\-
[trans2]
DEST_KEY = MetaData:Sourcetype
REGEX = (.)
FORMAT = sourcetype::php

fields.conf

[projsite]
INDEXED = true
[projver]
INDEXED = true

I have also used variations here of:

WRITE_META=true
REGEX = ^/([a-zA-Z0-9\-\/]*)\/(?[a-zA-Z0-9]*)\-v([0-9].[0-9])\-
FORMAT = projsite::$2 projver::$3

Attempt 2 using EXTRACT

props.conf

[source::....php$(.\d+)?]
sourcetype = source-php
EXTRACT-sourcefields = ^/([a-zA-Z0-9\-\/]*)\/(?<projsite>[a-zA-Z0-9]*)\-v(?<projver>[0-9].[0-9])\-

Attempt 3 using REPORTS

props.conf

[source::....php$(.\d+)?]
sourcetype = source-php
[source-php]
REPORTS-filename = extract-filename

transforms.conf

[extract-filename]
SOURCE_KEY = MetaData:Source
REGEX = "^/([a-zA-Z0-9\-\/]*)\/(?<projsite>[a-zA-Z0-9]*)\-v(?<projver>[0-9].[0-9])\-"

As a side note, I have verified the REGEX syntax independently with pcregextest and by typing it directly in the searchbox.


hazekamp
Builder

I would recommend doing this at search time via:

## props.conf
[source::....php$(.\d+)?]
REPORT-proj_extract = proj_extract

## transforms.conf
[proj_extract]
SOURCE_KEY = source
REGEX = ^/([a-zA-Z0-9\-\/]*)\/([a-zA-Z0-9]*)\-v([0-9].[0-9])\-
FORMAT = projsite::$2 projver::$3
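As a quick sanity check outside Splunk, the regex above can be run against the sample source path from the question. This is only a sketch: it uses Python's `re` module rather than Splunk's PCRE engine, and the helper name `extract_proj_fields` is made up for illustration. It mirrors the FORMAT line by mapping capture group 2 to projsite and group 3 to projver.

```python
import re

# Sample source value taken from the question.
SOURCE = "/tmp/data/foo-v6.1-diff.tar.gz:./cgi-bin/whatever/foo/my.php"

# Same regex as in the [proj_extract] stanza, using numbered groups.
PATTERN = re.compile(r"^/([a-zA-Z0-9\-\/]*)\/([a-zA-Z0-9]*)\-v([0-9].[0-9])\-")


def extract_proj_fields(source):
    """Hypothetical helper: mimic FORMAT = projsite::$2 projver::$3."""
    match = PATTERN.search(source)
    if match is None:
        return {}
    return {"projsite": match.group(2), "projver": match.group(3)}


print(extract_proj_fields(SOURCE))
```

For the sample path this yields projsite=foo and projver=6.1. Note that the unescaped `.` in `[0-9].[0-9]` matches any character, so a value such as `6x1` would also be accepted; writing `\.` would restrict it to a literal dot.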

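In summary, it looks like you had a combination of errors in your attempts above:

  1. The index-time attribute in props.conf is TRANSFORMS-<class>, not TRANSFORM-<class>.
  2. The search-time attribute in props.conf is REPORT-<class>, not REPORTS-<class>.
  3. EXTRACT operates on _raw by default, so the regex in Attempt 2 was never applied to the source path.
  4. For REPORT-, the SOURCE_KEY in transforms.conf does not need the "MetaData:" prefix; the field name source is enough.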

Hope this helps!


mikeklare
New Member

My lack of attention to detail is stunning. Thanks, it worked!

I had to put Splunk in --debug mode. With the TRANSFORMS approach, information was printed under the PropertiesMapConfig and regexExtractionProcessor categories, but I still did not get indexed fields. With those fixes applied, the REPORT approach worked. Logging should be increased in this area: there is no logging that shows the regex running for REPORT, and with TRANSFORMS the regex succeeded but the indexed fields did not show up. Also, escaping for Windows paths is inconsistent: macros.conf and the search-time rex command need an extra level of escaping, while pcregextest and transforms.conf do not.
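The escaping difference can be illustrated with a small Python sketch. It is an analogy, not Splunk's actual parser: the assumption is that a regex typed into a quoted search string passes through one extra unescaping layer before it reaches the regex engine, whereas a regex in transforms.conf goes to the engine as written. The Windows path and the `unquote` function are hypothetical.

```python
import re

# Hypothetical Windows-style source path, for illustration only.
WINDOWS_SOURCE = r"C:\data\foo-v6.1-diff.tar.gz"

# One parsing layer (like transforms.conf or pcregextest):
# the regex engine itself turns \\ into one literal backslash.
SINGLE_LAYER = r"\\data\\(\w+)-v"


def unquote(text):
    """Hypothetical stand-in for a quoted-string parser that
    collapses each doubled backslash before the regex engine runs."""
    return text.replace("\\\\", "\\")


# Two parsing layers (like a quoted rex expression or a macro):
# every backslash has to be doubled once more to survive unquoting.
DOUBLE_LAYER = r"\\\\data\\\\(\w+)-v"

# After the extra layer is stripped, both forms are the same regex.
assert unquote(DOUBLE_LAYER) == SINGLE_LAYER

match = re.search(unquote(DOUBLE_LAYER), WINDOWS_SOURCE)
print(match.group(1))
```

Under these assumptions, both forms end up matching the same text; the only difference is how many backslashes have to be typed at each layer.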
