Splunk Search

Fields within fields - search time extract

tyronetv
Communicator

Within my event data I have a file name for a data set that we move around between services.

Input files are sent in a zip file named "< env >.< app >.< client >< site >.< date >.zip". Where:

< env > is the environment as "test", "qa", or "prod"

< app > is the appcode in \w\w\w\d\d\d format

< client > is the 3-digit client number

< site > is the 2-digit site code

< date > is the 3-digit Julian date plus "01" for AM or "02" for PM

Example:

test.abc123.51720.02701.zip

I use the entirety of the file name as my 'source file' (SFIL) to track it through the three systems that touch/move it.

What I would like to do is ALSO track by < app >, < client >, or < site >, which are parts of the previously defined source file (SFIL).

Suggestions?


lukejadamec
Super Champion

Something like:

source="*.zip" | dedup source | rex field=source "^(?<env>\w+)\.(?<app>\w\w\w\d\d\d)\.(?<client>\d{3})(?<site>\d{2})" | stats count by env app client site

I've not tested this rex extraction, but it should be pretty close, and the concept is sound.

Once the fields are extracted, you can pick and choose which of app, env, client, or site to sort on.
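
Since the extraction above is described as untested, one way to check it without touching indexed data is to run it against the example file name from the question. The search below is only a sketch: it assumes the name lives in a field called SFIL (as in the question), writes the named groups as (?<name>...), and adds two hypothetical fields, jday and half, for the date part.

```
| makeresults
| eval SFIL="test.abc123.51720.02701.zip"
| rex field=SFIL "^(?<env>\w+)\.(?<app>\w{3}\d{3})\.(?<client>\d{3})(?<site>\d{2})\.(?<jday>\d{3})(?<half>\d{2})\.zip$"
| table SFIL env app client site jday half
```

For the example name this should give env=test, app=abc123, client=517, site=20, jday=027 and half=01, which matches the layout described in the question.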

lukejadamec
Super Champion

Instead of configuring the config files, you could create a macro that does the extraction.
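
A rough sketch of what such a macro could look like in macros.conf. The macro name sfil_parts is made up for illustration, and the field name SFIL is taken from the question; adjust both to your environment.

```
# macros.conf (hypothetical macro name)
[sfil_parts]
definition = rex field=SFIL "^(?<env>\w+)\.(?<app>\w{3}\d{3})\.(?<client>\d{3})(?<site>\d{2})"

# Usage in a search:
#   SFIL=* | `sfil_parts` | search client=517 site=20 | stats count by app
```

End users would still have to remember to call the macro, so this hides the regex but does not make the fields appear automatically.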


lukejadamec
Super Champion

You did not say you wanted it automatic.


tyronetv
Communicator

This works, and I appreciate you pointing me to it, but I still have to figure out how to set it up in transforms.conf so the fields are visible without teaching the end users to use rex, etc.


tyronetv
Communicator

I mean, instead of only doing SFIL=* I would like to do app=* or site=* or client=*.

For example, app=abc124 site=34 | stats count by hour


martin_mueller
SplunkTrust

No. Your props.conf would contain something like this under the relevant sourcetype:

REPORT-subfields = your_subfields

and transforms.conf would have a matching stanza something like this:

[your_subfields]
SOURCE_KEY = ...
REGEX = ...
FORMAT = ...

Take a look at http://docs.splunk.com/Documentation/Splunk/latest/Knowledge/Createandmaintainsearch-timefieldextrac... for reference.
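
To make the skeleton above more concrete, here is one way the two files might be filled in. This is a sketch, not a tested configuration: the sourcetype name your_sourcetype is a placeholder, SFIL is the field name from the question, and it assumes SFIL already exists as a search-time field by the time this transform runs. FORMAT is left out because, for search-time extractions, named capture groups in REGEX become fields directly.

```
# props.conf
[your_sourcetype]
REPORT-subfields = your_subfields

# transforms.conf
[your_subfields]
# Read from the SFIL field instead of the default _raw
SOURCE_KEY = SFIL
REGEX = ^(?<env>\w+)\.(?<app>\w{3}\d{3})\.(?<client>\d{3})(?<site>\d{2})
```

With something like this in place, a search such as client=517 site=20 should work without any rex in the search string.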


tyronetv
Communicator

So, you are saying, in props I could do something similar to:

EXTRACT-subfields = SOURCE_KEY=SFIL (?<env>\w+).(?P<app>\w{3}\d{3}).(?<client>\d{3})(?P<site>\d{2})
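
For comparison, props.conf does document an inline form that reads from a field other than _raw, written as "<regex> in <src_field>" rather than with SOURCE_KEY. The lines below are a sketch only: your_sourcetype is a placeholder, and whether SFIL is populated early enough for this to work depends on how SFIL itself is extracted, so treat it as untested.

```
# props.conf
[your_sourcetype]
# Inline extraction applied to the SFIL field instead of _raw
EXTRACT-subfields = ^(?<env>\w+)\.(?<app>\w{3}\d{3})\.(?<client>\d{3})(?<site>\d{2}) in SFIL
```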


martin_mueller
SplunkTrust

Exposing the extracted values to the end users without teaching them rex is the point of defining them in props.conf/transforms.conf.

Define your extraction as any REPORT-classname extraction, but use SOURCE_KEY to tell Splunk to read the source field rather than the default _raw.

tyronetv
Communicator

I read the linked answer, and it points the user towards the doc for transforms.conf, specifically the SOURCE_KEY value. But I'm afraid I need more information. The spec file says it can be used to meet my need, but I don't understand how it is entered in transforms.conf to expose the values to my end users without teaching them rex, etc.


Ayn
Legend

I think you may need to explain more clearly - what are you currently getting stuck on? What do you mean by "track" here?
