Splunk Search

Fields within fields - search time extract

tyronetv
Communicator

Within my event data I have a file name for a data set that we move around between services.

Input files are sent in a zip file named "<env>.<app>.<client><site>.<date>.zip", where:

<env> is the environment: "test", "qa", or "prod"

<app> is the appcode in \w\w\w\d\d\d format

<client> is the 3-digit client number

<site> is the 2-digit site code

<date> is the 3-digit Julian date plus "01" for AM or "02" for PM

Example:

test.abc123.51720.02701.zip

I use the entirety of the file name as my 'source file' (SFIL) to track it through the three systems that touch/move it.

What I would like to do is ALSO track by app, client, or site, which are parts of the previously defined source file (SFIL).

Suggestions?

0 Karma
1 Solution

lukejadamec
Super Champion

Something like:

source="*.zip" | dedup source | rex field=source "^(?<env>\w+)\.(?<app>\w\w\w\d\d\d)\.(?<client>\d{3})(?<site>\d{2})" | stats count by env app client site

I've not tested this rex extraction, but it should be pretty close, and the concept is sound.

Once the fields are extracted, you can pick and choose which of env, app, client, or site to filter or sort on.

lukejadamec
Super Champion

Instead of configuring the config files, you could create a macro that does the extraction.
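A rough sketch of what such a macro could look like in macros.conf. The macro name sfil_parts is made up for illustration, and the regex is the same untested pattern as in the search above, so it assumes the field holds the bare file name with no leading path.

```ini
# macros.conf  (the macro name sfil_parts is hypothetical)
[sfil_parts]
# Same pattern as the rex in the search above.
# Change field=source to field=SFIL if that is where the file name lives.
definition = rex field=source "^(?<env>\w+)\.(?<app>\w{3}\d{3})\.(?<client>\d{3})(?<site>\d{2})\."
```

End users would still have to call the macro in the search, for example: source="*.zip" | `sfil_parts` | search client=517 site=20 | stats count by app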

0 Karma

lukejadamec
Super Champion

You did not say you wanted it automatic.

0 Karma

tyronetv
Communicator

This works, and I appreciate you pointing me to it, but I still have to figure out how to set it up in transforms.conf so the fields are visible without teaching the end users to use rex, etc.

0 Karma

tyronetv
Communicator

I mean, instead of only doing SFIL=* I would like to do app=* or site=* or client=*.

For example, app=abc124 site=34 |stats count by hour

0 Karma

martin_mueller
SplunkTrust

No. Your props.conf would contain something like this under the relevant sourcetype:

REPORT-subfields = your_subfields

and transforms.conf would have a matching stanza something like this:

[your_subfields]
SOURCE_KEY = ...
REGEX = ...
FORMAT = ...

Take a look at http://docs.splunk.com/Documentation/Splunk/latest/Knowledge/Createandmaintainsearch-timefieldextrac... for reference.

0 Karma

tyronetv
Communicator

So, you are saying, in props I could do something similar to:

EXTRACT-subfields = SOURCE_KEY=SFIL (?<env>\w+).(?P<app>\w{3}\d{3}).(?<client>\d{3})(?P<site>\d{2})
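For comparison, the inline EXTRACT form in props.conf does not take a SOURCE_KEY setting; the documented syntax is "EXTRACT-<class> = <regex> in <src_field>". A minimal sketch, assuming a placeholder sourcetype name and that SFIL holds the bare file name:

```ini
# props.conf  (your_sourcetype is a placeholder stanza name)
[your_sourcetype]
# Inline form: the regex, then " in <field>" to read from SFIL instead of _raw.
# The literal dots are escaped so they only match the separators in the file name.
EXTRACT-subfields = ^(?<env>\w+)\.(?<app>\w{3}\d{3})\.(?<client>\d{3})(?<site>\d{2})\. in SFIL
```

One caveat: SFIL has to exist already when this extraction runs. Inline EXTRACT settings are applied before REPORT transforms, so if SFIL itself comes from a REPORT, the transforms.conf route with SOURCE_KEY is likely the safer choice.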

0 Karma

martin_mueller
SplunkTrust

Exposing the extracted values to the end users without teaching them rex is the point of defining them in props.conf/transforms.conf.

Define your extraction as any REPORT-classname extraction, but use SOURCE_KEY to tell Splunk to read the source field rather than the default _raw.
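A minimal sketch of what that pair of stanzas might look like for this file name layout. The sourcetype name is a placeholder, the regex is adapted from the rex in the accepted answer and has not been tested against real data, and SOURCE_KEY is set to SFIL on the assumption that the bare file name is already available in that field at search time (use source instead if that is where it lives).

```ini
# props.conf  (your_sourcetype is a placeholder for the real sourcetype)
[your_sourcetype]
REPORT-subfields = your_subfields

# transforms.conf
[your_subfields]
# Read from the SFIL field instead of the default _raw.
SOURCE_KEY = SFIL
# Named capture groups become the field names, so no FORMAT line is needed here.
REGEX = ^(?<env>\w+)\.(?<app>\w{3}\d{3})\.(?<client>\d{3})(?<site>\d{2})\.
```

With something like this in place, a search such as client=517 site=20 | stats count by app should work without any rex in the search string. For test.abc123.51720.02701.zip the pattern is meant to yield env=test, app=abc123, client=517, and site=20.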

tyronetv
Communicator

I read the linked answer and it points the user towards the doc for transforms.conf. Specifically the SOURCE_KEY value. But, I'm afraid I need more information. The spec file says it can be used to meet my need but I don't understand how it is entered in the transforms.conf to expose the values to my end users without teaching them rex, etc.

0 Karma

Ayn
Legend

I think you may need to explain more clearly - what are you currently getting stuck on? What do you mean by "track" here?

0 Karma