Fields within fields - search time extract

tyronetv
Communicator

Within my event data I have a file name for a data set that we move around between services.

Input files are sent in a zip file named "< env >.< app >.< client >< site >.< date >.zip", where:

< env > is the environment as "test", "qa", or "prod"

< app > is the appcode in \w\w\w\d\d\d format

< client > is the 3-digit client number

< site > is the 2-digit site code

< date > is the 3-digit julian date plus "01" for AM, or "02" for PM

Example:

test.abc123.51720.02701.zip

I use the entirety of the file name as my 'source file' (SFIL) to track it through the three systems that touch/move it.

What I would like to do is ALSO track by < app >, < client >, or < site >, which are parts of the previously defined source file (SFIL).

Suggestions?


lukejadamec
Super Champion

Something like:

source="*.zip" | dedup source | rex field=source "^(?<env>\w+)\.(?<app>\w\w\w\d\d\d)\.(?<client>\d{3})(?<site>\d{2})" | stats count by env app client site

I've not tested this rex extraction, but it should be pretty close, and the concept is sound.

Once the fields are extracted, you can pick and choose which of app, env, client, or site to sort on.

lukejadamec
Super Champion

Instead of configuring the config files, you could create a macro that does the extraction.
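
A minimal sketch of such a macro, assuming the file name lives in the SFIL field (swap in source if that is where it sits). The macro name sfil_subfields is made up for illustration, and this has not been tested:

```
# macros.conf
# Hypothetical macro name; SFIL as the input field is an assumption.
[sfil_subfields]
definition = rex field=SFIL "^(?<env>\w+)\.(?<app>\w{3}\d{3})\.(?<client>\d{3})(?<site>\d{2})\."
iseval = 0
```

Users would then call it between backticks, e.g. ... | `sfil_subfields` | search client=517 site=20, without having to see the regex.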

lukejadamec
Super Champion

You did not say you wanted it automatic.

tyronetv
Communicator

This works, and I appreciate you pointing me to it, but I still have to figure out how to set it up in transforms.conf so the fields are visible without teaching the end users to use rex, etc.

tyronetv
Communicator

I mean, instead of only doing SFIL=* I would like to do app=* or site=* or client=*.

For example, app=abc124 site=34 | stats count by hour

martin_mueller
SplunkTrust

No. Your props.conf would contain something like this under the relevant sourcetype:

REPORT-subfields = your_subfields

and transforms.conf would have a matching stanza something like this:

[your_subfields]
SOURCE_KEY = ...
REGEX = ...
FORMAT = ...

Take a look at http://docs.splunk.com/Documentation/Splunk/latest/Knowledge/Createandmaintainsearch-timefieldextrac... for reference.
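
For the file name format in the question, the filled-in pair might look roughly like this. It is a sketch, not a tested config: the sourcetype name is a placeholder, and it assumes SFIL is already extracted (for example by an inline EXTRACT-) by the time this transform runs.

```
# props.conf
# your_sourcetype is a placeholder for the real sourcetype name.
[your_sourcetype]
REPORT-subfields = your_subfields

# transforms.conf
[your_subfields]
# Assumption: SFIL holds the zip file name, e.g. test.abc123.51720.02701.zip
SOURCE_KEY = SFIL
REGEX = ^(\w+)\.(\w{3}\d{3})\.(\d{3})(\d{2})\.
FORMAT = env::$1 app::$2 client::$3 site::$4
```

With the example file name this should yield env=test, app=abc123, client=517 and site=20, so end users can search on app=abc123 site=20 directly.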

tyronetv
Communicator

So, you are saying, in props I could do something similar to:

EXTRACT-subfields = SOURCE_KEY=SFIL (?<env>\w+).(?P<app>\w{3}\d{3}).(?<client>\d{3})(?P<site>\d{2})

martin_mueller
SplunkTrust

Exposing the extracted values to the end users without teaching them rex is the point of defining them in props.conf/transforms.conf.

Define your extraction as any REPORT-classname extraction, but use SOURCE_KEY to tell Splunk to read the source field rather than the default _raw.

tyronetv
Communicator

I read the linked answer and it points the user towards the doc for transforms.conf. Specifically the SOURCE_KEY value. But, I'm afraid I need more information. The spec file says it can be used to meet my need but I don't understand how it is entered in the transforms.conf to expose the values to my end users without teaching them rex, etc.

Ayn
Legend

I think you may need to explain more clearly - what are you currently getting stuck on? What do you mean by "track" here?
