Within my event data I have a file name for a data set that we move around between services.
Input files are sent in a zip file named "< env >.< app >.< client >< site >.< date >.zip", where:
< env > is the environment as "test", "qa", or "prod"
< app > is the appcode in \w\w\w\d\d\d format
< client > is the 3-digit client number
< site > is the 2-digit site code
< date > is the 3-digit Julian date plus "01" for AM, or "02" for PM
Example:
test.abc123.51720.02701.zip
I use the entirety of the file name as my 'source file' (SFIL) to track it through the three systems that touch/move it.
What I would like to do is ALSO track by app, client, or site.
Suggestions?
Something like:
source="*.zip" | dedup source | rex field=source "^(?<env>\w+)\.(?<app>\w\w\w\d\d\d)\.(?<client>\d{3})(?<site>\d{2})" | stats count by env app client site
I've not tested this rex extraction, but it should be pretty close, and the concept is sound.
Once the fields are extracted, you can pick and choose which of env, app, client, or site to sort on.
Instead of configuring the config files, you could create a macro that does the extraction.
You did not say you wanted it automatic.
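A minimal sketch of the macro approach, assuming the same rex pattern as the search above with the named groups written as (?<name>...). The macro name extract_sfil_parts is hypothetical, and the pattern assumes the source value starts with the file name rather than a directory path:

```
# macros.conf (macro name is a placeholder, not from the original thread)
[extract_sfil_parts]
definition = rex field=source "^(?<env>\w+)\.(?<app>\w{3}\d{3})\.(?<client>\d{3})(?<site>\d{2})\."
```

Users would still have to call the macro in each search, for example: source="*.zip" | `extract_sfil_parts` | stats count by app client site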
This works, and I appreciate the pointer, but I still have to figure out how to set it up in transforms.conf so the fields are visible without teaching the end users to use rex, etc.
I mean, instead of only doing SFIL=* I would like to do app=* or site=* or client=*.
For example, app=abc124 site=34 |stats count by hour
Take a look at this: http://answers.splunk.com/answers/119984/extracting-fields-from-an-existing-field
No. Your props.conf would contain something like this under the relevant sourcetype:
REPORT-subfields = your_subfields
and transforms.conf would have a matching stanza something like this:
[your_subfields]
SOURCE_KEY = ...
REGEX = ...
FORMAT = ...
Take a look at http://docs.splunk.com/Documentation/Splunk/latest/Knowledge/Createandmaintainsearch-timefieldextrac... for reference.
So, you are saying, in props I could do something similar to:
EXTRACT-subfields = SOURCE_KEY=SFIL (?
Exposing the extracted values to the end users without teaching them rex is the point of defining them in props.conf/transforms.conf.
Define your extraction as any REPORT-classname extraction, but use SOURCE_KEY to tell Splunk to read the source field rather than the default _raw.
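As a rough sketch of what that could look like for the file name format described in the question: the stanza names your_sourcetype and your_subfields are placeholders, and this assumes SFIL is already extracted by the time the transform runs and holds the bare file name.

```
# props.conf (replace your_sourcetype with the real sourcetype)
[your_sourcetype]
REPORT-subfields = your_subfields

# transforms.conf
[your_subfields]
# Read the SFIL field instead of the default _raw
SOURCE_KEY = SFIL
# Named capture groups become field names at search time,
# so FORMAT can be left out for a simple case like this
REGEX = ^(?<env>\w+)\.(?<app>\w{3}\d{3})\.(?<client>\d{3})(?<site>\d{2})\.
```

With something like this in place, a search such as app=abc123 site=20 should work without any rex. If SFIL is itself produced by another REPORT transform, the order in which the transforms run may matter.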
I read the linked answer, and it points toward the doc for transforms.conf, specifically the SOURCE_KEY setting. But I'm afraid I need more information. The spec file says it can be used to meet my need, but I don't understand how it is entered in transforms.conf to expose the values to my end users without teaching them rex, etc.
I think you may need to explain more clearly - what are you currently getting stuck on? What do you mean by "track" here?