How to configure props.conf to extract a field where the regular expression pattern is different based on the sourcetype?

briancronrath
Contributor

I'm having trouble finding a good solution for extracting a "pid" type value that exists in a uri structure but in different locations depending on the sourcetype. The transform performing the extraction depends on other transforms as well.

For instance, we have this all-encompassing stanza in props.conf that extracts a bunch of fields for all "web" type logs:

[(?:::){0}*-web]
KV_MODE=none
REPORT-webservice-extractions = webservice-base-extract, webservice-base-extract-request, webservice-base-extract-uri

The uri transform depends on the request transform, which in turn depends on the base extract. This works fine for most of our web logs, but for several of them the webservice-base-extract-uri transform cannot match, because the uri has a different structure and the pid sits in a different location. I can't write one regex that detects this, because the only reliable way to find the pid is to go a certain number of "levels" deep into the uri. So I figure my only real option is to create new stanzas for those specific sourcetypes, with transforms that duplicate all the fields from webservice-base-extract and webservice-base-extract-request and then use a different regex for the final uri extract. For example:

[source::/var/log/SomeSourceTypeA-web.log]
KV_MODE=none
REPORT-webservice-extractions-SomeSourceTypeA = SomeSourceTypeA-base-extract, SomeSourceTypeA-base-extract-request, SomeSourceTypeA-base-extract-uri

[source::/var/log/SomeSourceTypeB-web.log]
KV_MODE=none
REPORT-webservice-extractions-SomeSourceTypeB = SomeSourceTypeB-base-extract, SomeSourceTypeB-base-extract-request, SomeSourceTypeB-base-extract-uri

And so on for each of these logs where the uri structure is different. My issue with this is that I would have to duplicate so many fields just to get to that final uri extract, because to my knowledge I shouldn't reuse the same fields across different stanzas. What I would rather write is:

[source::/var/log/SomeSourceTypeB-web.log]
KV_MODE=none
REPORT-webservice-extractions-SomeSourceTypeB = webservice-base-extract, webservice-base-extract-request, SomeSourceTypeB-base-extract-uri

Ideally, the stanza above is what I would like to do: reuse the same base transforms without duplicating anything, so that the only difference is the logic used for the final uri extract.

I don't suppose anyone has any suggestions on the "cleanest" way I could accomplish this? Hopefully I've explained the situation properly, let me know if I can clarify anything. We are on Splunk 6.4.0.

1 Solution

briancronrath
Contributor

Sorry everyone, turns out I was overcomplicating things. As long as I had all the same base extractions in my one-off stanzas, it worked just fine, and the field names being the same turned out not to be an issue! So the final stanza I posted in my original question works just fine.
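
For readers who want to see the shape of the setup, here is a minimal sketch of how the shared and per-source transforms might be laid out in transforms.conf. The thread never shows the real transforms, so every regex below, the intermediate field names request and uri, and the path depths are assumptions for illustration only. Chaining through SOURCE_KEY is also an assumption about how the original dependency between the transforms was built.

```ini
# transforms.conf
# Hypothetical sketch: the regexes, the "request" and "uri" field names,
# and the path depths are NOT from the original thread.

[webservice-base-extract]
# Assumed: pulls the quoted request string out of the raw event.
REGEX = "(?<request>[A-Z]+\s+\S+\s+HTTP/[\d.]+)"

[webservice-base-extract-request]
# Assumed: reads the "request" field extracted above and isolates the uri.
SOURCE_KEY = request
REGEX = ^[A-Z]+\s+(?<uri>\S+)

[webservice-base-extract-uri]
# Assumed default layout: pid is the second path segment.
SOURCE_KEY = uri
REGEX = ^/[^/]+/(?<pid>[^/?]+)

[SomeSourceTypeB-base-extract-uri]
# Assumed variant layout: pid is the fourth path segment.
SOURCE_KEY = uri
REGEX = ^/(?:[^/]+/){3}(?<pid>[^/?]+)
```

With a layout like this, the one-off props.conf stanza lists the two shared base transforms followed by its own uri transform, so only the last step differs per source and pid keeps the same field name everywhere.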

aaraneta_splunk
Splunk Employee

Hi briancronrath - Just checking: is the answer you provided the solution to your question? If yes, please don't forget to click "Accept" so others will know it's resolved 🙂 Thanks!


somesoni2
Revered Legend

Within a single sourcetype/source, does the pid appear in different places? If the location/pattern/regex for the pid is the same within one sourcetype/source, you just need to create one REPORT per sourcetype/source, each producing the same field name (possibly with a different regex). There might also be a way to write a single regex that accommodates all possible pid locations, but that will depend on the logs. Could you share a sample entry for each variation where the pid can appear?
