
Field Extraction Regex - optional fields

atornes
Path Finder

I'm terrible with regexes and can't figure out what is wrong with mine...

I have this Regex:

(?P<date>[^,\n]+),\d*--(?P<account>[^,]+),(?P<provider>[^,]+),(?P<product>[^,]+),(?P<service_username>[^,]*),(?P<billing_status>[^,]+),(?P<activities>[^,]*),(?P<requests>[^,]*),(?P<days>[^,]*),(?P<ruuid>[^,]*)

Currently working fine for events like:

2014-04-03,1034--account,provider,product,username,Billable,10073107,,19,e5xwcmhbrk,

We've added another field to the end that is optional for some events (80--parent below):

2014-04-03,1307--account,provider,product,username,Billable,1614,0,,,80--parent

I can't figure out how to extract that "parent" value when it exists (I don't want the preceding "80--"). It should work exactly as the "1034--account" part works in the currently functioning regex. When I try to essentially copy that part of the functioning regex, my queries return only the events that have the parent field.

1 Solution

somesoni2
SplunkTrust

Try this.

"^(?P<date>[^,\n]+),\d+--(?P<account>[^,]+),(?P<provider>[^,]+),(?P<product>[^,]+),(?P<service_username>[^,]*),(?P<billing_status>[^,]+),(?P<activities>[^,]*),(?P<requests>[^,]*),(?P<days>[^,]*),(?P<ruuid>[^,]*)[,\d-]*(?P<parent>.*)"

Updated answer

This will be scalable. For every new field added at the end of the event, you can append "[,\d-]*(?P<newFieldName>[^,]*)" to the regex.

"^(?P<date>[^,\n]+),\d+--(?P<account>[^,]+),(?P<provider>[^,]+),(?P<product>[^,]+),(?P<service_username>[^,]*),(?P<billing_status>[^,]+),(?P<activities>[^,]*),(?P<requests>[^,]*),(?P<days>[^,]*),(?P<ruuid>[^,]*)[,\d-]*(?P<parent>[^,]*)"
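One way to sanity-check the updated pattern outside Splunk is to run it against the two sample events from the question. The following is only a minimal sketch in Python, whose `re` module accepts the same `(?P<name>...)` group syntax; Splunk uses PCRE, so identical behavior is assumed here rather than guaranteed. The names `PATTERN`, `EVENTS`, and `extract` are illustrative and not part of Splunk.

```python
import re

# Updated pattern from the answer above, copied unchanged.
PATTERN = re.compile(
    r"^(?P<date>[^,\n]+),\d+--(?P<account>[^,]+),(?P<provider>[^,]+),"
    r"(?P<product>[^,]+),(?P<service_username>[^,]*),(?P<billing_status>[^,]+),"
    r"(?P<activities>[^,]*),(?P<requests>[^,]*),(?P<days>[^,]*),(?P<ruuid>[^,]*)"
    r"[,\d-]*(?P<parent>[^,]*)"
)

# The two sample events from the question.
EVENTS = [
    "2014-04-03,1034--account,provider,product,username,Billable,10073107,,19,e5xwcmhbrk,",
    "2014-04-03,1307--account,provider,product,username,Billable,1614,0,,,80--parent",
]


def extract(event):
    """Return the named groups as a dict, or None if the event does not match."""
    match = PATTERN.match(event)
    return match.groupdict() if match else None


for event in EVENTS:
    fields = extract(event)
    # parent is an empty string when the optional trailing field is absent.
    print(fields["account"], repr(fields["ruuid"]), repr(fields["parent"]))
```

Both events match. Note that "[,\d-]*" is greedy, so it would also swallow any leading digits or hyphens of the parent value itself.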

carasso
Splunk Employee

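The correct regex is:

"(?P<date>[^,\n]+),\d*--(?P<account>[^,]+),(?P<provider>[^,]+),(?P<product>[^,]+),(?P<service_username>[^,]*),(?P<billing_status>[^,]+),(?P<activities>[^,]*),(?P<requests>[^,]*),(?P<days>[^,]*),(?P<ruuid>[^,]*)(?:,\d*--(?P<parent>.*))?"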

The entire extra argument is encased in one "(?: ... )?". The other regexes suggested above allow invalid things, such as just "5-5,5-5,parent".
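The difference between the two approaches can be sketched in Python, again assuming its `re` module behaves like Splunk's PCRE for these patterns. The field names are taken from the regex in the question; `PREFIX`, `LOOSE`, `STRICT`, `BASE`, and `parent_of` are hypothetical names used only for this illustration.

```python
import re

# Shared prefix: the ten fields from the regex in the question.
PREFIX = (
    r"(?P<date>[^,\n]+),\d*--(?P<account>[^,]+),(?P<provider>[^,]+),"
    r"(?P<product>[^,]+),(?P<service_username>[^,]*),(?P<billing_status>[^,]+),"
    r"(?P<activities>[^,]*),(?P<requests>[^,]*),(?P<days>[^,]*),(?P<ruuid>[^,]*)"
)

# Loose tail: any run of commas, digits and hyphens, then the value.
LOOSE = re.compile(PREFIX + r"[,\d-]*(?P<parent>[^,]*)")
# Strict tail: the whole ",<digits>--<value>" unit is optional as one group.
STRICT = re.compile(PREFIX + r"(?:,\d*--(?P<parent>.*))?")

# Second sample event from the question, minus its trailing field.
BASE = "2014-04-03,1307--account,provider,product,username,Billable,1614,0,,"


def parent_of(pattern, event):
    """Return the captured parent value, or None if it was not captured."""
    match = pattern.match(event)
    return match.group("parent") if match else None


for tail in (",80--parent", ",5-5,5-5,parent"):
    print(tail, parent_of(LOOSE, BASE + tail), parent_of(STRICT, BASE + tail))
```

In this sketch both patterns capture "parent" from the well-formed tail, but only the loose one also captures it from the malformed "5-5,5-5,parent" tail; the strict pattern leaves the group unset there.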


somesoni2
SplunkTrust

For events where this field is not present, the comma is also absent, so it's optional here. See the updated answer for a scalable regex.


atornes
Path Finder

Works. Will this easily allow me to add other fields? It seems weird that it's not comma-delimited like everything else.
