How to edit my regular expression for a multivalue field extraction with new lines?

Hello,

I need regex help. I've spent almost all day on this and only came up with the pattern below, which is very sloppy. I feel like it could be more efficient, and it doesn't actually work: when I plug it into the "I'll define my own regular expression" section of Splunk's field extractor, it doesn't do anything.

My Regex:

^Job Dependencies:\s*[([]*(\w+_\w+_\w+_\w+_\w+)[)\]]*|,\s+[([]*(\w+_\w+_\w+_\w+_\w+)[)\]]*,\n|\G\s*[([]*(\w+_\w+_\w+_\w+_\w+)[)\]]*,*

I only need the Job Dependencies. I know I need to turn them into a multivalue field so that the expected Splunk stats list output looks like this:

Job Name                             Job Dependencies
ABC_Job                              ABC_ABC_AB2_123_ABC123
                                     ABC_ABC_AB2_123_123ABC
                                     BCA_BCA_12A_ABC_123ABC
                                     DDD_AAA_CCC_12_123ABC

(I don't need help with the Splunk search; I'm just showing it so you know what I'm trying to achieve.)

The data also has a "Job Prerequisites:" section with similarly formatted data. My regex would capture that as well, but I don't want it.

Please help. Sample data below:

Job Name :          Job ID:
ABC_Job              ADF123

Job Prerequisites: (ABC_ABC_AB2_123_ABC123, AB1_ABC_AB2_123_123ABC)

Job Dependencies: (ABC_ABC_AB2_123_ABC123, ABC_ABC_AB2_123_123ABC,
                  BCA_BCA_12A_ABC_123ABC, DDD_AAA_CCC_12_123ABC)

THERE'S A CATCH: sometimes the "Job Dependencies" section contains square brackets, or just one dependency. For example:

Job Dependencies: (ABC_ABC_AB2_123_ABC123, [ABC_ABC_AB2_123_123ABC],
                  BCA_BCA_12A_ABC_123ABC, DDD_AAA_CCC_12_123ABC)

OR

Job Dependencies: (DDD_AAA_CCC_12_123ABC)

Pretty much, I am trying to find the values with underscores (_) after Job Dependencies. I can't get my regex to handle the line wrap or work correctly.

Any help is greatly appreciated.

Thanks,

John

Re: How to edit my regular expression for a multivalue field extraction with new lines?

Setting the other fields aside and focusing only on the troublesome multivalued Job Dependencies, here is something you can try.

Assuming each event has only one Job Dependencies: section, first use rex to pull the whole parenthesized list into a single field jd, then split it on commas into a multivalue field multiJD. mvexpand then gives one row per value, and a final replace strips the leftover square brackets and whitespace (including the line break) from each value:

your query to filter the events
| rex "your rex to get the job name"
| rex field=_raw "Job Dependencies:\s*\((?<jd>[^\)]+)"
| eval multiJD=split(jd, ",")
| mvexpand multiJD
| eval multiJD=replace(multiJD, "[\[\]\s]", "")
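
To sanity-check the two-step idea outside Splunk, here is a minimal Python sketch. It is not SPL: Python's re module stands in for rex, the sample text is the event from the question, and the function name extract_dependencies is made up for illustration. It captures only the parenthesized list after "Job Dependencies:", splits it on commas, and strips brackets and whitespace from each piece.

```python
import re

# Sample event copied from the question (bracketed, wrapped variant).
SAMPLE = """Job Name :          Job ID:
ABC_Job              ADF123

Job Prerequisites: (ABC_ABC_AB2_123_ABC123, AB1_ABC_AB2_123_123ABC)

Job Dependencies: (ABC_ABC_AB2_123_ABC123, [ABC_ABC_AB2_123_123ABC],
                  BCA_BCA_12A_ABC_123ABC, DDD_AAA_CCC_12_123ABC)
"""

def extract_dependencies(event):
    # Step 1: grab everything between "(" and ")" after the label.
    # [^)]+ also matches newlines, so a wrapped list is captured whole.
    block = re.search(r"Job Dependencies:\s*\(([^)]+)", event)
    if block is None:
        return []
    # Step 2: split on commas, then strip brackets and whitespace.
    pieces = (p.strip(" \t\r\n[]") for p in block.group(1).split(","))
    return [p for p in pieces if p]

print(extract_dependencies(SAMPLE))
```

Because the first pattern is anchored on the literal label, the similarly formatted "Job Prerequisites:" list is never looked at.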

Re: How to edit my regular expression for a multivalue field extraction with new lines?

Try this; it will create a multivalued field:

... | rex max_match=4 "(?ms)(?<Job_Dependency>[^\(\),\[\]\s]+)"
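
One thing to watch, sketched here in Python rather than SPL: the character class matches any run of characters that is not a parenthesis, bracket, comma, or whitespace, so it is not tied to the "Job Dependencies:" label. The snippet uses Python's re in place of rex (with (?P<name>...) syntax, since Python does not accept (?<name>...)) and emulates max_match=4 with islice; the helper name first_matches is hypothetical. On a string holding only the list it returns the four dependencies, but on the full sample event the first four matches are header words, assuming rex scans _raw from the start in the same way.

```python
import re
from itertools import islice

# Same character class as the rex above, in Python named-group syntax.
TOKEN = re.compile(r"(?P<Job_Dependency>[^(),\[\]\s]+)")

def first_matches(text, limit=4):
    # Emulates rex max_match=<limit>: keep at most `limit` matches, in order.
    return [m.group("Job_Dependency") for m in islice(TOKEN.finditer(text), limit)]

list_only = ("(ABC_ABC_AB2_123_ABC123, [ABC_ABC_AB2_123_123ABC],\n"
             "                  BCA_BCA_12A_ABC_123ABC, DDD_AAA_CCC_12_123ABC)")
full_event = ("Job Name :          Job ID:\n"
              "ABC_Job              ADF123\n\n"
              "Job Dependencies: " + list_only)

print(first_matches(list_only))   # the four dependency names
print(first_matches(full_event))  # header tokens come first
```

If the events contain more than the dependency list, it may be safer to run the pattern against a field that holds only the list, or to consume the label first.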

Re: How to edit my regular expression for a multivalue field extraction with new lines?

To expand on woodcock's code: here's a way to generate test data, followed by his rex and a slightly more complicated rex that you can modify as you like to eliminate any text before the dependencies.

| makeresults
| eval MyDeps = mvappend(
 "Job Dependencies: (ABC_ABC_AB2_123_ABC123, [ABC_ABC_AB2_123_123ABC], BCA_BCA_12A_ABC_123ABC, DDD_AAA_CCC_12_123ABC)",
 "Job Dependencies: ([ABC_ABC_AB2_123_123ABC], BCA_BCA_12A_ABC_123ABC, [DDD_AAA_CCC_12_123ABC])",
 "Job Dependencies: (DDD_AAA_CCC_12_123ABC)",
 "Job Dependencies: ([DDD_AAA_CCC_12_123ABC])",
 "Job Dependencies: (ABC_ABC_AB2_123_ABC123, ABC_ABC_AB2_123_123ABC, BCA_BCA_12A_ABC_123ABC, DDD_AAA_CCC_12_123ABC)")
| mvexpand MyDeps
| rename MyDeps as _raw

Everything above this point just makes some test data.

| rex max_match=10 "(?ms)(?<Job_Dep_Rex1>[^\(\),\[\]\s]+)"
| rex max_match=10 "(?ms)((?:Job Dependencies: )|(?<Job_Dep_Rex2>[^\(\),\[\]\s]+))"
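
The difference between the two patterns can be checked in Python (again a stand-in for rex, not SPL). In the second pattern the alternation consumes the literal "Job Dependencies: " without capturing it, so the named group is empty for that match and only the dependency names remain. The names REX1, REX2, and captured are invented here, and dropping empty captures is an assumption about how the resulting multivalue field ends up looking.

```python
import re

# Python equivalents of the two rex patterns (named group renamed to "dep").
REX1 = re.compile(r"(?P<dep>[^(),\[\]\s]+)")
REX2 = re.compile(r"((?:Job Dependencies: )|(?P<dep>[^(),\[\]\s]+))")

def captured(pattern, text):
    # Collect the named group from every match, skipping matches where
    # the group did not participate (the label branch of REX2).
    return [m.group("dep") for m in pattern.finditer(text) if m.group("dep")]

line = ("Job Dependencies: ([ABC_ABC_AB2_123_123ABC], "
        "BCA_BCA_12A_ABC_123ABC, [DDD_AAA_CCC_12_123ABC])")

print(captured(REX1, line))  # includes "Job" and "Dependencies:"
print(captured(REX2, line))  # dependency names only
```

The same trick extends to other labels: adding another non-capturing alternative for any text you want skipped keeps it out of the captured values.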