Solved: Can someone help me adjust my regex to only captur...

michaeler · ‎05-23-2023

I can't use the field extractor because the field configurations are frequently very different and it gives me errors, so I've been using "| rex" instead.

Can someone help me adjust my regex to only capture "P3820 Houston to A345 Atlanta Line Down" for the field "Details" every time?

| rex field= "(?<Details>.*)\s-\s\d{4}[Z]\s\d{2}\s[a-zA-Z]{3}\s-\s(\d{4}Z\s\d{2}\s[a-zA-Z]{3}|On)"

field examples:
P3820 Houston to A345 Atlanta Line Down - 1339Z 19 May - On-going - TKT39390423

P3820 Houston to A345 Atlanta Line Down - 1339Z 19 May - 0834Z 20 May - TKT39390423

P3820 Houston to A345 Atlanta Line Down - 1339Z 19 MAY - Ongoing - TKT39390423 - 1339Z 19 May - On-going - TKT39390423

P3820 Houston - A345 Atlanta Line Down - 1339Z 19 MAY - Ongoing - INC39390423, DIRJ LLO MM#:394039 - 1339Z 19 May - On-going - TKT39390423

P3820 Houston - A345 Atlanta Line Down - 1339Z 19 MAY - 1834Z MAY - INC39390423, DIRJ LLO MM#:394039 - 1339Z 19 May - 0834Z 20 May - TKT39390423

I don't have any issue with the first two, but when the date/time range is repeated, I end up with everything before the second "1339Z 19 May" included in the "Details" field.
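The symptom described here can be reproduced outside Splunk. The following is a minimal sketch, assuming Python's re module behaves like Splunk's PCRE engine for this pattern; Python spells the named group as (?P<Details>...) rather than (?<Details>...), and the variable names are only illustrative.

```python
import re

# The pattern from the question, rewritten with Python's (?P<name>...) syntax.
GREEDY = re.compile(
    r"(?P<Details>.*)\s-\s\d{4}[Z]\s\d{2}\s[a-zA-Z]{3}\s-\s"
    r"(\d{4}Z\s\d{2}\s[a-zA-Z]{3}|On)"
)

# Third sample from the question: the date/time range appears twice.
sample = (
    "P3820 Houston to A345 Atlanta Line Down - 1339Z 19 MAY - Ongoing - "
    "TKT39390423 - 1339Z 19 May - On-going - TKT39390423"
)

# Greedy .* first consumes the whole line, then backtracks only as far as the
# LAST place where the rest of the pattern fits, i.e. the second timestamp.
greedy_details = GREEDY.search(sample).group("Details")
print(greedy_details)
```

Because .* is greedy, the capture stops at the last timestamp that satisfies the rest of the pattern, not the first, which is why the text up to the second "1339Z 19 May" ends up in Details.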

danspav · ‎06-04-2023

Hi @michaeler,

Here's a regex that extracts everything up to the first " - 1339Z" (any four digits followed by "Z" will match):

| rex field=rows "(?<Details>.+?)\s-\s\d{4}Z"

Here's a query to test it out:

| makeresults
| eval rows="P3820 Houston to A345 Atlanta Line Down - 1339Z 19 May - On-going - TKT39390423@P3820 Houston to A345 Atlanta Line Down - 1339Z 19 May - 0834Z 20 May - TKT39390423@P3820 Houston to A345 Atlanta Line Down - 1339Z 19 MAY - Ongoing - TKT39390423 - 1339Z 19 May - On-going - TKT39390423@P3820 Houston - A345 Atlanta Line Down - 1339Z 19 MAY - Ongoing - INC39390423, DIRJ LLO MM#:394039 - 1339Z 19 May - On-going - TKT39390423@P3820 Houston - A345 Atlanta Line Down - 1339Z 19 MAY - 1834Z MAY - INC39390423, DIRJ LLO MM#:394039 - 1339Z 19 May - 0834Z 20 May - TKT39390423"
| makemv rows delim="@"
| mvexpand rows
| table rows
| rex field=rows "(?<Details>.+?)\s-\s\d{4}Z"
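For anyone without a Splunk instance to hand, the same check can be sketched in Python. This assumes Python's re module treats the lazy quantifier the same way as Splunk's rex does for this pattern; the named group is written (?P<Details>...) because Python does not accept (?<Details>...), and the names rows and details are only illustrative.

```python
import re

# Python spelling of the rex pattern: lazy .+? stops at the FIRST " - NNNNZ".
LAZY = re.compile(r"(?P<Details>.+?)\s-\s\d{4}Z")

# The five sample values from the question.
rows = [
    "P3820 Houston to A345 Atlanta Line Down - 1339Z 19 May - On-going - TKT39390423",
    "P3820 Houston to A345 Atlanta Line Down - 1339Z 19 May - 0834Z 20 May - TKT39390423",
    "P3820 Houston to A345 Atlanta Line Down - 1339Z 19 MAY - Ongoing - TKT39390423"
    " - 1339Z 19 May - On-going - TKT39390423",
    "P3820 Houston - A345 Atlanta Line Down - 1339Z 19 MAY - Ongoing - INC39390423,"
    " DIRJ LLO MM#:394039 - 1339Z 19 May - On-going - TKT39390423",
    "P3820 Houston - A345 Atlanta Line Down - 1339Z 19 MAY - 1834Z MAY - INC39390423,"
    " DIRJ LLO MM#:394039 - 1339Z 19 May - 0834Z 20 May - TKT39390423",
]

# Take only the first match per row, mirroring rex's default of one match.
details = [LAZY.search(row).group("Details") for row in rows]
for value in details:
    print(value)
```

The " - " between "Houston" and "A345" in the last two samples does not end the capture early, because the text after it does not start with four digits and a "Z".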

Cheers,
Daniel

isoutamo · ‎06-05-2023

Hi

A good tool for building a regex is regex101.com. You can write the regex there and see immediately how it behaves. If there is something you cannot solve yourself, you can save it and share the link so that other people can help. Here is your sample and how the PCRE2 engine handles it: https://regex101.com/r/H9vuAk/1. As you can see, it matches more than Splunk's rex does, because rex defaults to max_match=1. In Splunk this works because rex normally returns only the first match. From time to time, though, you need max_match=0, and then it no longer works. If you add ^ as the first character of the regex, it works again, and it is actually slightly more efficient than the version without it (https://regex101.com/r/fD0J9e/1).
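The effect of the ^ anchor can be sketched in Python. This assumes re.findall, which returns every non-overlapping match, is a fair stand-in for rex with max_match=0; the pattern is danspav's, with Python's (?P<Details>...) group syntax, and the variable names are only illustrative.

```python
import re

# Third sample from the question: the date/time range appears twice.
sample = (
    "P3820 Houston to A345 Atlanta Line Down - 1339Z 19 MAY - Ongoing - "
    "TKT39390423 - 1339Z 19 May - On-going - TKT39390423"
)

UNANCHORED = re.compile(r"(?P<Details>.+?)\s-\s\d{4}Z")
ANCHORED = re.compile(r"^(?P<Details>.+?)\s-\s\d{4}Z")

# findall returns the captured group for every match in the string.
all_unanchored = UNANCHORED.findall(sample)
all_anchored = ANCHORED.findall(sample)

print(all_unanchored)  # two values: the second match resumes after the first "1339Z"
print(all_anchored)    # one value: ^ only allows a match at the start of the string
```

Without the anchor, the search resumes after the first match and picks up the text leading to the second timestamp as another value. With the anchor, a match can only begin at the start of the string, so only the intended value is returned.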

r. Ismo
