Splunk Search

How to write conditional regex to extract field A, but if field A does not exist, then extract field B?

philallen1
Path Finder

Hi

I have a problem in Splunk's regex and I can't figure it out for the life of me.

I'm going to simplify my problem a bit. Imagine this is my data:

|a|b|

If 'a' exists, I want my regex to pick out 'a' only, otherwise I want it to pick out 'b' only.

So in this case: |a|b| my regex should pick out 'a'. But, if for some reason 'a' didn't exist in my data, I want the same bit of regex to pick out 'b'.

|a|b| ---> should pick out 'a' only

|c|b| ---> should pick out 'b' only

'b' always exists, but 'a' sometimes doesn't.

Any ideas?

Thanks a lot!

Phil

0 Karma

woodcock
Esteemed Legend

Try this (I may be taking you too literally, but the pieces that you need are here):

(?J)(?:^|[\r\n])\|(?:(?<MyField>a)\|b\|)|(?:c\|(?<MyField>b)\|)

https://regex101.com/r/5Peq78/1
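
For anyone who wants to try the alternation idea outside Splunk, here is a minimal Python sketch. It is an illustration rather than the exact pattern above: Python's re module has no (?J) option for duplicate group names, so the two branches use separate group names. The names first, fallback and pick_field are made up for this example.

```python
import re

# Branch 1 matches "|a|b|" and captures 'a'; branch 2 matches "|c|b|" and captures 'b'.
# Python's re does not allow duplicate group names, so each branch gets
# its own (hypothetical) name instead of sharing MyField.
PATTERN = re.compile(
    r"^\|(?:(?P<first>a)\|b\||c\|(?P<fallback>b)\|)",
    re.MULTILINE,
)

def pick_field(line):
    """Return 'a' for |a|b|, 'b' for |c|b|, and None for anything else."""
    m = PATTERN.search(line)
    if m is None:
        return None
    # Only one of the two groups takes part in any given match.
    return m.group("first") or m.group("fallback")
```

In Splunk itself the (?J) form lets both branches write to the same field, so no second step is needed there.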

0 Karma

to4kawa
Ultra Champion
| makeresults 
| eval _raw="2014-08-29 16:09:00,830 INFO Order filled for 8=FIX.4.4|9=667|15=GBP|55=USD
2014-08-27 16:11:19,373 INFO Order filled for 8=FIX.4.4|9=667|55=USD" 
| makemv delim="
" _raw 
| stats count by _raw 
`comment("this is sample log")`
`comment("the logic")`
| rex "^(?<_time>[\w-]+ \d\d:\d\d:\d\d,\d{3}) .+ for (?<message>.+)" 
| eval _time=strptime(_time,"%F %T")
| eval message=split(message,"|")
| stats count by _time message
| rex field=message "(?<key>\w+)=(?<value>.+$)"
| eval {key} = value
| fields - message count key value
| stats values(*) as * by _time

This uses @philallen1's sample.
I don't think a conditional regex is necessary here.
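
The same split-into-key/value idea can be sketched in Python for readers who want to check the logic outside Splunk. This is only an approximation of the search above, not a translation of it; the names parse_event and currency are hypothetical.

```python
import re

# Timestamp, then any text, then " for ", then the pipe-delimited message.
LINE = re.compile(r"^(?P<time>[\d-]+ \d\d:\d\d:\d\d,\d{3}) .+ for (?P<message>.+)$")

def parse_event(raw):
    """Split the message on '|' and turn each key=value pair into a dict entry."""
    m = LINE.match(raw)
    if m is None:
        return {}
    fields = {"_time": m.group("time")}
    for pair in m.group("message").split("|"):
        key, sep, value = pair.partition("=")
        if sep:  # skip fragments that contain no '='
            fields[key] = value
    return fields

def currency(raw):
    """Prefer tag 15; fall back to tag 55 when 15 is absent."""
    fields = parse_event(raw)
    return fields.get("15", fields.get("55"))
```

Once every tag is its own field, choosing between 15 and 55 is a plain lookup with a default, which is why no conditional regex is needed.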

0 Karma

nick405060
Motivator

Here. Fields whose names start with an underscore (such as _raw and _time) are a little tricky, so I would eval/rename them as I did here.

index=myindex
| eval no_referrer_regex="MYREGEX1"
| eval referrer_regex="MYREGEX2"
| eval regex=if(_time < 1579250700,no_referrer_regex,referrer_regex)
| eval raw=_raw
| map maxsearches=10000 search="| makeresults | eval mapped_raw=\"$$raw$$\" | rex field=mapped_raw \"$$regex$$\""
| table pst pst_epoch id action path num desc browser referrer
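
As a rough sketch of the same idea outside Splunk (pick one of two patterns by event time, then apply it), the Python below may help. The two patterns and the function name extract_fields are hypothetical stand-ins, since MYREGEX1 and MYREGEX2 are not given in this post; only the cutoff 1579250700 comes from the search above.

```python
import re

CUTOFF_EPOCH = 1579250700  # cutoff taken from the search above

# Hypothetical stand-ins for MYREGEX1 / MYREGEX2 (not given in the post).
NO_REFERRER_REGEX = r"id=(?P<id>\d+) action=(?P<action>\w+)"
REFERRER_REGEX = NO_REFERRER_REGEX + r" referrer=(?P<referrer>\S+)"

def extract_fields(event_time, raw):
    """Choose the pattern by event time, then extract its named groups."""
    pattern = NO_REFERRER_REGEX if event_time < CUTOFF_EPOCH else REFERRER_REGEX
    match = re.search(pattern, raw)
    return match.groupdict() if match else {}
```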

A second approach would just be to use ad-hoc searches in SimpleXML to set token values.

0 Karma

Runals
Motivator

I would try something like

... | rex "\|\d+=(?<currency>[A-Z]+)"

In theory this matches a pipe followed by a number, an equals sign, and then uppercase letters. Because rex keeps only the first match by default, it should pick up the 15= value when it is present and the 55= value otherwise. (The capture group needs a name; currency is just a placeholder.) I forget whether every currency has a three-letter code, but if so you could change the [A-Z]+ part to [A-Z]{3}.

Am going off the top of my head with no caffeine though so YMMV.
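
To check the first-match behaviour outside Splunk, here is a small Python sketch run against the two sample events from this thread. The group name currency and the function name first_currency are placeholders. It assumes no earlier pipe-delimited tag carries an all-uppercase value; if one did, that tag would win instead.

```python
import re

# A pipe, a numeric tag, '=', then uppercase letters.
# "|9=667" is skipped because its value starts with a digit.
CURRENCY = re.compile(r"\|\d+=(?P<currency>[A-Z]+)")

def first_currency(raw):
    """Return the first uppercase value that follows a |<number>= tag, or None."""
    m = CURRENCY.search(raw)  # search() stops at the leftmost match
    return m.group("currency") if m else None
```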

0 Karma

sk314
Builder

Try this: [your search query] | rex field=_raw "15=(?P<currency_a>\w+)" | rex field=_raw "55=(?P<currency_b>\w+)" | eval return_value = if( isnull(currency_a), currency_b, currency_a)

This assumes the 'a' field is always preceded by '15=' and the 'b' field is always preceded by '55='.
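
The two-extractions-then-choose logic can be sketched in Python as follows. The helper names extract and return_value are hypothetical, and the sketch adds one assumption the search does not make: each tag must start the string or follow a pipe or whitespace, so that "15=" is not matched inside a longer tag such as "115=".

```python
import re

def extract(raw, tag):
    """Return the value of tag=<value>, or None when the tag is absent."""
    # Require start of string, a pipe, or whitespace before the tag number.
    m = re.search(r"(?:^|[|\s])" + re.escape(tag) + r"=(\w+)", raw)
    return m.group(1) if m else None

def return_value(raw):
    currency_a = extract(raw, "15")  # may be missing
    currency_b = extract(raw, "55")  # always present in the sample data
    # Same shape as if(isnull(currency_a), currency_b, currency_a)
    return currency_b if currency_a is None else currency_a
```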

0 Karma

David
Splunk Employee

Probably better (definitely simpler) to do | eval return_value = coalesce(currency_a, currency_b)
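
For readers unfamiliar with coalesce, it returns the first of its arguments that is not null. A minimal Python equivalent, with None standing in for a missing field (an assumption of this sketch), looks like this:

```python
def coalesce(*values):
    """Return the first argument that is not None, or None if all are."""
    for value in values:
        if value is not None:
            return value
    return None
```

With this, coalesce(currency_a, currency_b) yields currency_a when it was extracted and currency_b otherwise.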

0 Karma

philallen1
Path Finder

I'm not sure if I need to put these @ signs in to get people's attention, so I'm giving it a go!
@ppablo_splunk
@sk314
@lguinn
@strive

0 Karma

philallen1
Path Finder

Hi guys

Thanks for the responses. Here are some sample logs.

2014-08-29 16:09:00,830 INFO Order filled for 8=FIX.4.4|9=667|15=GBP|55=USD

2014-08-27 16:11:19,373 INFO Order filled for 8=FIX.4.4|9=667|55=USD

So the fields I am referring to are "15=..." and "55=...". I'm interested in the currency that each of these fields gives.

The "15=..." field is the equivalent of my "a" field, i.e. it doesn't always exist. When it does exist, I want the currency that it gives; otherwise I want the currency that "55=..." gives.

I hope that makes sense?

Phil

0 Karma

strive
Influencer

Your question clearly states that a and b are two separate fields, but your example shows that a is the value of a field.

Assuming you will have a and b as two distinct fields, you need the coalesce function.

Try this:
Your base search...| eval RequiredField = coalesce(a,b)

Read more on coalesce here: http://docs.splunk.com/Documentation/Splunk/6.1.3/SearchReference/Commonevalfunctions

As @ppablo_splunk mentioned, if you can give sample log events, it will be helpful for us to answer your question.

0 Karma

lguinn2
Legend

Why does it have to be "the same bit of regex"??

sk314
Builder

Since "'b' always exists, but 'a' sometimes doesn't"
Have you tried using fillnull on field 'a'? Like so:

[your search] | fillnull value=0 a | rex field=a "regex for field a" | rex field=b "regex for field b" | eval return_value = if(a=="0", b, a)

0 Karma

sbotharaj
New Member

Worked perfectly for my situation

0 Karma

ppablo
Retired

Hi @philallen1

This is a great question. It'll be really helpful if you could paste a sample of your data so people can help you figure out an accurate regex for your use case.

0 Karma