Splunk Search

Categorize or classify dissimilar field values at search time?

mitag
Contributor

The goal is to generate a new field "Category" and assign it an arbitrary value (e.g. "Error") depending on which regex matches the event. Details:

One of our applications generates many different types of events, and I'd like to classify or categorize them (or perhaps tag them at search time?) by matching them against dissimilar regexes, one regex per category. I tinkered with this for a while and found that a single regex combining a number of OR groups ( | rex field=Message "(?<Category>regex1|regex2|regex3|...|regex26)") may be too complex. Even if I could write it, I still can't figure out how to assign an arbitrary value that differs from the capture group, e.g. Category="NDR" for a number of dissimilar non-delivery reports such as "this message could not be delivered", "delivery failure", etc.

Does SPL have an "if-elseif", "switch", or "case" construct that would let me assign arbitrary field values depending on which regex matches the event?

E.g. something like,

IF fieldvalue MATCHES regex1 THEN Category="Cat1"
ELSEIF fieldvalue MATCHES regex2 THEN Category="Cat2"
ELSEIF fieldvalue MATCHES regex3 THEN Category="Cat3"
# ...and so on...
ELSE Category="nomatch"

(Is it clear what I am trying to do?)

Below are event examples, with desired Category values.

For the two below: Category="Error: Unable to acquire license".

Service: Monitor was unable to acquire the OpenWorkflow license, this service will not be able to accept Actions which utilize OpenWorkflow technology.

Service: Analysis was unable to acquire the OpenWorkflow license, this service will not be able to accept Actions which utilize OpenWorkflow technology.

Category="Error: Unable to locate":

ServiceBase::Watchdog:Watch: Unable to locate hostname:service on the network.

Category="Error: Unable to communicate":

Domain::Service::ServiceEndpointAddress: Unable to communicate with hostname:service:  

Category="Warning: Service not attached to the database":

Service has detected it is not attached to the database for a prolonged amount of time -- re-attaching...

Category="Info: Service in closed workflow mode":

Service: Monitor is running in the closed workflow mode.

... etc...

Once I have those categories, I could generate much cleaner looking stats that are based on categories rather than individual events - of which there are too many.

Thank you!

P.S. Apologies for the long question - trying to be as descriptive yet as clear as possible.


to4kawa
Ultra Champion
| makeresults
| eval raw="Service: Monitor was unable to acquire the OpenWorkflow license, this service will not be able to accept Actions which utilize OpenWorkflow technology.
Service: Analysis was unable to acquire the OpenWorkflow license, this service will not be able to accept Actions which utilize OpenWorkflow technology.
ServiceBase::Watchdog:Watch: Unable to locate hostname:service on the network.
Domain::Service::ServiceEndpointAddress: Unable to communicate with hostname:service:
Service has detected it is not attached to the database for a prolonged amount of time -- re-attaching...
Service: Monitor is running in the closed workflow mode."
| makemv delim="
" raw
| mvexpand raw
| rename raw as _raw
| eval category = case(match(_raw,"Unable to locate"),"Error: Unable to locate"
,match(_raw,"(?=unable to acquire ).+ license"),"Error: Unable to acquire license"
,match(_raw,"Unable to communicate"),"Error: Unable to communicate"
,match(_raw,"not attached to the database"),"Warning: Service not attached to the database"
,match(_raw,"closed workflow mode"),"Info: Service in closed workflow mode"
,true(),"nomatch")

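Hi @mitag,
The case() above covers the five categories from your examples. You mentioned 26 regular expressions in total, so you will need one match() clause for each.
That leaves only 21 to write.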
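
For the NDR case from the question, where several dissimilar messages should share one label, a single match() clause with alternation may be enough. Below is a minimal sketch, not a tested solution for your data. The sample messages are assumptions: the two delivery strings are taken from the question (one deliberately capitalized), and the last one is made up for illustration. The (?i) flag is there to ignore case. case() returns the value paired with the first condition that is true, so more specific patterns should be listed before broader ones.

```
| makeresults
| eval raw="this message could not be delivered;Delivery failure;Service: Monitor is running in the closed workflow mode.;some unrelated event"
| makemv delim=";" raw
| mvexpand raw
| rename raw as _raw
| eval category = case(
    match(_raw, "(?i)could not be delivered|delivery failure"), "NDR",
    match(_raw, "closed workflow mode"), "Info: Service in closed workflow mode",
    true(), "nomatch")
| stats count by category
```

On this sample data the final stats should show NDR with a count of 2 and the other two categories with a count of 1 each. Against real events you would drop everything up to the rename and start from your own base search instead.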

mitag
Contributor

Thank you!


to4kawa
Ultra Champion

You're welcome. If you provide the remaining event types, I'll write the rest of the expressions.

mitag
Contributor

Appreciate that - will let you know if I run into problems.
