Splunk Search

Best way to turn large regex with labels into something manageable?

jasmartin
Explorer

Hello, I just started a new position where I've inherited management of large queries that need to be updated periodically. They typically involve matching regexes against a field and applying a label to each match. One involves a huge case statement:

| eval label=case(match(field,"regex1"),"label1",match(field,"regex2"),"label2",match(field,"regex3"),"label3"...)

The regexes are updated regularly, which is why I want to make this more manageable. My first thought was to use a lookup table with the regex and label, but I'm open to other suggestions.

I did find https://community.splunk.com/t5/Splunk-Search/How-do-I-match-a-regex-query-in-a-CSV and have been able to use regex in the lookup table with a search that was suggested in the solution:

| where
    [| inputlookup regexlookup.csv
    | eval matcher="match(subject,\"".regex."\")"
    | stats values(matcher) as search
    | eval search=mvjoin(search, " OR ")]
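
To see what the subsearch hands to where, the same steps can be run on their own. This is a minimal sketch that assumes two placeholder patterns, regex1 and regex2, generated with makeresults in place of regexlookup.csv (the helper field n is only there to build the sample rows):

```spl
| makeresults count=2
| streamstats count as n
| eval regex=case(n=1,"regex1",n=2,"regex2")
| eval matcher="match(subject,\"".regex."\")"
| stats values(matcher) as search
| eval search=mvjoin(search, " OR ")
```

The result is a single field named search holding match(subject,"regex1") OR match(subject,"regex2"), so the outer command effectively becomes | where match(subject,"regex1") OR match(subject,"regex2"). That expression only filters events; nothing in it carries the label.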

But I'm wondering how to also apply the label to the results with a lookup like this:

regex,label
regex1,label1
regex2,label2

Thanks in advance.

1 Solution

ITWhisperer
SplunkTrust

Assuming your lookup file isn't too big, you could try something like this

| append
    [| inputlookup regexlookup.csv
    | eval regexlabel=regex.":".label
    | stats list(regexlabel) as regexlabel]
| reverse
| filldown regexlabel
| where isnotnull(subject)
| streamstats count as _row
| mvexpand regexlabel
| eval regex=mvindex(split(regexlabel,":"),0)
| eval label=mvindex(split(regexlabel,":"),1)
| eval label=if(match(subject,regex),label,null())
| stats first(_time) as _time first(*) as * by _row
| sort 0 _row
| fields - _row regex regexlabel
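
The pipeline above can be tried without an index or a lookup file. The following is a sketch only: the subjects, patterns, and labels are made-up sample data created with makeresults, and the helper field n exists just to build the rows. Everything from reverse onward is the search as posted, with a final table added for readability.

```spl
| makeresults count=3
| streamstats count as n
| eval subject=case(n=1,"Invoice 1234 overdue",n=2,"Password reset requested",n=3,"Weekly newsletter")
| fields - n
| append
    [| makeresults count=2
    | streamstats count as n
    | eval regex=case(n=1,"(?i)invoice [0-9]+",n=2,"(?i)password")
    | eval label=case(n=1,"billing",n=2,"security")
    | eval regexlabel=regex.":".label
    | stats list(regexlabel) as regexlabel]
| reverse
| filldown regexlabel
| where isnotnull(subject)
| streamstats count as _row
| mvexpand regexlabel
| eval regex=mvindex(split(regexlabel,":"),0)
| eval label=mvindex(split(regexlabel,":"),1)
| eval label=if(match(subject,regex),label,null())
| stats first(_time) as _time first(*) as * by _row
| sort 0 _row
| fields - _row regex regexlabel
| table subject label
```

The sample data is chosen so that two subjects each match one pattern and the third matches none, which exercises both branches of the if(). Two things to watch with this approach: the regex and label are joined and split on ":", so a pattern that itself contains a colon (for example a non-capturing group such as (?:...)) would be cut short and needs a different delimiter; and mvexpand multiplies every event by the number of lookup rows, which is why the lookup has to stay small.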


yuanliu
SplunkTrust

It is not clear what the end goal is.  Do you mean to make Splunk output SPL for the purpose of "updating" existing queries?  To output "updated" queries (SPL) from the previous version?


matt8679
Path Finder

Can your regexes be grouped by sourcetype, source or host? If so, you can add your case statements as calculated fields, which will help you group your data.

Settings > Fields > Calculated fields.

If you want all the fields labeled as "label", you can only have one calculated expression per sourcetype, host, or source.

I used the _internal index and created case statements for the component field in Splunk and set the field to "label1" for all case statements.
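
As a rough illustration of that setup, the expression can be prototyped in an ordinary search before it is saved as a calculated field. This sketch assumes the splunkd sourcetype in _internal; the two patterns and the label values are hypothetical examples, not the ones from the screenshots:

```spl
index=_internal sourcetype=splunkd
| eval label=case(
    match(component,"^Metrics$"),"metrics",
    match(component,"Processor$"),"processing",
    true(),"other")
| stats count by label
```

Once the grouping looks right, only the case(...) expression itself would be entered as the eval expression of the calculated field named label for that sourcetype.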
