Hello, I just started a new position where I've inherited management of large queries that need to be updated periodically. They typically involve having regexes matching on a field and applying a label to them. One involves a huge case statement:
| eval label=case(match(field,"regex1"),"label1",match(field,"regex2"),"label2",match(field,"regex3"),"label3"...)
The regexes are updated regularly, which is why I want to make this more manageable. My first thought was to use a lookup table holding the regex and label, but I'm open to other suggestions.
I did find https://community.splunk.com/t5/Splunk-Search/How-do-I-match-a-regex-query-in-a-CSV and have been able to use regex in the lookup table with a search that was suggested in the solution:
| where [| inputlookup regexlookup.csv | eval matcher="match(subject,\"".regex."\")" | stats values(matcher) as search | eval search=mvjoin(search, " OR ")]
But I'm wondering how to also apply the label to the results with a lookup like this:
regex,label
regex1,label1
regex2,label2
Thanks in advance.
Assuming your lookup file isn't too big, you could try something like this:
| append
[| inputlookup regexlookup.csv
| eval regexlabel=regex.":".label
| stats list(regexlabel) as regexlabel]
| reverse
| filldown regexlabel
| where isnotnull(subject)
| streamstats count as _row
| mvexpand regexlabel
| eval regex=mvindex(split(regexlabel,":"),0)
| eval label=mvindex(split(regexlabel,":"),1)
| eval label=if(match(subject,regex),label,null())
| stats first(_time) as _time first(*) as * by _row
| sort 0 _row
| fields - _row regex regexlabel
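Another option, closer to the original case statement, is to let a subsearch build the case() arguments from the lookup. This is only a sketch under several assumptions: the lookup columns are named regex and label as in the question, the field being matched is subject, and neither the regexes nor the labels contain double quotes. Note that stats list() keeps the lookup's row order (so earlier rows win, as in a hand-written case) but returns at most 100 values by default.

```
| eval label=case(
    [| inputlookup regexlookup.csv
     | eval clause="match(subject,\"".regex."\"),\"".label."\""
     | stats list(clause) as clause
     | eval search=mvjoin(clause, ",")
     | return $search])
```

The subsearch runs first and its output is substituted as text, so the outer search ends up with an ordinary case(match(subject,"regex1"),"label1",...) expression that is regenerated from the lookup every time the search runs.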
It is not clear what the end goal is. Do you want Splunk to output SPL for the purpose of "updating" existing queries, that is, to generate the "updated" query (SPL) from the previous version?
Can your regexes be grouped by sourcetype, source or host? If so, you can add your case statements as calculated fields, which will help you group your data.
Settings > Fields > Calculated fields.
If you want every calculated field to be named "label", you can have only one calculated expression per sourcetype, host, or source.
I used the _internal index and created case statements for the component field in Splunk and set the field to "label1" for all case statements.
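For reference, a calculated field created through the UI is stored in props.conf as an EVAL-<fieldname> setting. A minimal sketch of what that might look like, assuming the splunkd sourcetype from the _internal index and a calculated field named label; the regexes and label values here are placeholders, not the ones I actually used:

```
# props.conf (illustrative stanza; regexes and labels are placeholders)
[splunkd]
EVAL-label = case(match(component,"regex1"),"label1",match(component,"regex2"),"label2")
```

Each additional sourcetype, host, or source would get its own stanza with its own EVAL-label line.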