Splunk Search

Regex question: cut off several tags and any combination of these

dkoops
Path Finder

Hi there,

I got fields such as:
- DATABASE-DTA-PRD
- APACHE-SCM-PRD-TST
- SERVERS-PRD

Which need to be returned as:
- DATABASE
- APACHE-SCM
- SERVERS

So it should cut off -DTA, -PRD, -TST, and any combination of those, in any order. However, it should leave -SCM.

I tried
(?P.*)(?:-DTA-PRD|-PRD-TST|-TST|-DTA|-PRD)

But that doesn't work.


tpflicke
Path Finder

With regard to the regular expression, the first group should be non-greedy to prevent it from gobbling up too much.

Here's my test query producing the desired result for the given examples:

| gentimes start=-1 
| eval temp="DATABASE-DTA-PRD#APACHE-SCM-PRD-TST#SERVERS-PRD" 
| table temp 
| makemv temp delim="#" 
| mvexpand temp 
| rename temp as _raw 
| rex field=_raw "(?<result>.*?)(?:-DTA-PRD|-PRD-TST|-TST|-DTA|-PRD)" 
| table _raw result
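
To see why the non-greedy group matters, the two patterns can be compared outside Splunk. This is only a minimal sketch in Python, assuming its re module behaves like Splunk's regex engine for these simple patterns; Python spells the named group (?P<result>...) where rex accepts (?<result>...), and the helper name extract is made up for the illustration.

```python
import re

# Alternation copied from the rex command above.
SUFFIXES = r"(?:-DTA-PRD|-PRD-TST|-TST|-DTA|-PRD)"

# Greedy group (as in the question) versus non-greedy group (as in the answer).
greedy = re.compile(r"(?P<result>.*)" + SUFFIXES)
lazy = re.compile(r"(?P<result>.*?)" + SUFFIXES)


def extract(pattern, value):
    """Return the 'result' group, or None when the pattern does not match."""
    match = pattern.search(value)
    return match.group("result") if match else None


for value in ["DATABASE-DTA-PRD", "APACHE-SCM-PRD-TST", "SERVERS-PRD"]:
    # Greedy keeps everything up to the LAST removable tag,
    # non-greedy stops at the FIRST one.
    print(value, "| greedy:", extract(greedy, value), "| lazy:", extract(lazy, value))
```

With the greedy group, DATABASE-DTA-PRD only loses its last tag and comes back as DATABASE-DTA; with the non-greedy group it comes back as DATABASE.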

Since the regular expression is not anchored to the end with $, you shouldn't need the -PRD-TST and -DTA-PRD alternatives: the non-greedy group already stops at the first tag to be removed. Provided that a tag you want to keep, such as -SCM, always comes before the tags to be removed, and that there is always at least one tag to remove, you might simplify the query a bit, e.g.

| rex field=_raw "(?<result>.*?)(?:-DTA|-PRD|-TST)" 
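
The two conditions on the simplified pattern can be checked with a small sketch. It is written in Python under the assumption that its re module matches Splunk's behaviour for this pattern; the helper name extract and the last two sample values are only illustrations, not data from the question.

```python
import re

# Simplified pattern from the rex command above, in Python named-group syntax.
simple = re.compile(r"(?P<result>.*?)(?:-DTA|-PRD|-TST)")


def extract(value):
    """Return the 'result' group, or None when nothing matches."""
    match = simple.search(value)
    return match.group("result") if match else None


samples = [
    "DATABASE-DTA-PRD",    # from the question
    "APACHE-SCM-PRD-TST",  # from the question
    "SERVERS-PRD",         # from the question
    "APACHE-PRD-TST-SCM",  # -SCM after the removable tags: it is cut off too
    "APACHE-SCM",          # no removable tag at all: the pattern does not match
]
for value in samples:
    print(value, "->", extract(value))
```

The last two samples show the limits: when -SCM follows a removable tag it is lost, and when no removable tag is present there is no match, so no result would be extracted.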

Alternative 1:
Using rex mode=sed lets you discard specific tags wherever they appear, so it also works if the tag order differs.

| gentimes start=-1
| eval temp="DATABASE-DTA-PRD#APACHE-SCM-PRD-TST#SERVERS-PRD#APACHE-PRD-TST-SCM"
| table temp
| makemv temp delim="#"
| mvexpand temp
| rex field=temp mode=sed "s/-(PRD|TST|DTA)//g"
| table temp
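
The global substitution can be tried outside Splunk as well. The following is a rough Python equivalent using re.sub, assuming the same matching behaviour as the sed expression above; the function names and the tag -PRDX are hypothetical. The second, stricter pattern is not from the original answer and has only been reasoned about in Python: it adds a lookahead so that a removable tag must be followed by another hyphen or the end of the value.

```python
import re

# Same expression as in the sed command: remove every -PRD, -TST or -DTA.
REMOVABLE = re.compile(r"-(PRD|TST|DTA)")

# Hypothetical stricter variant: the tag must end at a hyphen or at the end
# of the value, so a longer tag such as -PRDX is left alone.
REMOVABLE_STRICT = re.compile(r"-(?:PRD|TST|DTA)(?=-|$)")


def strip_tags(value):
    """Delete all removable tags, regardless of their position."""
    return REMOVABLE.sub("", value)


def strip_tags_strict(value):
    """Delete removable tags only when they are complete tags."""
    return REMOVABLE_STRICT.sub("", value)


for value in ["DATABASE-DTA-PRD", "APACHE-SCM-PRD-TST",
              "SERVERS-PRD", "APACHE-PRD-TST-SCM"]:
    print(value, "->", strip_tags(value))

# A made-up value with a longer tag shows the difference between the two.
print(strip_tags("WEB-PRDX-PRD"), strip_tags_strict("WEB-PRDX-PRD"))
```

Unlike the extraction patterns, the substitution also handles APACHE-PRD-TST-SCM, which becomes APACHE-SCM.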

Alternative 2:
Now, if things turn really complicated, then you could resort to using a lookup table:

full_fieldname,short_fieldname
DATABASE-DTA-PRD,DATABASE
APACHE-SCM-PRD-TST,APACHE-SCM
APACHE-PRD-SCM-TST,APACHE-SCM
SERVERS-PRD,SERVERS
...

and use it like this:

| eval full_fieldname=... 
| lookup name_translation full_fieldname OUTPUT short_fieldname
| eval fieldname=if(isNull(short_fieldname), full_fieldname, short_fieldname)
| stats count by fieldname
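
The logic of the lookup with its fallback can be pictured as a plain dictionary lookup. This is only an illustrative Python sketch, not how Splunk implements lookups: the dictionary holds the rows listed above, while the event list and the name CACHE-PRD are made up for the example.

```python
from collections import Counter

# Rows from the lookup table above: full_fieldname -> short_fieldname.
NAME_TRANSLATION = {
    "DATABASE-DTA-PRD": "DATABASE",
    "APACHE-SCM-PRD-TST": "APACHE-SCM",
    "APACHE-PRD-SCM-TST": "APACHE-SCM",
    "SERVERS-PRD": "SERVERS",
}


def short_name(full_fieldname):
    """Use the translated name if one exists, else keep the full name.

    This mirrors the eval with if(isNull(...)) after the lookup.
    """
    return NAME_TRANSLATION.get(full_fieldname, full_fieldname)


# Hypothetical events; CACHE-PRD has no row in the table and is kept as-is.
events = [
    "DATABASE-DTA-PRD",
    "APACHE-SCM-PRD-TST",
    "APACHE-PRD-SCM-TST",
    "SERVERS-PRD",
    "CACHE-PRD",
]

# Equivalent of "stats count by fieldname".
counts = Counter(short_name(name) for name in events)
print(counts)
```

Names without a row in the table pass through unchanged, so the table only needs to list the values that have to be renamed.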

All of these would work if you do the manipulation in the query; if you want to do this in props.conf, this might not be the case.

dkoops
Path Finder

You sir, are a lifesaver! Alternative 1 did the trick. Also while not needed, great idea using a lookup table, don't know why I didn't think of that myself.
