Hi Splunkers, I am trying to extract a string from within a longer string in which it is repeated, with some prefixes and suffixes added. Only the very start and end of the full string are static values ('AZ-' and '-VMSS').
Example data:
AZ-203-dev-app-1-build-agents-203-dev-app-1-build-agents0006GA-1720624093-VMSS
AZ-eun-dev-005-pqu-ado-vmss-eun-dev-005-pqu-ado-vmss005X89-1720625975-VMSS
AZ-DEV-CROSS-SUBSCRIPTION-PROXY-EUN-BLUE-DEV-CROSS-SUBSCRIPTION-PROXY-EUN-BLUE000000-1720637733-VMSS
I have a working rex command to extract the relevant data (temp_hostname4):
| rex field=source_hostname "(?i)^AZ(?<cap1>(-[A-Z0-9]+)+)(?=\1[A-Z0-9]{6})-(?<temp_hostname4>([A-Z0-9]+-?)+)-\d{10}-VMSS$"
Which correctly extracts:
203-dev-app-1-build-agents0006GA
eun-dev-005-pqu-ado-vmss005X89
DEV-CROSS-SUBSCRIPTION-PROXY-EUN-BLUE000000
But let's face it, this is horrible! According to regex101 this takes 46K+ steps, which can't be nice for Splunk to apply to c.20K records several times per day.
Can anyone suggest optimisations to bring that number down?
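In case it helps anyone trying alternatives outside Splunk, here is a rough Python harness that checks a candidate pattern against the three examples above. It is only a sketch: Python's `re` module is not the PCRE engine that rex uses, so the named groups are rewritten as `(?P<name>...)`, and step counts will not match regex101. The names `PATTERN`, `SAMPLES` and `extract` are mine, not anything from Splunk.

```python
import re

# Python port of the rex pattern above. The only change is the named-group
# syntax: (?<name>...) becomes (?P<name>...). \1 still refers to cap1.
PATTERN = re.compile(
    r"(?i)^AZ(?P<cap1>(-[A-Z0-9]+)+)(?=\1[A-Z0-9]{6})"
    r"-(?P<temp_hostname4>([A-Z0-9]+-?)+)-\d{10}-VMSS$"
)

# Example data from this post, mapped to the extraction I expect.
SAMPLES = {
    "AZ-203-dev-app-1-build-agents-203-dev-app-1-build-agents0006GA-1720624093-VMSS":
        "203-dev-app-1-build-agents0006GA",
    "AZ-eun-dev-005-pqu-ado-vmss-eun-dev-005-pqu-ado-vmss005X89-1720625975-VMSS":
        "eun-dev-005-pqu-ado-vmss005X89",
    "AZ-DEV-CROSS-SUBSCRIPTION-PROXY-EUN-BLUE-DEV-CROSS-SUBSCRIPTION-PROXY-EUN-BLUE000000-1720637733-VMSS":
        "DEV-CROSS-SUBSCRIPTION-PROXY-EUN-BLUE000000",
}


def extract(source_hostname, pattern=PATTERN):
    """Return the temp_hostname4 capture, or None if the pattern does not match."""
    match = pattern.match(source_hostname)
    return match.group("temp_hostname4") if match else None


if __name__ == "__main__":
    for raw, expected in SAMPLES.items():
        result = extract(raw)
        status = "OK" if result == expected else "MISMATCH"
        print(f"{status}: {result}")
```

Any suggested replacement can be compiled and passed as the `pattern` argument to confirm that it still returns the same three values before I try it in Splunk.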
For added complication (and for clarity to anyone reading this): it's temp_hostname4 because there are multiple other ways the hostname might have been... manipulated before it gets to Splunk, sometimes with the string repeated, sometimes not. That results in the following SPL. I could use coalesce rather than case, but that's hardly important right now, and separating the regex statements seemed like the saner thing to do in this instance 😉
| rex field=source_hostname "(?i)^AZ(?<cap1>(-[A-Z0-9]+)+)(?=\1[A-Z0-9]{6})-(?<temp_hostname4>([A-Z0-9]+-?)+)-\d{10}-VMSS$"
| rex field=source_hostname "(?i)^AZ-(?<temp_hostname3>[^.]+)-\d{10}-VMSS$"
| rex field=source_hostname "(?i)^AZ-(?<temp_hostname2>[^.]+)-\d{10}$"
| rex field=source_hostname "(?i)^(?<temp_hostname1>[^.]+)_\d{10}$"
| eval alias_source_of=case(
!isnull(temp_hostname4), temp_hostname4,
!isnull(temp_hostname3), temp_hostname3,
!isnull(temp_hostname2), temp_hostname2,
!isnull(temp_hostname1), temp_hostname1,
1=1, null()
)
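To make the intended precedence explicit, the same fallback logic can be sketched in Python: try the patterns from most to least specific and keep the first capture that matches, which is what the case() (or a coalesce()) ends up doing. This is an illustration only. The group name `host`, the function name and the short hostnames in the usage note are made up for the example, and the patterns are ported to Python's `(?P<name>...)` syntax.

```python
import re

# Ordered from most to least specific, mirroring temp_hostname4 .. temp_hostname1.
# The capture is renamed to "host" (an illustrative name) so one loop can read it.
PATTERNS = [
    re.compile(
        r"(?i)^AZ(?P<cap1>(-[A-Z0-9]+)+)(?=\1[A-Z0-9]{6})"
        r"-(?P<host>([A-Z0-9]+-?)+)-\d{10}-VMSS$"
    ),
    re.compile(r"(?i)^AZ-(?P<host>[^.]+)-\d{10}-VMSS$"),
    re.compile(r"(?i)^AZ-(?P<host>[^.]+)-\d{10}$"),
    re.compile(r"(?i)^(?P<host>[^.]+)_\d{10}$"),
]


def alias_source_of(source_hostname):
    """Return the capture from the first pattern that matches, else None."""
    for pattern in PATTERNS:
        match = pattern.match(source_hostname)
        if match:
            return match.group("host")
    return None
```

For instance, with invented hostnames, `alias_source_of("AZ-myhost-1720624093-VMSS")` falls through to the second pattern and `alias_source_of("myhost_1720624093")` to the fourth, while the repeated-string examples above are caught by the first.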
Any suggestions for optimising the regex would be greatly appreciated.