Hello Splunkers,
I want to extract a 10-character path from a URL, but unfortunately I always get this error:
Error in 'rex' command: The regex '.*\/(([0-9a-z]{10}))' does not extract anything. It should specify at least one named group. Format: (?<name>...).
I want to extract the path from this URL: https://example.com/8a2a6063b3
This is the search I used:
index=FP_proxy | rex field=url "http[s]?:\/\/[a-zA-Z0-9-]{1,}\..*\/(([0-9a-z]{10})?<url__path>)"
Any help fixing this issue would be much appreciated.
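Hi @msalghamdi, the capturing group in your regex has the wrong format. The name has to come right after the opening parenthesis, in the form (?<name>...), not at the end of the group. Try moving the label to the beginning, like this: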
index=FP_proxy | rex field=url "http[s]?:\/\/[a-zA-Z0-9-]{1,}\..*\/(?<url__path>[0-9a-z]{10})"
This extracts 8a2a6063b3 into the url__path field.
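If you want to check the pattern before running it against the index, one option is to test it on a synthetic event. This is only a minimal sketch: makeresults creates a single dummy event, and the eval line sets url to the sample URL from the question (nothing here reads from FP_proxy).

```spl
| makeresults
| eval url="https://example.com/8a2a6063b3"
| rex field=url "http[s]?:\/\/[a-zA-Z0-9-]{1,}\..*\/(?<url__path>[0-9a-z]{10})"
| table url url__path
```

If the regex is right, the table should show the sample URL with url__path populated.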
Hope that helps.
Thanks,
J
Thanks, javiergn.
One more question, please: I want to keep only the events where the extracted field exists. Here's my search:
index=proxy | rex field=url "http[s]?:\/\/[\w]{1,}\.[\w]{1,}\/(?<ppp>[0-9a-z]{10})$"
| where ppp=*
| table _time src dest_ip dest user ppp url status
But I get this error:
Error in 'where' command: The expression is malformed. An unexpected character is reached at '* '.
What can I do to fix this?
Thanks
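Hi, use search instead of where: | search ppp=* keeps only the events where ppp has a value. The where command evaluates eval-style expressions, so it does not accept * as a wildcard.
Or you could also do | where isnotnull(ppp)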
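To see the filter in action without touching the proxy index, here is a rough sketch using two made-up events. The URLs and the helper field n are assumptions for illustration only; the second URL is chosen so that the regex does not match and ppp stays empty.

```spl
| makeresults count=2
| streamstats count AS n
| eval url=if(n=1, "https://example.com/8a2a6063b3", "https://example.com/about")
| rex field=url "http[s]?:\/\/[\w]{1,}\.[\w]{1,}\/(?<ppp>[0-9a-z]{10})$"
| where isnotnull(ppp)
| table url ppp
```

Only the first event should survive the filter. Swapping the where line for | search ppp=* should give the same result here.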