This is day 2 of working with Splunk. I want to extract a portion of an XML printout in the logs. My regex works fine, but Splunk does not let me continue. Note that not all of my events will have a match for my regex; in that case I want the field to just be blank.
Am I doing something wrong here?
\w|\W+<externalBANID>[0-9]+
You don't have a capturing group in your regex string. Splunk won't extract a field without one.
As suggested above, a capture group is needed. The field name is also needed within the capture group.
That alone did not work for me. I actually needed to add ?P after the first parenthesis in the capture group. For example:
(?P<fieldname>YourRegex)
The best way I've found to learn (or teach) this topic is to use the GUI feature at first rather than try to write your own regex from scratch. If you have a complex pattern that you think it won't pick up on, the "I prefer to write my own" option is certainly more robust, but you can grab the syntax and save yourself a lot of digging by following the regular GUI flow and clicking the "Show Regular Expression" link.
Hi Snalonzo,
Thanks for the suggestion - I tried that after reading your post and it can't seem to figure out the field correctly. I think this will work for most other fields.
Part of my problem is that I'm trying to parse XML fields out of a log file that has a bunch of other Java/WebLogic text-based noise in it.
This did the trick:
\<ns3:externalBANId\>(?<BAN_ID>[0-9]+)\<\/ns3:externalBANId\>
It pulled out the digits between the two tags and assigned them the field name BAN_ID.
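For anyone who wants to try the pattern outside Splunk first, here is a minimal Python sketch of the same idea. The sample events and the `extract_ban_id` helper are made up for illustration, and Python's `re` module needs the `(?P<name>...)` spelling for a named group. It also shows the "blank when there is no match" behavior asked about in the original question.

```python
import re

# Named group BAN_ID captures only the digits between the two tags.
# Python's re module requires the (?P<name>...) form for named groups.
BAN_ID_RE = re.compile(
    r"<ns3:externalBANId>(?P<BAN_ID>[0-9]+)</ns3:externalBANId>"
)


def extract_ban_id(event: str) -> str:
    """Return the digits between the tags, or "" if the event has no match."""
    match = BAN_ID_RE.search(event)
    return match.group("BAN_ID") if match else ""


# Hypothetical events: XML embedded in other log noise.
events = [
    "<Info> <WebLogicServer> <ns3:externalBANId>123456789</ns3:externalBANId> more text",
    "<Warning> <WebLogicServer> no XML payload in this event",
]

for event in events:
    print(repr(extract_ban_id(event)))
```

The second event has no tag pair, so the helper returns an empty string instead of failing.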
Thank you!
@richgalloway is correct, you need to wrap your regex in a capturing group, ()
Thanks for the reply. Now I am here.
<ns3:externalBANId>([0-9]+)<\/ns3:externalBANId>
However, it is still not letting me save, so something is still wrong with my regex. I defined the capturing group as the set of digits between those two strings. Still, it doesn't seem to like it.
(?<whatever_name_you_want>[0-9]+)
With your version, ([0-9]+), you are matching a digit from 0-9, one or more times, but you are not naming the capture anything, so it's not letting you save it (because Splunk would have nothing to do with that match).
Anything outside the parentheses is outside the capture. The first thing inside the opening parenthesis should be
?<fieldname>
then the pattern you want to extract, then the closing parenthesis, then anything after that pattern that further restricts the match.
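To make the "inside vs. outside the parentheses" point concrete, here is a small Python sketch. The event string is invented, and Python spells the named group as `(?P<name>...)`. It compares an unnamed group with a named one and shows that the surrounding tags take part in the match but not in the captured value.

```python
import re

# Hypothetical event text for illustration only.
event = "noise <ns3:externalBANId>42</ns3:externalBANId> noise"

# Same pattern twice: once with an unnamed group, once with a named group.
unnamed = re.search(r"<ns3:externalBANId>([0-9]+)</ns3:externalBANId>", event)
named = re.search(r"<ns3:externalBANId>(?P<BAN_ID>[0-9]+)</ns3:externalBANId>", event)

whole_match = named.group(0)          # everything the pattern consumed, tags included
unnamed_fields = unnamed.groupdict()  # an unnamed group yields no field names
named_fields = named.groupdict()      # the named group yields a name/value pair

print(whole_match)
print(unnamed_fields)
print(named_fields)
```

Both patterns match the same text; only the named version gives the captured digits a field name to attach to.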
Regex can be tricky at first, and certainly Splunk has its own regex quirks, but it gets easier - we promise 🙂
<ns3:externalBANId>(?<BAN_ID>[0-9]+)<\/ns3:externalBANId>
I got it! Thanks for the help. Defining the field name was the part that I was missing.
Thanks! I think I need to do a little more regex homework to get this to work the way I want. I really appreciate the quick responses!