I am having trouble searching for multiple patterns using rex. I have log files containing lines with the following pattern:
I want to get a table report of Processtype and docCount, which in this example is 12345.
I can search for PROCESSTYPE
sourcetype="SOMETHING SOMETHING" | rex field=_raw ".*, (?<1ST>[A-Z][a-z]+) :.*"
I can also search for docCount
sourcetype="SOMETHING SOMETHING" | rex field=_raw ".* \<(?<2ND>[0-9]+)\>"
But when I combine the two like this, it doesn't return any results:
sourcetype="SOMETHING SOMETHING" | rex field=_raw ".*, (?<1ST>[A-Z][a-z]+) : \<(?<2ND>[0-9]+)\>"
What am I doing wrong?
I have found that colons in a regex can occasionally cause problems, probably due to EREG syntax compatibility. While I agree that the regex builders on various web pages can be useful, they are rarely fully functional for complex regex building.
To be on the safe side, you may want to try escaping all non-alphanumeric characters in your regex. Whitespace can also be irregular, so my second recommendation is to match it with \s+ rather than a literal space. Taking your example as a base, you may want to try the following regex:
sourcetype="SOMETHING SOMETHING" | rex field=_raw ".*,\s+(?<1ST>[A-Z][a-z]+)\s+\:\s+\<(?<2ND>[0-9]+)\>"
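If you want to check the whitespace idea outside Splunk, here is a rough Python sketch. Treat it as a stand-in only: Python's `re` module writes named groups as `(?P<name>...)` instead of the `(?<name>...)` form used in rex, and the sample line and the group names `process_type` and `doc_count` are made up for illustration, since the real log lines are not shown in this thread.

```python
import re

# Hypothetical log line with irregular spacing around the colon (an assumption;
# the real log format is not shown in this thread).
sample = "2013-01-01 10:00:00,  Indexing   :  <12345>"

# Single literal spaces, as in the original combined pattern.
strict = re.compile(
    r".*, (?P<process_type>[A-Z][a-z]+) : <(?P<doc_count>[0-9]+)>"
)

# Whitespace-tolerant variant: \s+ accepts one or more whitespace characters.
tolerant = re.compile(
    r".*,\s+(?P<process_type>[A-Z][a-z]+)\s+:\s+<(?P<doc_count>[0-9]+)>"
)

strict_match = strict.search(sample)
tolerant_match = tolerant.search(sample)

print(strict_match)
if tolerant_match:
    print(tolerant_match.group("process_type"), tolerant_match.group("doc_count"))
```

If the strict pattern fails on a real event while the tolerant one matches, extra spaces or tabs in the raw data are the likely culprit.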
Glad to hear that you found what helps. What is the regex that ended up resolving this for you?
I believe it was because of the greediness of the regex. Replacing the leading .* with a negated character class such as [^ ], which excludes a whole bunch of characters, seems to have helped.
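The exact pattern that fixed it is not given here, so the following Python sketch only illustrates the general effect. The sample line is hypothetical, `[^,]` is just one possible negated class, and Python's `(?P<name>...)` group syntax stands in for the `(?<name>...)` form used in rex.

```python
import re

# Hypothetical line containing two "Name : <number>" pairs (an assumption).
sample = "job=7, Indexing : <12345>, Merging : <678>"

# Greedy ".*" consumes as much as it can and backtracks only as far as needed,
# so the LAST pair on the line is the one that gets captured.
greedy = re.compile(
    r".*, (?P<process_type>[A-Z][a-z]+) : <(?P<doc_count>[0-9]+)>"
)

# A negated class cannot run past the first comma, so the FIRST pair is captured.
bounded = re.compile(
    r"^[^,]*, (?P<process_type>[A-Z][a-z]+) : <(?P<doc_count>[0-9]+)>"
)

greedy_match = greedy.search(sample)
bounded_match = bounded.search(sample)

print(greedy_match.group("process_type"), greedy_match.group("doc_count"))
print(bounded_match.group("process_type"), bounded_match.group("doc_count"))
```

Both patterns match this line, but they capture different pairs, which is why swapping a leading `.*` for a negated class can change what a combined extraction returns.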
Hey, as far as I can tell there is nothing technically incorrect with the regex pattern itself. Having dealt with a similar situation in the past, I have to ask whether you are using capture group names that actually begin with a digit. I recall reading somewhere, a long time ago, about acceptable naming for capture groups, and it led me to rename my capture. Sorry I cannot link it; it was a long time ago and a fluke.
More explicitly, are you literally using 1ST and 2ND? If so, you will have to use a name that begins with a permissible character, meaning the first character of your capture group name should be a letter (a-z or A-Z).
By the way, using your regex, quoted above, fails with RegEx Builder, Regex Buddy and RegExr. If I alter the regex just slightly to
.*, (?<first>[A-Z][a-z]+) : <(?<second>[0-9]+)>
then it works O.K. The same is true in Splunk.
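As a quick way to see the naming rule in action, here is a small Python sketch. It is an analogy rather than a test of Splunk itself: Python's `re` module has its own rule that group names must be valid identifiers, and it uses `(?P<name>...)` where rex uses `(?<name>...)`. The helper `compiles` is made up for this example.

```python
import re


def compiles(pattern: str) -> bool:
    """Return True if the pattern compiles, False if the regex engine rejects it."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# Group names that start with a digit, as in the pattern quoted above.
digit_names = r".*, (?P<1ST>[A-Z][a-z]+) : <(?P<2ND>[0-9]+)>"

# The same pattern with names that start with a letter.
letter_names = r".*, (?P<first>[A-Z][a-z]+) : <(?P<second>[0-9]+)>"

digit_ok = compiles(digit_names)
letter_ok = compiles(letter_names)

print(digit_ok, letter_ok)
```

The only difference between the two patterns is the group names, so a rejection of the first one points at the names and not at the rest of the regex.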
I hope this helps.
Thank you for confirming. My thinking is that if my regex were wrong, I wouldn't get any results back when I search for each pattern separately. I get the correct values when I search for them one at a time. I only have a problem when I combine them.
No, there's no restriction in the config. Consider checking out one of the regex sites listed above to validate your regex vs. the data. My personal favorite is RegExr.
Hi,
My rex syntax got altered when I posted the message. I don't know how to post it as code. My variables don't start with numbers, and my syntax is exactly like what you posted.
Is there something in the configuration that prevents me from creating more than one variable in a rex search?