Splunk Search

Field extractions using regex not working

Path Finder

Hi,

I am looking to extract a field from the raw event using the below regex:

.*<name>(?<parameter_name>[^\<]+)

It should extract a string between 2 XML tags.
The extraction works fine with the rex command, but when I add it to the Field extractions, the field is not extracted.
The configuration is defined in the Search and reporting app with Global read permission:

etc/apps/search/local/props.conf

[sourcetype]
EXTRACT-parameter_name = .*<name>(?P<parameter_name>[^<]+)
EXTRACT-parameter_value = .*<value>(?P<parameter_value>[^<]+)

Note: other extractions are present in the same file and are working well

Any ideas what could be the catch here?

Thanks

1 Solution

Path Finder

Earlier the field was not being populated to "Interesting fields", but after narrowing down the search and piping to a table I am able to see it correctly.

For the record I am still using the same initial configuration as quoted in the question, regex in props.conf on the Search Head.
I am still not sure why the field cannot be seen when I search only for the sourcetype, even though it exists in around 20% of the events.

Thanks everyone for your help.

0 Karma

Esteemed Legend

The other main thing to check is that your sourcetype is correct. When you say "others work", are they in the same stanza (under the same [sourcetype] header)? If not, this could be your problem, especially if you have overridden your sourcetype. Using btool can be very handy here.

0 Karma

Path Finder

By others I mean other extractions defined in the same file but for different sourcetypes.
Regarding your earlier comment for validating the regex; it works well with the rex command.
This is a distributed environment. Do you think it will make a difference if I define the extraction on the HF?

0 Karma

Legend

Field extractions are search-time - this belongs on the search head, or wherever it is that users log-in to search.

0 Karma

Esteemed Legend

Your syntax is fine so I have to assume your RegEx does not match your data. Typically this happens because of some whitespace that you did not notice is there. You should validate your RegEx with a tool like Expresso because I am sure that's where the problem is.
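
One way to check for unnoticed whitespace outside Splunk is a quick script. The sketch below is only an illustration: the sample events are made up (the real data is not shown in this thread), and it uses Python's re module, which accepts the same (?P<name>...) syntax but is not the PCRE engine that Splunk uses.

```python
import re

# Pattern copied from the EXTRACT-parameter_name line in the question.
PATTERN = re.compile(r".*<name>(?P<parameter_name>[^<]+)")

# Hypothetical events, invented for illustration only.
EVENTS = [
    "<param><name>timeout</name><value>30</value></param>",   # clean tag
    "<param><name >timeout</name><value>30</value></param>",  # stray space inside the tag
    "<param>< name>timeout</name><value>30</value></param>",  # stray space after '<'
]


def extract(event):
    """Return the captured parameter_name, or None when the pattern does not match."""
    match = PATTERN.search(event)
    return match.group("parameter_name") if match else None


if __name__ == "__main__":
    for event in EVENTS:
        print(repr(event), "->", extract(event))
```

If the clean event matches but a copy of a real event does not, comparing the two character by character usually shows where they differ.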

I am assuming that you would ideally like one value to be the name of your field and the other value to be the value of that field, but that you didn't think this was possible. It is. Read all about it here:

http://answers.splunk.com/answers/7320/given-two-fields-how-can-i-create-a-third-field-whose-name-is...

Path Finder

Indeed I didn't know that was possible, but unfortunately this doesn't apply to my case.
Thanks for the interesting share.

0 Karma

Legend

The left angle bracket (<) is a special character in regular expressions. You should escape it like this:

[sourcetype]
 EXTRACT-parameter_name = .*\<name>(?P<parameter_name>[^<]+)
 EXTRACT-parameter_value = .*\<value>(?P<parameter_value>[^<]+)

I think that should fix it. BTW, the regular expressions in extracts (and most other places in Splunk) are not anchored, so you can do this:

[sourcetype]
 EXTRACT-parameter_name = \<name>(?P<parameter_name>[^<]+)
 EXTRACT-parameter_value = \<value>(?P<parameter_value>[^<]+)

Legend

FYI, voting me down doesn't motivate me to research this further.

0 Karma

Esteemed Legend

He did not vote down your answer, I did. Voting (up and down) is a responsibility that we all have, even though it is expensive (lowers Karma). I downvoted (with a very gentle corrective comment), and I also encourage downvoting, for all of these reasons:
1: It moves the incorrect answer to the bottom so people can focus on (potentially) correct answers and will not waste time on incorrect answers.
2: It discourages people from posting answers that they should not (haven't tested, aren't sure, etc.).
3: It hopefully educates the user who had the wrong answer (it was good for you).

I have been downvoted many times and it was good for me almost every time. I don't like being wrong, but I dislike being ignorant even more.

0 Karma

Splunk Employee

@Woodcock it's kind of unnecessary when the answer hasn't been upvoted or accepted. In that case, the author of the answer could still edit and fix it, have a comment dialogue with the asker, etc.

In this case, there haven't been any votes or acceptance, so it's a bit preemptive.

If there were 5 answers, and one of them is definitely better than the others, AND the incorrect one had been accepted, all your points start to make a bit more sense.

0 Karma

Legend

So my answer didn't solve the problem, but did it contain any incorrect information?

0 Karma

Esteemed Legend

That is why I downvoted the answer instead of deleting it (which I could have done). The author may still delete it, but I hope not, because I referenced this thread as background in a new question:

http://answers.splunk.com/answers/244111/proper-etiquette-and-timing-for-voting-here-on-ans.html

Splunk Employee

You could've deleted it?

0 Karma

Esteemed Legend

This is a privilege earned at the 2000-point karma level:

http://docs.splunk.com/Documentation/Splunkbase/splunkbase/Answers/HowtoearnKarma#Karma_rewards

And before anyone says "No you can't", I would first have to convert the Answer to a Comment and then Delete the Comment.

0 Karma

Champion

If I'm not mistaken, the angular bracket in the capturing group should be escaped as well.

0 Karma

Path Finder

Neither of these combinations is working.
Thanks

0 Karma

Esteemed Legend

Although the angle brackets ("<>") do have a special meaning in naming capture groups, they do not need to be escaped; it is the question mark ("?") that marks the token that needs to be escaped. So this cannot be the problem.

Legend

Agree: the angle brackets in the capture groups should not be escaped.
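
For anyone who wants to check this quickly, here is a rough sketch. It assumes a made-up sample event and uses Python's re module rather than Splunk's PCRE engine, so treat it as an approximation. Under those assumptions, the escaped and unescaped forms of the literal tag match the same text, while escaping the brackets inside the named group is rejected as invalid syntax.

```python
import re

# Hypothetical event, invented for illustration only.
EVENT = "<param><name>timeout</name><value>30</value></param>"

PLAIN = r"<name>(?P<parameter_name>[^<]+)"      # literal '<' left unescaped
ESCAPED = r"\<name>(?P<parameter_name>[^<]+)"   # literal '<' escaped
BROKEN = r"<name>(?P\<parameter_name\>[^<]+)"   # brackets escaped inside the group name


def extract(pattern, text):
    """Return the parameter_name capture, or None if there is no match."""
    match = re.search(pattern, text)
    return match.group("parameter_name") if match else None


def compiles(pattern):
    """Return True if the pattern is valid syntax for Python's re module."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


if __name__ == "__main__":
    print("plain:  ", extract(PLAIN, EVENT))
    print("escaped:", extract(ESCAPED, EVENT))
    print("broken compiles:", compiles(BROKEN))
```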

0 Karma

Champion

Oh, that's new to me. Thanks for sharing.

0 Karma

Champion

Just to clarify, are there XML tags literally named name and value in your data, and you are searching the same set of events with rex and with the field extraction?

0 Karma