We are trying to carry out a field extraction on a log that contains XML output.
We have worked out the regex to get the data we need:
<id[^>]*>([LAR]+)(.*?)<\/id>
But Splunk will not let us complete the extraction by adding the naming section.
We tried the regex below, but then the extraction stops working.
<id[^>]*>([LAR]+)(.*?)<\/id>(?P<tradeId>)
The question is: what is the regex syntax for adding the field name to the full regex?
Is this for extracting fields during a search, or before the search via props/transforms? If you are doing it in the search, what you want is the "rex" command. The rex command is used for extractions, whereas the regex command is used for filtering. Note that the named group has to wrap the part of the pattern you want captured, rather than sit empty at the end. To extract using rex you can try:
| rex field=_raw "<id[^>]*>(?<tradeId>([LAR]+)(.*?))<\/id>"
Here is the reference with examples: http://docs.splunk.com/Documentation/Splunk/7.0.2/SearchReference/Rex
If using props.conf:
[source/sourcetype/etc]
EXTRACT-tradeId = <id[^>]*>(?<tradeId>([LAR]+)(.*?))<\/id>
I'm assuming here that you want to capture what is between <id ...> and </id>. It also looks like the regex requires the value to begin with "L", "A", or "R". Please let me know if this helps. If not, could you post an example event and how you would like it extracted? Thanks!
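If it helps to check the pattern outside Splunk first, here is a minimal sketch in Python. The sample event is made up (no real event has been posted in this thread), and Python's re module needs the (?P<name>...) spelling for named groups, whereas Splunk's PCRE-style syntax also accepts (?<name>...). It compares the original attempt, where the named group is empty and placed after </id>, with the version where the named group wraps the value:

```python
import re

# Hypothetical sample event; the real log format is an assumption.
event = '<trade><id type="ref">LAR12345</id></trade>'

# Original attempt: the named group is empty and comes after </id>,
# so the pattern still matches but the group captures an empty string.
broken = re.compile(r"<id[^>]*>([LAR]+)(.*?)</id>(?P<tradeId>)")

# Suggested form: the named group wraps the text between the tags.
fixed = re.compile(r"<id[^>]*>(?P<tradeId>([LAR]+)(.*?))</id>")

broken_value = broken.search(event).group("tradeId")
fixed_value = fixed.search(event).group("tradeId")

print(repr(broken_value))  # ''
print(repr(fixed_value))   # 'LAR12345'
```

This only illustrates where the group name goes. Python's engine is not the same as Splunk's, so the final pattern should still be tested in Splunk against a real event.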
Any chance you could post some sample source data and the exact value you'd like to extract from it? Or, even better, a link to a regex101 page where you've successfully extracted what you want? Translating that into SPL will be a cinch!