Hello to all..
I am attempting (partially successfully so far) to extract some text. The problem is that the extraction also picks up unwanted text past the value I am (obviously incorrectly) specifying as the end point.
The string I am trying to extract is (in this example) ALEXANDRIA
ALEXANDRIA (attempting to extract the text between > and <)
The expression I am using is
rex field=_raw "\(?\S+)\<"
However, when I run the search, I also get the text that follows it in the returned value below:
ALEXANDRIANSW2015AUAustralia
As I say, it is sort of working, but I am unsure how to instruct the expression to stop at the < after the suburb name.
Any help or pointers will be gratefully accepted.
---update--
The input string is
<mm:SuburbName>ALEXANDRIA</mm:SuburbName>
The suburb will vary
The output I am getting is
ALEXANDRIA</mm:SuburbName><mm:StateOrProvinceCode>NSW</mm:StateOrProvinceCode><mm:PostalCode>2015</mm:PostalCode><mm:CountryCode>AU</mm:CountryCode><mm:CountryName>Australia</mm:CountryName>
Cheers all.
Alastair
Try with this. Seems to work for the same data you have.
rex field=t "\<mm\:SuburbName\>(?<suburb>\w+)\<.*"
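For anyone wondering why the original `\S+` ran past the closing tag: `\S` matches any non-whitespace character, including `<` and `>`, and `+` is greedy, so the capture extends to the last `<` in the unbroken run of tags. `\w+` stops at the first character that is not a letter, digit, or underscore. Here is a rough sketch of the difference in Python, on the assumption that its `re` module behaves like Splunk's regex engine for these simple patterns (Python needs `(?P<name>...)` for named groups, and the sample event is pieced together from the strings posted in the question):

```python
import re

# Sample event assembled from the input and output quoted in the question.
event = (
    "<mm:SuburbName>ALEXANDRIA</mm:SuburbName>"
    "<mm:StateOrProvinceCode>NSW</mm:StateOrProvinceCode>"
    "<mm:PostalCode>2015</mm:PostalCode>"
    "<mm:CountryCode>AU</mm:CountryCode>"
    "<mm:CountryName>Australia</mm:CountryName>"
)

# \S+ is greedy and happily consumes '<' and '>', so it overruns the tag.
greedy = re.search(r"<mm:SuburbName>(?P<suburb>\S+)<", event)

# \w+ cannot match '<', so the capture ends at the closing tag.
word_only = re.search(r"<mm:SuburbName>(?P<suburb>\w+)<", event)

greedy_value = greedy.group("suburb")
word_value = word_only.group("suburb")

print(greedy_value)  # starts with ALEXANDRIA</mm:SuburbName>... and keeps going
print(word_value)    # ALEXANDRIA
```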
Fantastic... thank you very much for your help and sorry for the confusion in getting the required data posted 🙂
Cheers.
Alastair
I missed something quite important: the suburb name could include a space, which the pattern above will not accept as valid input.
rex field=_raw "\<mm\:SuburbName\>(?<suburb>[a-zA-Z ]*)\<.*"
This should work better as a solution: the space between Z and ] allows a space as an acceptable character in the suburb name.
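A quick way to check the multi-word case is the Python sketch below. It makes the same assumption that Python's `re` matches Splunk's behaviour for these patterns. Only ALEXANDRIA comes from this thread; the other two suburb values are sample data added for illustration. It also tries a negated character class, `[^<]+`, as a possible alternative that accepts anything up to the next `<`, which may help if names contain apostrophes or hyphens.

```python
import re

# Illustrative events; only ALEXANDRIA comes from the thread.
events = [
    "<mm:SuburbName>ALEXANDRIA</mm:SuburbName>",
    "<mm:SuburbName>SURRY HILLS</mm:SuburbName>",
    "<mm:SuburbName>O'CONNOR</mm:SuburbName>",
]

# Letters and spaces only, as in the answer above.
letters_and_spaces = re.compile(r"<mm:SuburbName>(?P<suburb>[a-zA-Z ]*)<")

# Alternative: any run of characters that are not '<'.
up_to_next_tag = re.compile(r"<mm:SuburbName>(?P<suburb>[^<]+)<")


def extract(pattern, text):
    """Return the captured suburb, or None when the pattern does not match."""
    match = pattern.search(text)
    return match.group("suburb") if match else None


class_results = [extract(letters_and_spaces, e) for e in events]
negated_results = [extract(up_to_next_tag, e) for e in events]

print(class_results)    # ['ALEXANDRIA', 'SURRY HILLS', None]
print(negated_results)  # ['ALEXANDRIA', 'SURRY HILLS', "O'CONNOR"]
```

If the negated class suits your data, the rex form would presumably be `rex field=_raw "\<mm\:SuburbName\>(?<suburb>[^<]+)\<"`.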
Ah.. yes that is better.. I was wondering why there were no suburbs appearing with more than one name component.
Thank you so much again for all your help.
Cheers
Alastair
Let me introduce you to my personal savior: RegEx101.com
(?i)SuburbName\>(?P\w+)\<
Thank you for the site link... this will definitely come in handy.
Cheers,
Alastair
In addition to @leathej1's resource, this previous Answers post has a bunch of great regex sites as well in case you're interested.
http://answers.splunk.com/answers/153171/is-there-any-online-regex-tool-to-create-regular-e.html
An excellent page full of rather good resources.
Thank you for providing this.
Cheers,
Alastair
Can you share a sample of the data set you are trying to work with?
Please enclose the example in a code sample (the 5th button on the text box toolbar) so that the brackets aren't removed.
Hello...
Sorry was just trying to work out how to do that 🙂
The expression I am using is rex field=_raw "\(?\S+)\<"
and the output I am getting is
ALEXANDRIANSW2015AUAustralia
Hope this is as needed
We would need to see the input event so that we can help with the regex query.
`rex field=_raw "\<mm\:SuburbName+\>(?<Suburb>\S+)\<"`
The input string is
<mm:SuburbName>ALEXANDRIA</mm:SuburbName>
The suburb will vary
The output I am getting is
ALEXANDRIA</mm:SuburbName><mm:StateOrProvinceCode>NSW</mm:StateOrProvinceCode><mm:PostalCode>2015</mm:PostalCode><mm:CountryCode>AU</mm:CountryCode><mm:CountryName>Australia</mm:CountryName>
So I am trying to extract the text string between > and < in this case ALEXANDRIA
Arrgghh.. will try again
RegEx = "rex field=_raw "\(?\S+)\<"
"
Output
"ALEXANDRIANSW2015AUAustralia
"
Sorry... cannot get the RegEx string to display. Have tried using both and "`" but the string keeps getting chopped off.
Any other suggestions ?