Solved: Re: Help with RegEX

ahogbin · ‎03-23-2015

Hello to all..

I am attempting (partially successfully so far) to extract some text. The problem I am having is that it is also extracting unwanted text past the value I am (obviously incorrectly) specifying as the end point.

The string I am trying to extract is (in this example) ALEXANDRIA
<mm:SuburbName>ALEXANDRIA</mm:SuburbName> (attempting to extract the text between > and <)

The expression I am using is
rex field=_raw "\<mm\:SuburbName+\>(?<Suburb>\S+)\<"

However, when I run the search, I also get the text that follows the suburb in the returned value, as shown below:
ALEXANDRIANSW2015AUAustralia

As I say it is sort of working but I am unsure as to how to instruct the expression to stop at the < after the suburb name.

Any help or pointers will be gratefully accepted.
---update--
The input string is

<mm:SuburbName>ALEXANDRIA</mm:SuburbName>

The suburb will vary

The output I am getting is

ALEXANDRIA</mm:SuburbName><mm:StateOrProvinceCode>NSW</mm:StateOrProvinceCode><mm:PostalCode>2015</mm:PostalCode><mm:CountryCode>AU</mm:CountryCode><mm:CountryName>Australia</mm:CountryName>

Cheers all.

Alastair
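A note for later readers on why the capture runs on: `\S+` is greedy, and the event contains no whitespace between tags, so it consumes everything and then backtracks only as far as the last `<`. Below is a minimal sketch of that behaviour using Python's `re` module rather than Splunk itself. The shortened `EVENT` string and the `capture` helper are assumptions made for illustration, and Python spells named groups `(?P<name>...)` where the `rex` examples in this thread use `(?<name>...)`.

```python
import re

# Shortened copy of the event from the update above; assumed to be one line
# with no whitespace between the tags.
EVENT = (
    "<mm:SuburbName>ALEXANDRIA</mm:SuburbName>"
    "<mm:StateOrProvinceCode>NSW</mm:StateOrProvinceCode>"
    "<mm:PostalCode>2015</mm:PostalCode>"
)


def capture(pattern, text):
    """Return the 'suburb' named group, or None when the pattern does not match."""
    match = re.search(pattern, text)
    return match.group("suburb") if match else None


# \S+ is greedy: it swallows every non-whitespace character, then backtracks
# only as far as the LAST "<" in the event.
greedy = capture(r"<mm:SuburbName>(?P<suburb>\S+)<", EVENT)

# \w+ cannot match "<", so the capture has to end at the first closing tag.
bounded = capture(r"<mm:SuburbName>(?P<suburb>\w+)<", EVENT)

print(greedy)   # the suburb plus every following tag up to the last "<"
print(bounded)  # only the suburb name
```

Restricting the character class so it cannot cross a `<` is what stops the match at the suburb name.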

ramdaspr · ‎03-23-2015

Try with this. Seems to work for the same data you have.

rex field=t "\<mm\:SuburbName\>(?<suburb>\w+)\<.*"

ahogbin · ‎03-23-2015

Fantastic... thank you very much for your help and sorry for the confusion in getting the required data posted 🙂

Cheers.

Alastair

ramdaspr · ‎03-23-2015

Missed something quite important, the suburb name could include a space which the above answer will not accept as a valid input.

rex field=_raw "\<mm\:SuburbName\>(?<suburb>[a-zA-Z ]*)\<.*"

This should work better as a solution. There is a space between Z and ] to allow whitespace as an acceptable character in the suburb name.
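To illustrate the difference, here is a rough sketch in Python's `re` module (not Splunk; Python writes named groups as `(?P<name>...)`). The sample suburb strings, the pattern constant names, and the `extract_suburb` helper are hypothetical, chosen only to exercise each pattern. The third pattern, `[^<]+`, is not from this thread; it is an alternative that accepts anything up to the next `<`, on the assumption that suburb names may also contain apostrophes or hyphens.

```python
import re


def extract_suburb(pattern, event):
    """Return the 'suburb' named group, or None when the pattern does not match."""
    match = re.search(pattern, event)
    return match.group("suburb") if match else None


# The three candidate patterns (names are illustrative only).
WORD_ONLY = r"<mm:SuburbName>(?P<suburb>\w+)<"
LETTERS_AND_SPACES = r"<mm:SuburbName>(?P<suburb>[a-zA-Z ]*)<"
ANY_BUT_TAG_START = r"<mm:SuburbName>(?P<suburb>[^<]+)<"

# Hypothetical sample events.
two_words = "<mm:SuburbName>ST PETERS</mm:SuburbName>"
apostrophe = "<mm:SuburbName>O'CONNOR</mm:SuburbName>"

# \w+ stops at the space, the next character is not "<", so nothing matches.
print(extract_suburb(WORD_ONLY, two_words))            # None
# Allowing a space in the class lets multi-word names through.
print(extract_suburb(LETTERS_AND_SPACES, two_words))   # ST PETERS
# Letters and spaces alone still reject an apostrophe.
print(extract_suburb(LETTERS_AND_SPACES, apostrophe))  # None
# A negated class accepts any character except the start of the next tag.
print(extract_suburb(ANY_BUT_TAG_START, apostrophe))   # O'CONNOR
```

Which class fits best depends on what the real suburb values look like, so it is worth checking against a sample of actual events.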

ahogbin · ‎03-24-2015

Ah.. yes that is better.. I was wondering why there were no suburbs appearing with more than one name component.
Thank you so much again for all your help.
Cheers
Alastair

leathej1 · ‎03-23-2015

Let me introduce you to my personal savior: RegEx101.com

(?i)SuburbName\>(?P\w+)\<

ahogbin · ‎03-23-2015

Thank you for the site link... this will definitely come in handy.

Cheers,
Alastair

ppablo · ‎03-23-2015

In addition to @leathej1's resource, this previous Answers post has a bunch of great regex sites as well in case you're interested.
http://answers.splunk.com/answers/153171/is-there-any-online-regex-tool-to-create-regular-e.html

ahogbin · ‎03-24-2015

An excellent page full of rather good resources.
Thank you for providing this.
Cheers,

Alastair

ramdaspr · ‎03-23-2015

Can you share a sample of the data set you are trying to work with?

Please enclose the example within the code sample (5th button on the textbox toolbox) so that the brackets aren't removed.

ahogbin · ‎03-23-2015

Hello...

Sorry was just trying to work out how to do that 🙂

The expression I am using is rex field=_raw "\(?\S+)\<" and the output I am getting is
ALEXANDRIANSW2015AUAustralia

Hope this is as needed

ramdaspr · ‎03-23-2015

We would need to see the input event so that we can help with the regex query.

ahogbin · ‎03-23-2015

`rex field=_raw  "\<mm\:SuburbName+\>(?<Suburb>\S+)\<"`

ahogbin · ‎03-23-2015

The input string is

<mm:SuburbName>ALEXANDRIA</mm:SuburbName>

The suburb will vary

The output I am getting is

ALEXANDRIA</mm:SuburbName><mm:StateOrProvinceCode>NSW</mm:StateOrProvinceCode><mm:PostalCode>2015</mm:PostalCode><mm:CountryCode>AU</mm:CountryCode><mm:CountryName>Australia</mm:CountryName>

ahogbin · ‎03-23-2015

So I am trying to extract the text string between > and < in this case ALEXANDRIA

ahogbin · ‎03-23-2015

Arrgghh.. will try again

RegEx = "rex field=_raw "\(?\S+)\<""

Output

"ALEXANDRIANSW2015AUAustralia"

ahogbin · ‎03-23-2015

Sorry... cannot get the RegEx string to display. Have tried using both and "`" but the string keeps getting chopped off.

Any other suggestions ?
