
Regex to Extract String That is Between 2 Fixed Words

skoelpin
SplunkTrust

I have 61 events that contain a string between '<a:OrderMessage>' and '</a:OrderMessage>'.

There are 3-4 different phrases that appear between those two fixed strings, so I need a regular expression that picks up whatever phrase is between '<a:OrderMessage>' and '</a:OrderMessage>'.

Example:

<a:OrderMessage>Missed Delivery cut-off, Redated</a:OrderMessage>

Phrase = "Missed Delivery cut-off, Redated"

0 Karma
1 Solution

stephanefotso
Motivator

You can also test this:

 ...|rex "(?i):OrderMessage>(?P<phrase>[^<]+)"|table phrase
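
For anyone who wants to check the pattern outside Splunk, here is a minimal Python sketch. Python's re module accepts the same (?P<name>...) named-group syntax used in the rex above, but it is not Splunk's regex engine, so treat this only as a sanity check. The sample event is the one from the question, and the variable names are made up for illustration.

```python
import re

# Sample event text taken from the question (not real Splunk output).
event = "<a:OrderMessage>Missed Delivery cut-off, Redated</a:OrderMessage>"

# Same pattern as the rex above: match ":OrderMessage>" case-insensitively,
# then capture everything up to the next "<" into the group named "phrase".
pattern = re.compile(r"(?i):OrderMessage>(?P<phrase>[^<]+)")

match = pattern.search(event)
phrase = match.group("phrase") if match else None
print(phrase)
```

The group name plays the role of the field name that rex creates.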
SGF


altink
Builder

What about extracting fields from the following single-event XML?

<CONTROL> 
<VLN_ID>1001</VLN_ID>
<VLN_NAME>vulnerability name 001</VLN_NAME>
<VLN_SEVERITY>2</VLN_SEVERITY>
<VLN_CATEGORY>Audit</VLN_CATEGORY> 
<VLN_SCAN_CODE>0</VLN_SCAN_CODE>
<VLN_SCAN_MESSAGE>successfully completed</VLN_SCAN_MESSAGE>
<VLN_CTRL_FIND>0</VLN_CTRL_FIND>
<VLN_CTRL_SUMMARY>ALDO1, ALTIN1</VLN_CTRL_SUMMARY>
<VLN_CTRL_OUTPUT>xxxxxxxxxxxxxx xxxxxxxxxxxxxx xxxxxxxxxx</VLN_CTRL_OUTPUT>
</CONTROL>

Without REX or SPATH, how can one get the field value that sits between two unique strings (the XML field tags)?

regards
Altin

0 Karma

richgalloway
SplunkTrust

You're adding to an old discussion that has an accepted answer. It's unlikely your question will be seen. Please post a new question.

---
If this reply helps you, Karma would be appreciated.
0 Karma

stephanefotso
Motivator

You can also test this:

 ...|rex "(?i):OrderMessage>(?P<phrase>[^<]+)"|table phrase
SGF

skoelpin
SplunkTrust

Thanks for the help. I put in your query and I'm getting an 'Error in rex command'.

Here's my search

...| rex "\<a:OrderMessage\>(?P<*>.*?)\<\/a:OrderMessage\>"

I put a '*' where you had 'phrase'. I have 4-5 phrases/strings that will show up so I can't hardcode the phrase/string in the regex. Any other ideas?

0 Karma

stephanefotso
Motivator

Why? You don't have to change anything in the search query. Just replace the ... before the |rex with your base search. The regex will extract all the values you need, create a field named phrase, and put those values in it. Test it and let me know if you still have issues.

SGF
0 Karma

skoelpin
SplunkTrust

Yes exactly what I was looking for. The only issue now is that the date after "Missed Delivery cut-off, Redated to" can change and I only want to grab that phrase once and have it count each instance, regardless of what the date is.

Can you help with a new regex that counts only the different phrases and ignores the dates?

Here is what I'm seeing

"Missed Delivery cut-off, Redated to (01/15/15)"
"Missed Delivery cut-off, Redated to (01/19/15)"
"Missed Delivery cut-off, Redated to (01/25/15)"

I want it to capture only this string and count the occurrences:

 "Missed Delivery cut-off, Redated to" 
0 Karma

stephanefotso
Motivator

OK, here you go:

...|rex "(?i):OrderMessage>(?P<phrase>[^(<]+)"|table phrase

For the count, you can use the stats command instead of table, depending on what you want.
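
To see what the added "(" in the character class does, here is a rough Python sketch that applies the same pattern to the three sample messages quoted above and tallies the results. It assumes Python's re module treats this pattern the same way rex does. The Counter is only a stand-in for a stats count, not how Splunk computes it.

```python
import re
from collections import Counter

# Sample events built from the three messages quoted earlier in the thread.
events = [
    "<a:OrderMessage>Missed Delivery cut-off, Redated to (01/15/15)</a:OrderMessage>",
    "<a:OrderMessage>Missed Delivery cut-off, Redated to (01/19/15)</a:OrderMessage>",
    "<a:OrderMessage>Missed Delivery cut-off, Redated to (01/25/15)</a:OrderMessage>",
]

# Same pattern as the rex above: the capture stops at the first "(" or "<".
pattern = re.compile(r"(?i):OrderMessage>(?P<phrase>[^(<]+)")

counts = Counter()
for event in events:
    match = pattern.search(event)
    if match:
        counts[match.group("phrase")] += 1

for phrase, n in counts.items():
    print(repr(phrase), n)
```

Note that the captured value keeps the trailing space that precedes the parenthesis.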

SGF

richgalloway
SplunkTrust

Something like this?

... | rex "\<a:OrderMessage\>(?P<Phrase>.*?)\<\/a:OrderMessage\>" | ...
---
If this reply helps you, Karma would be appreciated.
0 Karma

skoelpin
SplunkTrust

Thanks for the help. I put in your query and I'm getting an 'Error in rex command'.

Here's my search

... | rex "\<a:OrderMessage\>(?P<*>.*?)\<\/a:OrderMessage\>"

I put a '*' where you had 'phrase'. I have 4-5 phrases/strings that will show up so I can't hardcode the phrase/string in the regex. Any other ideas?

0 Karma

richgalloway
SplunkTrust

Asterisks are not valid there. The word 'phrase' is a field declaration, not a hardcoding. When the rex command executes, it will store the string it finds between the two fixed strings in the field called 'phrase' which you then can use in other SPL commands.
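
The same point can be illustrated outside Splunk with a short Python sketch, on the assumption that Python's named-group rules are close enough to rex for this purpose. The name inside (?P<...>) only labels the capture, and an asterisk is rejected as a name. The event text comes from the question; the variable names are illustrative.

```python
import re

event = "<a:OrderMessage>Missed Delivery cut-off, Redated</a:OrderMessage>"

# "phrase" is only the name of the capture group. The captured text is
# whatever sits between the two tags in each event.
named = re.compile(r"<a:OrderMessage>(?P<phrase>.*?)</a:OrderMessage>")
fields = named.search(event).groupdict()
print(fields)

# An asterisk is not a legal group name, so this pattern fails to compile.
try:
    re.compile(r"<a:OrderMessage>(?P<*>.*?)</a:OrderMessage>")
    star_is_valid = True
except re.error:
    star_is_valid = False
print(star_is_valid)
```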

---
If this reply helps you, Karma would be appreciated.
0 Karma

skoelpin
SplunkTrust

So I need to create a field called 'phrase'?

0 Karma

richgalloway
SplunkTrust

The rex command will create the field. Please try the regex string I gave you.

---
If this reply helps you, Karma would be appreciated.
0 Karma

skoelpin
SplunkTrust

This worked almost perfectly. The only issue now is this:

One of the phrases is "Missed Delivery cut-off, Redated to (01/15/15)"

There will be different dates after 'Missed Delivery cut-off, Redated to', and the regex you gave me treats each phrase with a different date as a separate value.

So I would like to count 'Missed Delivery cut-off, Redated to' as one phrase regardless of what the date is.

0 Karma

skoelpin
SplunkTrust

I basically want to ignore numbers and only identify letters

0 Karma

richgalloway
SplunkTrust

Try this:

... | rex "\<a:OrderMessage\>(?P<Phrase>.*?)\<\/a:OrderMessage\>" | rex field=Phrase "(?P<truncatedPhrase>[^\d\(]+)[\d\(]" | stats count by truncatedPhrase
---
If this reply helps you, Karma would be appreciated.
0 Karma

skoelpin
SplunkTrust

This is very very close to what I need. It successfully counts the number of instances and ignores all the numbers.

Only 2 issues now,

1) I have 4 different phrases/strings and your search is only counting 3 of them (the 4th one, FRD, is missing).

2) One of the phrases is "Pulled ship date of 04/17/15 on Express because Customer Master flagged as HLD." and it's currently showing up as "Pulled ship date of".. I need it to show "Flagged as HLD"

Here are the 4 phrases/strings

1) Existing account, Changed phone from 1111111111 to 2222222222
2) Missed Delivery cut-off, Redated to 04/18/2015
3) Pulled ship date of 04/17/15 on Express because Customer Master flagged as HLD.
4) Pulled ship date of 04/17/15 on Express because Customer Master flagged as FRD.
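
One possible explanation for the missing fourth phrase, sketched in Python on the assumption that Python's re module treats the truncating pattern the same way rex does: phrases 3 and 4 are identical up to the first digit, so they end up under the same truncated value instead of one of them being dropped.

```python
import re
from collections import Counter

# The four messages listed above, as they would appear in the Phrase field.
phrases = [
    "Existing account, Changed phone from 1111111111 to 2222222222",
    "Missed Delivery cut-off, Redated to 04/18/2015",
    "Pulled ship date of 04/17/15 on Express because Customer Master flagged as HLD.",
    "Pulled ship date of 04/17/15 on Express because Customer Master flagged as FRD.",
]

# Same pattern as the second rex: capture up to the first digit or "(".
truncate = re.compile(r"(?P<truncatedPhrase>[^\d\(]+)[\d\(]")

counts = Counter()
for phrase in phrases:
    match = truncate.search(phrase)
    if match:
        counts[match.group("truncatedPhrase")] += 1

for key, n in counts.items():
    print(repr(key), n)
```

Both "Pulled ship date of" messages are cut before the HLD/FRD part, which would also explain the second issue.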

0 Karma

richgalloway
SplunkTrust

I have not been able to produce a single regex string that will match all four of those strings. Perhaps you can use sed to replace numbers with another character.

... | rex "\<a:OrderMessage\>(?P<Phrase>.*?)\<\/a:OrderMessage\>" | rex field=Phrase mode=sed "s/\d/x/g" | stats count by Phrase
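
A rough Python equivalent of the sed idea, for checking the effect outside Splunk: re.sub stands in for mode=sed and Counter stands in for stats count. The first four messages are the ones listed in the question. The fifth, with a different date, is made up here only to show that differing dates collapse into one value.

```python
import re
from collections import Counter

phrases = [
    "Existing account, Changed phone from 1111111111 to 2222222222",
    "Missed Delivery cut-off, Redated to 04/18/2015",
    "Pulled ship date of 04/17/15 on Express because Customer Master flagged as HLD.",
    "Pulled ship date of 04/17/15 on Express because Customer Master flagged as FRD.",
    # Hypothetical extra event (not from the thread) with a different date.
    "Missed Delivery cut-off, Redated to 04/19/2015",
]

# Equivalent of s/\d/x/g: replace every digit with the letter "x".
masked = [re.sub(r"\d", "x", phrase) for phrase in phrases]

counts = Counter(masked)
for key, n in counts.items():
    print(n, key)
```

Because the whole message is kept, the HLD and FRD variants stay distinct while the dates and phone numbers no longer split the counts.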
---
If this reply helps you, Karma would be appreciated.
0 Karma