Splunk Enterprise

Regex to extract fields between pipe

jacqu3sy
Path Finder

Hi,

I'm indexing events in JSON format, and I need a way to extract the pipe-delimited values in the 'Subject' field shown below into individual fields:

RecipientAddress: bla@bla.com
SenderAddress: fred@fred.com
Size: 201828
Status: FilteredAsSpam
Subject: 1|fdbe21c9-xxxxx|195.168.1.1|Comms@fred.com|([Ext]Hi, join us for the 10-year roundup) 12/11/2020 8:21:14 AM
ToIP: null

I'm struggling to get a regex to work, and I'm not sure whether I need to take the JSON formatting into account.

Thanks.


nickhills
Ultra Champion

Sorry, silly mistake in my regex (which I have now corrected). Try the following. Serves me right for not testing it!

rex field=Subject "(?P<number>[^\|]+)\|(?P<id>[^\|]+)\|(?P<ip>[^\|]+)\|(?P<email>[^\|]+)\|(?P<subject>[^$]+)"

If my comment helps, please give it a thumbs up!


jacqu3sy
Path Finder

Perfect. Works like a charm! Many thanks.


nickhills
Ultra Champion
Thanks, don't forget to upvote too if I helped!

nickhills
Ultra Champion

I'm guessing that you have copied and pasted that example (with redaction) from the event view in search results? (please use the </> code formatter as it helps preserve formatting)

If you are applying the regex to the _raw field, you will need to account for the JSON formatting that was in the original event, so your regex might need to begin with something like:

"Subject\": ....etc..."
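As a rough sketch of what matching against _raw might look like, the search below builds a single test event with makeresults and anchors the pattern on the JSON key. The sample event, the whitespace after the colon, and the assumption that the Subject value contains no escaped double quotes are all guesses about the raw data, so adjust them to match your events:

```spl
| makeresults
| eval _raw="{\"Status\": \"FilteredAsSpam\", \"Subject\": \"1|fdbe21c9-xxxxx|195.168.1.1|Comms@fred.com|([Ext]Hi, join us for the 10-year roundup) 12/11/2020 8:21:14 AM\", \"ToIP\": null}"
| rex field=_raw "\"Subject\":\s*\"(?<number>[^|]+)\|(?<id>[^|]+)\|(?<ip>[^|]+)\|(?<email>[^|]+)\|(?<subject>[^\"]+)\""
| table number id ip email subject
```

The last group stops at the closing quote of the JSON string, which is why a Subject containing an escaped quote would need a more careful pattern.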

 

 

However, if it's well-formed JSON (such that it shows nicely in search results) and you are doing the extraction in a search, you can use spath to pull out the fields for you, and then apply the regex just to the Subject field.

your search...|spath|table Subject
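If only the Subject field is needed, spath can also be pointed at a single key instead of extracting everything. A minimal sketch, using a made-up test event in place of the real search:

```spl
| makeresults
| eval _raw="{\"SenderAddress\": \"fred@fred.com\", \"Subject\": \"1|fdbe21c9-xxxxx|195.168.1.1|Comms@fred.com|([Ext]Hi, join us for the 10-year roundup) 12/11/2020 8:21:14 AM\"}"
| spath input=_raw path=Subject output=Subject
| table Subject
```

In a real search the makeresults and eval lines would be replaced by the base search. input=_raw is the default and is only spelled out here for clarity.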

 

From here, assuming the pipe-delimited fields are consistent, a simple rex command should work:

rex field=Subject "(?P<number>[^\|]+)\|(?P<id>[^\|]+)\|(?P<ip>[^\|]+)\|(?P<email>[^\|]+)\|(?P<subject>[^$]+)"
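To try the extraction without touching indexed data, the pattern can be run against a hand-built Subject value. This is only a sketch: the makeresults and eval lines stand in for the real search, and the last group uses .+ in place of [^$]+. Inside a character class, $ is a literal dollar sign rather than an end-of-line anchor, so [^$]+ would stop early on a subject line that contains a $.

```spl
| makeresults
| eval Subject="1|fdbe21c9-xxxxx|195.168.1.1|Comms@fred.com|([Ext]Hi, join us for the 10-year roundup) 12/11/2020 8:21:14 AM"
| rex field=Subject "^(?<number>[^|]+)\|(?<id>[^|]+)\|(?<ip>[^|]+)\|(?<email>[^|]+)\|(?<subject>.+)$"
| table number id ip email subject
```

With this sample value the result should be number=1, id=fdbe21c9-xxxxx, ip=195.168.1.1, email=Comms@fred.com, and the remaining text in subject. Field names are case-sensitive, so the new subject field does not overwrite the original Subject.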

 

 


jacqu3sy
Path Finder

So yes, it's well-formed JSON, and I'm running the query from a search where the Subject field is being extracted as expected.

I tried your regex and it didn't seem to like it. The first field should be the 1 after 'Subject:' and before the first pipe, the second field should be the message ID between the first and second pipes, and so on.

Whenever I tried doing it myself, it kept trying to grab the first character before any of the pipes, kinda ruining things!

