Splunk Enterprise

Regex to extract fields between pipe

jacqu3sy
Path Finder

Hi,

I'm indexing events in JSON format, and I need a way to extract the pipe-delimited values in the 'Subject' field (shown below) into individual fields:

RecipientAddress: bla@bla.com
SenderAddress: fred@fred.com
Size: 201828
Status: FilteredAsSpam
Subject: 1|fdbe21c9-xxxxx|195.168.1.1|Comms@fred.com|([Ext]Hi, join us for the 10-year roundup) 12/11/2020 8:21:14 AM
ToIP: null

I'm struggling to get a regex to work, and I'm not sure whether I need to take the JSON formatting into account.

Thanks.

1 Solution

nickhills
Ultra Champion

Sorry, silly mistake in my regex (which I have now corrected). Try the following. Serves me right for not testing it!

 

rex field=Subject "(?P<number>[^\|]+)\|(?P<id>[^\|]+)\|(?P<ip>[^\|]+)\|(?P<email>[^\|]+)\|(?P<subject>[^$]+)"
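
For anyone who wants to sanity-check the pattern outside Splunk, here is a minimal Python sketch. It assumes Python's `re` module is close enough to Splunk's regex engine for this pattern (both accept the `(?P<name>...)` named-group syntax); the helper name `extract_fields` is made up for illustration, and the sample value is the Subject from the question.

```python
import re

# Same pattern as the rex command above, compiled with Python's re module.
PATTERN = re.compile(
    r"(?P<number>[^\|]+)\|(?P<id>[^\|]+)\|(?P<ip>[^\|]+)\|(?P<email>[^\|]+)\|(?P<subject>[^$]+)"
)

# Sample Subject value taken from the question.
SUBJECT = (
    "1|fdbe21c9-xxxxx|195.168.1.1|Comms@fred.com|"
    "([Ext]Hi, join us for the 10-year roundup) 12/11/2020 8:21:14 AM"
)


def extract_fields(subject):
    """Return the five named fields as a dict, or None if nothing matches."""
    match = PATTERN.search(subject)
    return match.groupdict() if match else None


fields = extract_fields(SUBJECT)
for name, value in fields.items():
    print(f"{name}: {value}")
```

One thing worth knowing: inside a character class `$` is a literal dollar sign, so `[^$]+` means "everything up to the first dollar sign", not "everything up to the end of the line". It works for the sample above, but a subject line containing a `$` would be cut short at that point; `.+` is a possible alternative for the last group if that matters for your data.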

If my comment helps, please give it a thumbs up!


jacqu3sy
Path Finder

Perfect. Works like a charm! Many thanks.


nickhills
Ultra Champion
Thanks! Don't forget to upvote too if I helped.

nickhills
Ultra Champion

I'm guessing that you have copied and pasted that example (with redaction) from the event view in search results? (please use the </> code formatter as it helps preserve formatting)

If you are applying the regex to the _raw field, you will need to account for the JSON formatting in the original event, so your regex might need to begin with something like:

 

"Subject\": ....etc..."

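To illustrate the _raw case, here is a rough Python sketch that anchors the pattern on the JSON key before capturing the pipe-delimited parts. The raw event layout (key order, spacing after the colon) is an assumption, since the original event was not posted, and the last group stops at the next double quote, so it would be cut short if the subject itself contained an escaped quote.

```python
import re

# Hypothetical raw event: the real _raw layout is an assumption.
RAW_EVENT = (
    '{"SenderAddress": "fred@fred.com", "Status": "FilteredAsSpam", '
    '"Subject": "1|fdbe21c9-xxxxx|195.168.1.1|Comms@fred.com|'
    '([Ext]Hi, join us for the 10-year roundup) 12/11/2020 8:21:14 AM", '
    '"ToIP": null}'
)

# Anchor on the JSON key and its opening quote, then capture each part.
RAW_PATTERN = re.compile(
    r'"Subject":\s*"'
    r'(?P<number>[^|]+)\|(?P<id>[^|]+)\|(?P<ip>[^|]+)\|(?P<email>[^|]+)\|'
    r'(?P<subject>[^"]+)"'
)

match = RAW_PATTERN.search(RAW_EVENT)
raw_fields = match.groupdict() if match else {}
print(raw_fields)
```

In a Splunk rex string the double quotes inside the pattern would need backslash escaping, which is what the snippet above hints at.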
 

 

However, if it's well-formed JSON (such that it shows nicely in search results) and you are doing the extraction in a search, you can use spath to pull out the fields for you, and then apply the regex to just the Subject field.

 

your search... | spath | table Subject
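
As a loose analogy for what happens here, the Python sketch below parses the JSON first and then works only on the Subject value. It is not how spath is implemented; the raw event is hypothetical, and the sketch uses `str.split` with a split limit rather than a regex, purely to show that once Subject is isolated the JSON formatting no longer matters.

```python
import json

# Hypothetical raw event: the real _raw layout is an assumption.
RAW_EVENT = (
    '{"Status": "FilteredAsSpam", '
    '"Subject": "1|fdbe21c9-xxxxx|195.168.1.1|Comms@fred.com|'
    '([Ext]Hi, join us for the 10-year roundup) 12/11/2020 8:21:14 AM", '
    '"ToIP": null}'
)

# Step 1: parse the JSON and isolate the Subject value.
event = json.loads(RAW_EVENT)
subject_value = event["Subject"]

# Step 2: split on the first four pipes only, so any pipe inside the
# free-text subject line stays in the last field.
FIELD_NAMES = ["number", "id", "ip", "email", "subject"]
parts = subject_value.split("|", 4)
parsed = dict(zip(FIELD_NAMES, parts))

for name, value in parsed.items():
    print(f"{name}: {value}")
```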

 

From here, assuming the pipe-delimited fields are consistent, a simple rex command should work:

 

rex field=Subject "(?P<number>[^\|]+)\|(?P<id>[^\|]+)\|(?P<ip>[^\|]+)\|(?P<email>[^\|]+)\|(?P<subject>[^$]+)"

 

 


jacqu3sy
Path Finder

So yes, it's well-formed JSON, and I'm running the query from a search, where the Subject field is being extracted as expected.

I tried your regex and it didn't seem to like it. The first field should be the 1 after 'Subject:' and before the first pipe, the second field should be the message ID between the first and second pipes, and so on.

Whenever I tried doing it myself, it kept grabbing just the first character before each of the pipes, kinda ruining things!

