Splunk Search

How do I handle fields with no value or a blank space in a rex field extraction so they show up as null?

jedatt01
Builder

I have a pipe-delimited data source in which some fields contain no data, or only a blank space. I've written a regex that extracts all the fields, but is there a way to make the fields with no value or a blank space show up as null so I can handle them with fillnull? Currently they show up as blank, such as event="", or with a space, like subject=" ".

Raw data:

2016-02-18 09:35:09,491|OAuth| USER1| 1.1.1.1 | |USER1| OAuth20| pingfedserver4| AS| success| | | 18 
2016-02-18 09:35:09,267|SSO| USER2| 2.2.2.2 | | https://testsite.test.com| SAML20| pingfedserver4| IdP| success| testuser| | 92 

My rex statement:

rex "\d*\:\d*:\d*\,\d*\|(?<event>[^\|]*)\|(?<subject>[^\|]*)\|(?<ip>[^\|]*)\|(?<app>[^\|]*)\|(?<connectionid>[^\|]*)\|(?<protocol> [^\|]*)\|(?<host>[^\|]*)\|(?<role>[^\|]*)\|(?<status>[^\|]*)\|(?<adapterid>[^\|]*)\|(?<description>[^\|]*)\|(?<responsetime>[^\|]*)"
1 Solution

somesoni2
Revered Legend

Try like this

Your base search | rex "^[^\|]+\|(?<event>[^\|]*)\|\s(?<subject>[^\|]*)\|\s(?<ip>[^\|]*)\|\s(?<app>[^\|]*)\|\s(?<connectionid>[^\|]*)\|\s(?<protocol>[^\|]*)\|\s(?<host>[^\|]*)\|\s(?<role>[^\|]*)\|\s(?<status>[^\|]*)\|\s(?<adapterid>[^\|]*)\|\s(?<description>[^\|]*)\|\s(?<responsetime>.\d*)"  | replace ""  with "NA"

jedatt01
Builder

Worked perfectly! I didn't know about the replace command.

javiergn
Super Champion

Hi,

I noticed a typo in your regex: there's a space after the protocol group name that shouldn't be there.

Anyway, with regard to your question, you could combine foreach with trim and isnull, and it should give you what you are looking for:

| foreach * [eval <<FIELD>> = if(isnull(trim('<<FIELD>>')), "NULL", trim('<<FIELD>>'))]

Let me know if that helps

jedatt01
Builder

That didn't quite handle it, but I was able to get rid of the spaces by changing my rex command a bit.

rex "\d*\:\d*:\d*\,\d*\|(?<event>[^\|]*)\|\s(?<subject>[^\|]*)\|\s(?<ip>[^\|]*)\|\s(?<app>[^\|]*)\|\s(?<connectionid>[^\|]*)\|\s(?<protocol>[^\|]*)\|\s(?<host>[^\|]*)\|\s(?<role>[^\|]*)\|\s(?<status>[^\|]*)\|\s(?<adapterid>[^\|]*)\|\s(?<description>[^\|]*)\|\s(?<responsetime>.\d*)"

Now I just need to be able to handle where myfield=""

Any ideas?
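
One possible way to handle the remaining empty strings, sketched on the assumption that the twelve fields have already been extracted by the rex above: trim each field, use nullif to turn anything that ends up empty into a real null, then let fillnull supply the placeholder. The "NA" value is only an illustration, and the field list is simply the group names from the rex.

```spl
| foreach event subject ip app connectionid protocol host role status adapterid description responsetime
    [eval <<FIELD>> = nullif(trim('<<FIELD>>'), "")]
| fillnull value="NA" event subject ip app connectionid protocol host role status adapterid description responsetime
```

The comparison against "" is the important part: an empty string is not null, so isnull(trim(...)) on its own does not catch a value like myfield="". Listing the fields explicitly, rather than using foreach *, keeps the rewrite away from fields that were not part of the extraction.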
