Splunk Search

How do I handle fields with no value or a blank space in a rex field extraction so they show up as null?

jedatt01
Builder

I have a data source that is pipe delimited, but some of the fields contain no data or even a blank space. I've created a regex expression that is able to extract all the fields, but is there a way to have the fields with no value or a blank space show up as null so I can handle them with fillnull? Currently the show up as blank such as event="" or with a space like subject=" "

Raw data:

2016-02-18 09:35:09,491|OAuth| USER1| 1.1.1.1 | |USER1| OAuth20| pingfedserver4| AS| success| | | 18 
2016-02-18 09:35:09,267|SSO| USER2| 2.2.2.2 | | https://testsite.test.com| SAML20| pingfedserver4| IdP| success| testuser| | 92 

My rex statement:

rex "\d*\:\d*:\d*\,\d*\|(?<event>[^\|]*)\|(?<subject>[^\|]*)\|(?<ip>[^\|]*)\|(?<app>[^\|]*)\|(?<connectionid>[^\|]*)\|(?<protocol> [^\|]*)\|(?<host>[^\|]*)\|(?<role>[^\|]*)\|(?<status>[^\|]*)\|(?<adapterid>[^\|]*)\|(?<description>[^\|]*)\|(?<responsetime>[^\|]*)"
0 Karma
1 Solution

somesoni2
Revered Legend

Try like this

Your base search | rex "^[^\|]+\|(?<event>[^\|]*)\|\s(?<subject>[^\|]*)\|\s(?<ip>[^\|]*)\|\s(?<app>[^\|]*)\|\s(?<connectionid>[^\|]*)\|\s(?<protocol>[^\|]*)\|\s(?<host>[^\|]*)\|\s(?<role>[^\|]*)\|\s(?<status>[^\|]*)\|\s(?<adapterid>[^\|]*)\|\s(?<description>[^\|]*)\|\s(?<responsetime>.\d*)"  | replace ""  with "NA"

View solution in original post

somesoni2
Revered Legend

Try like this

Your base search | rex "^[^\|]+\|(?<event>[^\|]*)\|\s(?<subject>[^\|]*)\|\s(?<ip>[^\|]*)\|\s(?<app>[^\|]*)\|\s(?<connectionid>[^\|]*)\|\s(?<protocol>[^\|]*)\|\s(?<host>[^\|]*)\|\s(?<role>[^\|]*)\|\s(?<status>[^\|]*)\|\s(?<adapterid>[^\|]*)\|\s(?<description>[^\|]*)\|\s(?<responsetime>.\d*)"  | replace ""  with "NA"

jedatt01
Builder

Worked perfectly! I didn't know about the replace command.

0 Karma

javiergn
Super Champion

Hi,

I've noticed a typo in your regex. There's a space after the protocol label that it shouldn't be there.

Anyway, with regards to your questions, you could combine foreach with trim and isnull and it should give you what you are looking for:

| foreach * [eval <<FIELD>> = if(isnull(trim('<<FIELD>>')), "NULL", trim('<<FIELD>>'))]

Let me know if that helps

0 Karma

jedatt01
Builder

That didn't quite handle it, but I was able to get of rid of the spaces by changing my rex command a bit.

rex "\d*\:\d*:\d*\,\d*\|(?<event>[^\|]*)\|\s(?<subject>[^\|]*)\|\s(?<ip>[^\|]*)\|\s(?<app>[^\|]*)\|\s(?<connectionid>[^\|]*)\|\s(?<protocol>[^\|]*)\|\s(?<host>[^\|]*)\|\s(?<role>[^\|]*)\|\s(?<status>[^\|]*)\|\s(?<adapterid>[^\|]*)\|\s(?<description>[^\|]*)\|\s(?<responsetime>.\d*)"

Now I just need to be able to handle where myfield=""

Any ideas?

0 Karma
Get Updates on the Splunk Community!

Developer Spotlight with Brett Adams

In our third Spotlight feature, we're excited to shine a light on Brett—a Splunk consultant, innovative ...

Index This | What can you do to make 55,555 equal 500?

April 2025 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with this ...

Say goodbye to manually analyzing phishing and malware threats with Splunk Attack ...

In today’s evolving threat landscape, we understand you’re constantly bombarded with phishing and malware ...