Multicharacter Event Delimiter

jwhughes58
Contributor

All,

I have data that looks like this

event_timestamp | vendor_action | http_method | url | user_dn | src_ip | source | application | | protocol | field_11

Yes, the delimiter is space, pipe, space. The problem is that the rare events with a | in the url cause the original regex and delims to put values into the wrong fields. I wrote this regex:

^(.*)\s\|\s(.*)\s\|\s(.*)\s\|\s(.*)\s\|\s(.*)\s\|\s(.*)\s\|\s(.*)\s\|\s(.*)\s\|\s\|\s(.*)\s\|\s(.*)

The regex tool shows it extracting everything properly, but when I put it into Splunk 6.5.2 on my laptop, it doesn’t process the events. Does anyone have a solution for this problem?

TIA
Joe
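For anyone who wants to reproduce the problem outside Splunk, here is a minimal Python sketch of one way to keep a pipe inside the url from shifting the other fields. It assumes that only the url can contain a pipe and that the empty ninth column always appears as " | | ". The sample event and the names `EVENT_RE` and `parse_event` are made up for illustration and are not from the original data or from Splunk.

```python
import re

# Column names from the header in the question; the empty ninth column
# is matched literally as " | | " and is not captured.
FIELD_NAMES = [
    "event_timestamp", "vendor_action", "http_method", "url", "user_dn",
    "src_ip", "source", "application", "protocol", "field_11",
]

PLAIN = r"([^|]*)"  # a field that is assumed never to contain a pipe

EVENT_RE = re.compile(
    r"^" + r" \| ".join([PLAIN] * 3)   # event_timestamp, vendor_action, http_method
    + r" \| (.*) \| "                  # url: the only field allowed to contain pipes
    + r" \| ".join([PLAIN] * 4)        # user_dn, src_ip, source, application
    + r" \| \| "                       # the empty column
    + r" \| ".join([PLAIN] * 2)        # protocol, field_11
    + r"$"
)


def parse_event(line):
    """Return a dict of field name to value, or None if the line does not match."""
    match = EVENT_RE.match(line)
    if match is None:
        return None
    return dict(zip(FIELD_NAMES, match.groups()))


if __name__ == "__main__":
    # Hypothetical event with a pipe inside the url.
    sample = ("01/02/2017 10:11:12 | allow | GET | http://example.com/path?q=a|b"
              " | CN=joe | 10.0.0.1 | proxy_fw1 | webapp | | https | extra")
    print(parse_event(sample))
```

Because every field other than the url is forbidden from containing a pipe, the url is pinned between the third pipe from the left and the seventh pipe from the right, so there is only one way for the pattern to match.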

niketn
Legend

@jwhughes58, you can add the sample data to Splunk's Interactive Field Extractor and then select an unmatched event whose URL contains the delimiter, so that Splunk regenerates the required regex.

Check out the following Splunk documentation:
http://docs.splunk.com/Documentation/Splunk/latest/Knowledge/ExtractfieldsinteractivelywithIFX
http://docs.splunk.com/Documentation/Splunk/latest/Knowledge/FXSelectFieldsstep

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"

jwhughes58
Contributor

With kudos to @niketnilay, this is the final solution to my regex question:

^(\d+/\d+/\d+\s+\d+:\d+:\d+)[^\|\n]*\|\s+([^ ]+)[^\|\n]*\|\s+(\w+)(?:[^ \n]* ){2}([^ ]+)[^\|\n]*\|\s+([^ ]+)\s+\|\s+([^ ]+)[^\|\n]*\|\s+([a-z]+_[a-z]+\d+)[^\|\n]*\|\s+([^ ]+)\s+\|\s+\|\s+(\w+)\s\|\s(.*)

It is ugly as all get-out, but it works.
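A quick way to sanity-check the regex above outside Splunk is to run it through Python's `re` module. This is only a sketch: the sample event is invented to fit the header in the question, the names `FINAL_RE` and `extract` are hypothetical, and Python's regex engine is not guaranteed to behave identically to Splunk's in every case.

```python
import re

# The final regex from the post above, copied unchanged.
FINAL_RE = re.compile(
    r"^(\d+/\d+/\d+\s+\d+:\d+:\d+)[^\|\n]*\|\s+([^ ]+)[^\|\n]*\|\s+(\w+)"
    r"(?:[^ \n]* ){2}([^ ]+)[^\|\n]*\|\s+([^ ]+)\s+\|\s+([^ ]+)[^\|\n]*\|\s+"
    r"([a-z]+_[a-z]+\d+)[^\|\n]*\|\s+([^ ]+)\s+\|\s+\|\s+(\w+)\s\|\s(.*)"
)

# Hypothetical event (not real data) with a pipe inside the url.
SAMPLE = ("01/02/2017 10:11:12 | allow | GET | http://example.com/path?q=a|b"
          " | CN=joe | 10.0.0.1 | proxy_fw1 | webapp | | https | extra")


def extract(event):
    """Return the ten captured groups as a tuple, or None if there is no match."""
    match = FINAL_RE.match(event)
    return match.groups() if match else None


if __name__ == "__main__":
    for index, value in enumerate(extract(SAMPLE), start=1):
        print(index, value)
```

Note that the regex captures the url as a run of non-space characters, so it tolerates a pipe inside the url but would not match a url that itself contains a space.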

niketn
Legend

@jwhughes58... glad it worked 🙂

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"