How to edit my single regex for parsing multiple types of events in the same sourcetype?

Builder

Hi All,

I want a single regex that handles the several event formats that appear in my access logs. I have written the following regex to extract fields from my access.log:

^(?P[^ ]+)\s+(?P[^ ]+)\s+(?P[^ ]+)\s+\[(?P[^\]]+)[^ \n]*\s+url="(?P[^ ]+)\s+(?P[^ ]+)\s+(?P[^"]+)"\s+\|status=(?P[^ ]+)\s+\|size=(?P[^ ]+)\s+\|resp_time=(?P[^ ]+)\|\sreferer="(?P[^"]+)"\suser_agent="(?P[^"]+)"

Most events match this regex, but 4 events show up as "non-matches". Examples of both kinds of event follow.

Matching:

10.0.0.76 - - [20/Apr/2016:16:41:50 +0000] url="GET /dh/en-US/account/login HTTP/1.1" |status=302 |size=323 |resp_time=176| referer="-" user_agent="Python-httplib2/0.7.0 (gzip)"

Non-matching:

37.28.152.58 - - [19/Apr/2016:20:51:58 +0000] url="myversion|3.6 Public" |status=400 |size=312 |resp_time=125| referer="-" user_agent="-" 

106.184.4.52 - - [17/Apr/2016:18:19:27 +0000] url="SSH-2.0-LYGhost_1.2.7-20100630" |status=302 |size=299 |resp_time=129| referer="-" user_agent="-"

In the non-matching events above, the fields http_method and protocol are missing. Is there a way to add conditions to the regex so that the same regex works for both kinds of event? Please help.

Thanks
PG

Legend

You can have multiple field extractions for the same sourcetype, no problem. For each event, Splunk will attempt to apply all the regular expressions, and will use all of them that match.
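
The "apply every extraction, keep whatever matches" behaviour can be pictured with the Python sketch below. It only illustrates the idea and is not how Splunk implements it. The two patterns and the field names uri, status, size and resp_time are assumptions invented for the example; http_method and protocol are the fields named in the question.

```python
import re

# Hypothetical extractions; each one is small and independent of the other.
EXTRACTIONS = [
    # Fields that appear in every sample event.
    re.compile(r'\|status=(?P<status>\d+)\s+\|size=(?P<size>\d+)\s+\|resp_time=(?P<resp_time>\d+)\|'),
    # Fields that appear only when url holds a well-formed HTTP request line.
    re.compile(r'url="(?P<http_method>[A-Z]+)\s+(?P<uri>\S+)\s+(?P<protocol>HTTP/[\d.]+)"'),
]


def extract_fields(event):
    """Try every pattern and merge the fields from those that match."""
    fields = {}
    for pattern in EXTRACTIONS:
        match = pattern.search(event)  # search() is unanchored
        if match:
            fields.update(match.groupdict())
    return fields


http_event = '10.0.0.76 - - [20/Apr/2016:16:41:50 +0000] url="GET /dh/en-US/account/login HTTP/1.1" |status=302 |size=323 |resp_time=176| referer="-" user_agent="Python-httplib2/0.7.0 (gzip)"'
odd_event = '37.28.152.58 - - [19/Apr/2016:20:51:58 +0000] url="myversion|3.6 Public" |status=400 |size=312 |resp_time=125| referer="-" user_agent="-"'

print(extract_fields(http_event))
print(extract_fields(odd_event))
```

With this split, the event whose url is not a normal request line still yields status, size and resp_time; it simply has no http_method, uri or protocol.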

Also, regular expressions in Splunk are unanchored: a pattern can match anywhere in the event unless you add ^ or $ yourself.
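
As a small illustration of what unanchored means, here is a Python sketch that uses re.search, which is also unanchored. The status pattern is an assumption made up for the example, not one of the extractions from the question.

```python
import re

event = '37.28.152.58 - - [19/Apr/2016:20:51:58 +0000] url="myversion|3.6 Public" |status=400 |size=312 |resp_time=125| referer="-" user_agent="-"'

# Unanchored: the pattern may match anywhere in the event.
unanchored = re.search(r'status=(?P<status>\d+)', event)

# Anchored with ^: the pattern must match at the very start, so this fails.
anchored = re.search(r'^status=(?P<status>\d+)', event)

print(unanchored.group('status'))  # 400
print(anchored)                    # None
```

So a short pattern for one field does not need to describe everything that comes before it in the event.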

Finally, if you really want to use such a complex regular expression, I suggest that you use a regular expression tool to test it thoroughly.
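
If a single expression is still the goal, one possible shape is an alternation inside the url quotes, tested against all three sample events from the question. The sketch below is written in Python and is an assumption-laden illustration, not the original poster's regex: every group name except http_method and protocol is hypothetical (the names in the posted regex were lost), and raw_request is an invented catch-all for url values that are not a normal request line. The pattern uses Python's re.VERBOSE layout for readability, so the whitespace and comments would need to be removed before using it elsewhere.

```python
import re

ACCESS_RE = re.compile(r'''
    ^(?P<clientip>\S+)\s+(?P<ident>\S+)\s+(?P<user>\S+)\s+
    \[(?P<timestamp>[^\]]+)\]\s+
    url="(?:
        (?P<http_method>[A-Z]+)\s+(?P<uri>\S+)\s+(?P<protocol>HTTP/[\d.]+)  # normal request line
      |
        (?P<raw_request>[^"]*)                                              # anything else
    )"\s+
    \|status=(?P<status>\d+)\s+
    \|size=(?P<size>\d+)\s+
    \|resp_time=(?P<resp_time>\d+)\|\s+
    referer="(?P<referer>[^"]*)"\s+
    user_agent="(?P<user_agent>[^"]*)"
''', re.VERBOSE)

EVENTS = [
    '10.0.0.76 - - [20/Apr/2016:16:41:50 +0000] url="GET /dh/en-US/account/login HTTP/1.1" |status=302 |size=323 |resp_time=176| referer="-" user_agent="Python-httplib2/0.7.0 (gzip)"',
    '37.28.152.58 - - [19/Apr/2016:20:51:58 +0000] url="myversion|3.6 Public" |status=400 |size=312 |resp_time=125| referer="-" user_agent="-"',
    '106.184.4.52 - - [17/Apr/2016:18:19:27 +0000] url="SSH-2.0-LYGhost_1.2.7-20100630" |status=302 |size=299 |resp_time=129| referer="-" user_agent="-"',
]

for event in EVENTS:
    match = ACCESS_RE.search(event)
    # Groups from the alternative that did not match come back as None.
    found = {k: v for k, v in match.groupdict().items() if v is not None}
    print(found)
```

The trade-off is the one raised in this thread: the single pattern covers all three events, but every added alternative makes it longer and harder to troubleshoot than a few short extractions.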

You might also want to read the manual entries for creating and maintaining search-time field extractions.

Legend

Why must this be a single regex?

A single regex this long is hard to understand, which makes it fragile and hard to maintain, not to mention hard to get right in the first place!

Builder

So how can this be achieved? Can you please guide me? Both kinds of event come from the same sourcetype, so how can we write different regexes for different events and apply them?

Legend

A single large regex will also be hard to troubleshoot.

Builder

Somehow the regex I copied above is not showing the capture group names. Pasting the regex again:

^(?P[^ ]+)\s+(?P[^ ]+)\s+(?P[^ ]+)\s+\[(?P[^\]]+)[^ \n]*\s+url="(?P[^ ]+)\s+(?P[^ ]+)\s+(?P[^"]+)"\s+\|status=(?P[^ ]+)\s+\|size=(?P[^ ]+)\s+\|resp_time=(?P[^ ]+)\|\sreferer="(?P[^"]+)"\suser_agent="(?P[^"]+)"\siPlanetDirectoryPro="(?P[^"]+)"