Splunk Search

Search time field extraction not showing in available fields

Engager

I am attempting to perform a search time field extraction via the rex command. I use the default field of _raw and give it a regex with named groups. None of my named groups are showing up as an available field to select from.

Essentially, I am parsing a custom apache access log:

An example of a line of data is:

9.999.999.999 9.999.999.9 xxxxxxxx  [17/Jun/2014:23:11:43 -0400] "GET /someapp/css/windows/default.css HTTP/1.1" 200 767 "protocol://www.ourserver.com/someapp/some.jsp?param=1&param2=a" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; .NET4.0C)"

The search I use is:

source=/issue.log| rex "(?:[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+, )?(?<forwardedforip>[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+|\-) (?<remoteip>[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+) (?<userid>\S+|\-)[ ]+\[(?<day>\d+)/(?<month>\w+)/(?<year>\d+):(?<hour>\d+):(?<minute>\d+):(?<second>\d+) (<?timezone>-\d+)] \"(?<action>\w+) (?<url>.*?)(?<parameters>\?.*?)? (?<httpversion>\S+)\" (?<httpstatus>\d+) (?<responsesize>\d+|\-) \"(?<refererurl>.*?)\" \"(?<useragent>.*?)\""

Any ideas why my named groups are not showing up? This regex works without the named groups in regex testing apps. I just cannot get it to be recognized by Splunk.

thanks!

Tags (2)
0 Karma
1 Solution

SplunkTrust
SplunkTrust

Try this

I believe you just misplaced one '?' for the timezone field extraction. Remaining thing works.

source=/issue.log | rex "(?:[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+, )?(?<forwardedforip>[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+|\-) (?<remoteip>[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+) (?<userid>\S+|\-)[ ]+\[(?<day>\d+)/(?<month>\w+)/(?<year>\d+):(?<hour>\d+):(?<minute>\d+):(?<second>\d+) (?<timezone>-\d+)] \"(?<action>\w+) (?<url>.*?)(?<parameters>\?.*?)? (?<httpversion>\S+)\" (?<httpstatus>\d+) (?<responsesize>\d+|\-) \"(?<refererurl>.*?)\" \"(?<useragent>.*?)\""

View solution in original post

SplunkTrust
SplunkTrust

Try this

I believe you just misplaced one '?' for the timezone field extraction. Remaining thing works.

source=/issue.log | rex "(?:[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+, )?(?<forwardedforip>[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+|\-) (?<remoteip>[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+) (?<userid>\S+|\-)[ ]+\[(?<day>\d+)/(?<month>\w+)/(?<year>\d+):(?<hour>\d+):(?<minute>\d+):(?<second>\d+) (?<timezone>-\d+)] \"(?<action>\w+) (?<url>.*?)(?<parameters>\?.*?)? (?<httpversion>\S+)\" (?<httpstatus>\d+) (?<responsesize>\d+|\-) \"(?<refererurl>.*?)\" \"(?<useragent>.*?)\""

View solution in original post

Engager

That did it. You know, I looked at this over and over thinking it was something like this and kept missing it.

Thank you!

0 Karma

Engager

However, if I look at a specific field, Apache_Request, it works!

source=/issue.log| rex field="Apache_Request" "(?<action>\w+) (?<url>.*?)(?<parameters>\?.*?)? (?<httpversion>\S+)"
0 Karma