We use a custom access log format which, as far as I can tell, matches access-extractions except that it has a preceding IP.
By copying access-extractions, I assumed that this is what I need in my transforms.conf:
[custom_access-extractions]
REGEX = ^[[nspaces:true_client_ip]]\s++[[nspaces:clientip]]\s++[[nspaces:ident]]\s++[[nspaces:user]]\s++[[sbstring:req_time]]\s++[[access-request]]\s++[[nspaces:status]]\s++[[nspaces:bytes]](?:\s++"(?<referer>[[bc_domain:referer_]]?+[^"]*+)"(?:\s++[[qstring:useragent]](?:\s++[[qstring:cookie]])?+)?+)?[[all:other]]
Notice the preceding true_client_ip.
When I step through this with rex on the Web UI search box, it seems to fall apart after the 'bytes'. I tried escaping the quotes and removing the (?:.
I'm guessing someone else has run into this already...So...anyone know what I'm doing wrong such that this isn't working?
On the command line, Splunk sometimes chokes when there are quotation marks in the regular expression. Not because of the regular expression, but because the command line parser is just not that smart.
Put a backslash (\) in front of the quotation marks, and see if that helps. And it shouldn't hurt the regular expression, either. Also, you probably do need the (?: - otherwise Splunk may think that you want to create a capture group - the ?: means that you don't want to capture. If you tried each of these edits independently, then sorry for repeating what you have already done...
Or avoid the command line: set up the input on a test instance somewhere. This is probably best.
Just a suggestion - I would not mix dashes (-) and underscores (_) in the same stanza name (custom_access-extractions). It tends to confuse some people (me) a lot.
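To make the point about (?: concrete, here is a minimal sketch in Python rather than Splunk. It assumes a made-up log tail (status, bytes, optional quoted referer) and simplified patterns, not the real access-extractions regex. Note that Python's re module writes named groups as (?P<name>...) where Splunk's PCRE uses (?<name>...).

```python
import re

# Hypothetical tail of an access log line: status, bytes, optional quoted referer.
line = '200 1043 "http://example.com/start"'

# (?: ... ) makes the referer part optional without adding a capture group of its own.
non_capturing = re.compile(
    r'(?P<status>\S+)\s+(?P<bytes>\S+)(?:\s+"(?P<referer>[^"]*)")?'
)

# Plain ( ... ) groups the same text but also creates an extra, unnamed capture group.
capturing = re.compile(
    r'(?P<status>\S+)\s+(?P<bytes>\S+)(\s+"(?P<referer>[^"]*)")?'
)

m1 = non_capturing.match(line)
m2 = capturing.match(line)

print(m1.groupdict())    # the three named fields
print(len(m1.groups()))  # 3 groups: status, bytes, referer
print(len(m2.groups()))  # 4 groups: the unnamed wrapper is captured too
```

Both patterns match the same text. The only difference is the extra unnamed group, which is why dropping the ?: is unlikely to fix a pattern that already fails to match.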
Thanks - yeah, that's what I had done, but it wasn't working. Regardless, I finally got it working once I noticed an extra field in the middle of the log, which was causing my failures.
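For anyone hitting the same thing, this kind of problem can be narrowed down by growing the pattern one field at a time and noting where it first stops matching. Below is a rough sketch in Python, not Splunk. The field names follow my stanza, but the patterns are simplified stand-ins for the [[nspaces:...]] modular regexes, and the sample lines (including the extra 0.012 field) are made up.

```python
import re

# Ordered pieces of a simplified access-log pattern (stand-ins, not Splunk's own).
PIECES = [
    ("true_client_ip", r"(?P<true_client_ip>\S+)"),
    ("clientip", r"\s+(?P<clientip>\S+)"),
    ("ident", r"\s+(?P<ident>\S+)"),
    ("user", r"\s+(?P<user>\S+)"),
    ("req_time", r"\s+\[(?P<req_time>[^\]]*)\]"),
    ("request", r'\s+"(?P<request>[^"]*)"'),
    ("status", r"\s+(?P<status>\d{3})"),
    ("bytes", r"\s+(?P<bytes>\d+|-)"),
]


def first_failing_piece(line, pieces=PIECES):
    """Return the name of the first piece whose addition breaks the match, or None."""
    pattern = "^"
    for name, fragment in pieces:
        pattern += fragment
        if re.match(pattern, line) is None:
            return name
    return None


# Made-up sample lines: the second has an unexpected field after the timestamp.
good = ('203.0.113.7 10.0.0.5 - alice [10/Oct/2023:13:55:36 +0000] '
        '"GET /index.html HTTP/1.1" 200 1043')
extra = ('203.0.113.7 10.0.0.5 - alice [10/Oct/2023:13:55:36 +0000] 0.012 '
         '"GET /index.html HTTP/1.1" 200 1043')

print(first_failing_piece(good))   # None
print(first_failing_piece(extra))  # request
```

The piece reported is the first one that cannot match, so the unexpected field sits just before it in the log line.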
Funny, I should have said "Splunk search box", too. Same comments apply for the search box. I looked it up, and you should escape quotation marks in the search box by using the backslash, as in \"
Thanks for the tips.
When I mentioned "I tried escaping the quotes..." I meant I tried putting a backslash in front of them.
When I said "command line" I meant the Splunk search box in the web UI. Not sure why I put the wrong thing. 😞