All Apps and Add-ons

Splunk_TA_tomcat - Extract has regex error?

quihong
Path Finder

Hello,

I'm working on ingesting some JIRA access logs which using Splunk_TA_tomcat. My field extractions aren't working and I'm trying to troubleshoot the issue.

Within the props.conf, I pulled out the regex extraction for the sourcetype tomcat:access:log and pasted it into regex101.com. However, regex101 came back with errors in the regex pattern - "An unescaped delimiter must be escaped with a backslash ()"

from props.conf:

EXTRACT-access = ^(?P<ip>(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)|(?:[A-F0-9]{1,4}:){7}[A-F0-9]{1,4})\s-\s(?P<user>(-|\w+))\s\[\d+/\w+/\d{4}:\d{2}:\d{2}:\d{2}\s[\+\-]\d{4}\]\s"(?P<method>[A-Z]{3,7})\s(?P<request_uri>[\S]+)\s(?P<protocol>[\w/\.]+)"\s(?P<status>\d{3})\s(?P<bytes_sent>(?:\d+|-))$

alt text
alt text

Here is a sample event:

10.2.104.195 576x4475468x5 userName [06/Mar/2018:09:36:20 -0800] "POST /jira/rest/analytics/1.0/publish/bulk HTTP/1.0" 200 - 6 "https://atlassian/jira/browse/ASSIGN-149353" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36" "etohye"

Wanted to know if the regex is faulty or if regex101 is wrong. Also an help on why my event isn't parsing correctly is appreciated.

0 Karma

8549151046
Engager

I know this answer is a bit old but I stumbled over this when trying to quickly find a simple regex for Tomcat instead of needing to install the TA.

In regards to this question, the regex is slightly incorrect. There are three "/" (forward-slashes) that need to be escaped, see below for the correct syntax. Your log file is also in the incorrect format and is likely from the wrong log source. The regex you shown is for tomcat_localhost_access logs. The log you posted looks to be httpd_access logs. Hopefully this might help someone else that stumbles over this in the future.

^(?P<ip>(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)|(?:[A-F0-9]{1,4}:){7}[A-F0-9]{1,4})\s-\s(?P<user>(-|\w+))\s\[\d+\/\w+\/\d{4}:\d{2}:\d{2}:\d{2}\s[\+\-]\d{4}\]\s"(?P<method>[A-Z]{3,7})\s(?P<request_uri>[\S]+)\s(?P<protocol>[\w\/\.]+)"\s(?P<status>\d{3})\s(?P<bytes_sent>(?:\d+|-))$
0 Karma
Get Updates on the Splunk Community!

Stay Connected: Your Guide to May Tech Talks, Office Hours, and Webinars!

Take a look below to explore our upcoming Community Office Hours, Tech Talks, and Webinars this month. This ...

They're back! Join the SplunkTrust and MVP at .conf24

With our highly anticipated annual conference, .conf, comes the fez-wearers you can trust! The SplunkTrust, as ...

Enterprise Security Content Update (ESCU) | New Releases

Last month, the Splunk Threat Research Team had two releases of new security content via the Enterprise ...