A sample row that I want to parse:
<134>Feb 2 07:06:48 github-intuit-com github_access: 10.168.0.5 - - [02/Feb/2015:07:06:48 -0800] "GET /_sockets/NDpjMDc2NTQ2ZjdjNjZmNzI2NmUxMjZkNDIzNmY3M2IxZTo0ZGYyYTQzZmIxZWFmMjVmZjViZjJjYjY0M2U3OGRiMzQzYjk0NGRlNjk0M2MzNjViNWFkN2FlOTViNTE4Zjhh--eed5cdc3ed46a7fb25bcaae2f1a1f68b48758e4c HTTP/1.1" 200 13 "https://github.intuit.com/CTO-FDS-AppOps/chef-repo-prod/pull/960" "Mozilla/5.0 (X11; Linux x86_64; rv:35.0) Gecko/20100101 Firefox/35.0" "172.17.109.193" 0.012 0.012 .
I tried taking the access_extraction from the default transforms.conf, but the extract fields page says it's an invalid regex with multiple repeats.
How do I parse this line? I really need the values at the end: the quoted IP address and the two numbers after it, which are response times.
From your example:
rex " \"(?<ip>(?:\d{1,3}\.){3}\d{1,3})\"\s(?<first>\d+\.\d+)\s(?<second>\d+\.\d+)\s\.$"
Let's break it down.
\" match a literal double quote (the pattern has a space in front of it)
(?<ip> create a new capture group named ip
(?: create a new non-capturing group
\d{1,3}\.){3} match 3 occurrences of 1 to 3 digits followed by a literal .
\d{1,3}) match one occurrence of 1 to 3 digits and close the ip group
\"\s match a literal double quote and a whitespace character (here, a space)
(?<first> create a new capture group named first
\d+\.\d+) capture one or more digits, a literal ., then one or more digits
\s match a whitespace character
(?<second>\d+\.\d+) same as before, captured as second
\s\.$ match a whitespace character, a literal . and the end of line ($)
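To check the pattern outside Splunk, the same expression can be tried in Python. This is only a sketch: Python's re module is used here as a stand-in for rex, the function name extract_timings is made up for illustration, and the request path in the sample row is shortened to /_sockets/abc to keep the line readable. The only change to the pattern is that the shell-style \" escapes are dropped, since the Python string uses single quotes.

```python
import re

# Same pattern as the rex command above; (?<name>...) also needs to be
# written as (?P<name>...) on Python versions that lack the short form,
# so the (?P<...>) spelling is used here.
PATTERN = re.compile(
    r' "(?P<ip>(?:\d{1,3}\.){3}\d{1,3})"'  # quoted IPv4-looking address
    r'\s(?P<first>\d+\.\d+)'               # first response time
    r'\s(?P<second>\d+\.\d+)'              # second response time
    r'\s\.$'                               # trailing " ." at end of line
)

# Sample row from the question, with the long request path shortened.
SAMPLE = (
    '<134>Feb 2 07:06:48 github-intuit-com github_access: 10.168.0.5 - - '
    '[02/Feb/2015:07:06:48 -0800] "GET /_sockets/abc HTTP/1.1" 200 13 '
    '"https://github.intuit.com/CTO-FDS-AppOps/chef-repo-prod/pull/960" '
    '"Mozilla/5.0 (X11; Linux x86_64; rv:35.0) Gecko/20100101 Firefox/35.0" '
    '"172.17.109.193" 0.012 0.012 .'
)


def extract_timings(line):
    """Return the ip/first/second fields as a dict, or None if no match."""
    match = PATTERN.search(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    print(extract_timings(SAMPLE))
```

Because the pattern is anchored with $ and requires the trailing " .", rows that end differently will not match and the function returns None for them.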
I tried this, but it did not work:
(?i)(?P\w+\s+\d+\s+\d+:\d+:\d+)\s+(?P[^ ]+) (?P[^:]+):\s+(?P\d*.\d*.\d*.\d*)\s+(?P.[^ ])\s+(?P.[^ ])\s+(?P[.])\s+"(?P.)”