A sample row that I want to parse:
<134>Feb 2 07:06:48 github-intuit-com github_access: 10.168.0.5 - - [02/Feb/2015:07:06:48 -0800] "GET /_sockets/NDpjMDc2NTQ2ZjdjNjZmNzI2NmUxMjZkNDIzNmY3M2IxZTo0ZGYyYTQzZmIxZWFmMjVmZjViZjJjYjY0M2U3OGRiMzQzYjk0NGRlNjk0M2MzNjViNWFkN2FlOTViNTE4Zjhh--eed5cdc3ed46a7fb25bcaae2f1a1f68b48758e4c HTTP/1.1" 200 13 "https://github.intuit.com/CTO-FDS-AppOps/chef-repo-prod/pull/960" "Mozilla/5.0 (X11; Linux x86_64; rv:35.0) Gecko/20100101 Firefox/35.0" "172.17.109.193" 0.012 0.012 .
I tried taking the access_extraction from the default transforms.conf, but the Extract Fields page says it’s an invalid regex with multiple repeats.
How do I parse this line? I really need the values at the end: the IP address and the two numbers after it, which are response times.
From your example:
rex " \"(?<ip>(?:\d{1,3}\.){3}\d{1,3})\"\s(?<first>\d+\.\d+)\s(?<second>\d+\.\d+)\s\.$"
Let's break it down.
\"
match literal double quote
(?<ip>
create a new capture group named ip
(?:
create a new non-capturing group
\d{1,3}\.){3}
match 3 occurrences of 1 to 3 digits followed by a literal .
\d{1,3})
match one occurrence of 1 to 3 digits, then close the ip group
\"\s
match literal double quote and space
(?<first>
create a new capture group named first
\d+\.\d+)
capture one or more digits followed by a literal .
followed by one or more digits
\s
match literal space
(?<second>\d+\.\d+)
same as before, captured as second
\s\.$
match literal space, literal .
and end of line $
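If you want to check the pattern outside Splunk first, here is a minimal sketch in Python. It assumes Python's re module, which spells named groups as (?P<name>...) rather than (?<name>...). The names TAIL_PATTERN, sample_tail and extract_tail_fields are only illustrative, and the sample is just the tail of the row from the question.

```python
import re

# Same pattern as the rex above, rewritten with Python's (?P<name>...) syntax.
# The group names ip/first/second come from the rex; everything else is illustrative.
TAIL_PATTERN = re.compile(
    r' "(?P<ip>(?:\d{1,3}\.){3}\d{1,3})"\s'
    r'(?P<first>\d+\.\d+)\s'
    r'(?P<second>\d+\.\d+)\s\.$'
)

# Tail of the sample row from the question (the start of the line is omitted).
sample_tail = (
    '"Mozilla/5.0 (X11; Linux x86_64; rv:35.0) Gecko/20100101 Firefox/35.0" '
    '"172.17.109.193" 0.012 0.012 .'
)


def extract_tail_fields(line):
    """Return the ip/first/second fields as a dict, or None if there is no match."""
    match = TAIL_PATTERN.search(line)
    return match.groupdict() if match else None


fields = extract_tail_fields(sample_tail)
print(fields)
```

The captured values are strings, so convert first and second to numbers before doing any arithmetic on them.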
I tried this, but it did not work:
(?i)(?P\w+\s+\d+\s+\d+:\d+:\d+)\s+(?P[^ ]+) (?P[^:]+):\s+(?P\d*.\d*.\d*.\d*)\s+(?P.[^ ])\s+(?P.[^ ])\s+(?P[.])\s+"(?P.)”