Splunk Search

Need help understanding how Transform "access-extractions" works

Kozanic
Path Finder

Hi to all that read this, Hoping one of you might be able to provide some assistance.

We have an app that is producing logs using Extended Common web format. Right now the source type we are using is linked to the access-extractions transform, but is not giving all the required fields.

I have tried a number of different approaches to get the required values using regex, but due to the nature of the logs, it feels like I might need a large number of regex entries to capture all variations.

After figuring out that we were using the access-extractions transform, I though a better approach would be to edit this to suit - however I'm still pretty new to regex and not really sure what the regex in this transform is actually doing or how it works.

A sample of the logs we are working with:

10.x.x.x www.blah.au - [20/Aug/2018:08:06:19 +1000] "GET /ebs/picmi/picmirepository.nsf/PICMI?OpenForm&t=PI&k=D&r=http%3A%2F%2Fwww.assediomoral.org%2Findex.php%2Fspip.php%3Farticle106 HTTP/1.1" 200 53245 "http://a.bla.es/?u=https%3A%2F%2Fwww.ebs.tga.gov.au%2Febs%2Fpicmi%2Fpicmirepository.nsf%2FPICMI%3FOpenForm%26t%3DPI%26k%3DD%26r%3Dhttp%253A%252F%252Fwww.assediomoral.org%252Findex.php%252Fspip.php%253Farticle106" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.189 Safari/537.36 Vivaldi/1.95.1077.60" 422 "" "d:/Lotus/Domino/data/ebs/picmi/picmirepository.nsf"

10.x.x.x www.blah.au "107831_67744" [20/Aug/2018:08:06:19 +1000] "GET /ebs/lm/lmdrafts.nsf/xAgentUpdateValidationMonitoring.xsp?documentId=7D35903C63DAEB54CA2582C000426C09&dojo.preventCache=1534716380650 HTTP/1.1" 200 78 "https://www.ebs.tga.gov.au/ebs/LM/LMDrafts.nsf/GenApp.xsp?documentId=7d35903c63daeb54ca2582c000426c09&action=editDocument" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/11.1.2 Safari/605.1.15" 31 "_ga=GA1.3.644697231.1517015993;
_gid=GA1.3.1541615874.1534641115; DomAuthSessId=A004127B4D088BDBD4B14B7E1BF0928B; WelcomeDialogLM=1; SessionID=9E1B7E03146C77042992C7B008ABB7DB303BC2AD" "d:/Lotus/Domino/data/ebs/lm/lmdrafts.nsf"

10.x.x.x www.blah.au - [20/Aug/2018:08:06:15 +1000] "GET /ebs/picmi/picmirepository.nsf/PICMI?OpenForm&t=PI&k=D&r=http%3A%2F%2Fwww2.ogs.state.ny.us%2Fhelp%2Furlstatusgo.html%3Furl%3Dhttp%253A%252F%252Fpedagogie.ac-toulouse.fr%252Feco-golfech%252Fspip.php%253Farticle129 HTTP/1.1" 200 53566 "https://www.apemsa.es/web/guest/analisis-de-agua/-/asset_publisher/7OQq/content/dureza?redirect=https%3A%2F%2Fwww.ebs.tga.gov.au%2Febs%2Fpicmi%2Fpicmirepository.nsf%2FPICMI%3FOpenForm%26t%3DPI%26k%3DD%26r%3Dhttp%253A%252F%252Fwww2.ogs.state.ny.us%252Fhelp%252Furlstatusgo.html%253Furl%253Dhttp%25253A%25252F%25252Fpedagogie.ac-toulouse.fr%25252Feco-golfech%25252Fspip.php%25253Farticle129" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.170 Safari/537.36,gzip(gfe)" 282 "" "d:/Lotus/Domino/data/ebs/picmi/picmirepository.nsf"

 10.x.x.x www.blah.au - [20/Aug/2018:08:06:15 +1000] "GET /ebs/picmi/picmirepository.nsf/PICMI?OpenForm&t=PI&k=D&r=http%3A%2F%2Fwww2.ogs.state.ny.us%2Fhelp%2Furlstatusgo.html%3Furl%3Dhttp%253A%252F%252Fpedagogie.ac-toulouse.fr%252Feco-golfech%252Fspip.php%253Farticle129 HTTP/1.1" 200 53566 "https://www.apemsa.es/web/guest/analisis-de-agua/-/asset_publisher/7OQq/content/dureza?redirect=https%3A%2F%2Fwww.ebs.tga.gov.au%2Febs%2Fpicmi%2Fpicmirepository.nsf%2FPICMI%3FOpenForm%26t%3DPI%26k%3DD%26r%3Dhttp%253A%252F%252Fwww2.ogs.state.ny.us%252Fhelp%252Furlstatusgo.html%253Furl%253Dhttp%25253A%25252F%25252Fpedagogie.ac-toulouse.fr%25252Feco-golfech%25252Fspip.php%25253Farticle129" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.170 Safari/537.36,gzip(gfe)" 282 "" "d:/Lotus/Domino/data/ebs/picmi/picmirepository.nsf"

The particular fields that we are after are the last 3 which represent the time to process, cookie header and translated URL.

Regex from access-extractions:

^[[nspaces:clientip]]\s++[[nspaces:ident]]\s++[[nspaces:user]]\s++[[sbstring:req_time]]\s++[[access-request]]\s++[[nspaces:status]]\s++[[nspaces:bytes]](?:\s++"(?<referer>[[bc_domain:referer_]]?+[^"]*+)"(?:\s++[[qstring:useragent]](?:\s++[[qstring:cookie]])?+)?+)?[[all:other]]

I'm assuming I need to update the last part of this "[[all:other]]" but have tried running this in GUI search box and in regex101, neither seem to be able to work with it so struggling to understand how to update correctly.

0 Karma
1 Solution

Kozanic
Path Finder
0 Karma

Kozanic
Path Finder
0 Karma
Did you miss .conf21 Virtual?

Good news! The event's keynotes and many of its breakout sessions are now available online, and still totally FREE!