Splunk Search

Need help understanding how Transform "access-extractions" works

Kozanic
Path Finder

Hi to all that read this, Hoping one of you might be able to provide some assistance.

We have an app that is producing logs using Extended Common web format. Right now the source type we are using is linked to the access-extractions transform, but is not giving all the required fields.

I have tried a number of different approaches to get the required values using regex, but due to the nature of the logs, it feels like I might need a large number of regex entries to capture all variations.

After figuring out that we were using the access-extractions transform, I though a better approach would be to edit this to suit - however I'm still pretty new to regex and not really sure what the regex in this transform is actually doing or how it works.

A sample of the logs we are working with:

10.x.x.x www.blah.au - [20/Aug/2018:08:06:19 +1000] "GET /ebs/picmi/picmirepository.nsf/PICMI?OpenForm&t=PI&k=D&r=http%3A%2F%2Fwww.assediomoral.org%2Findex.php%2Fspip.php%3Farticle106 HTTP/1.1" 200 53245 "http://a.bla.es/?u=https%3A%2F%2Fwww.ebs.tga.gov.au%2Febs%2Fpicmi%2Fpicmirepository.nsf%2FPICMI%3FOpenForm%26t%3DPI%26k%3DD%26r%3Dhttp%253A%252F%252Fwww.assediomoral.org%252Findex.php%252Fspip.php%253Farticle106" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.189 Safari/537.36 Vivaldi/1.95.1077.60" 422 "" "d:/Lotus/Domino/data/ebs/picmi/picmirepository.nsf"

10.x.x.x www.blah.au "107831_67744" [20/Aug/2018:08:06:19 +1000] "GET /ebs/lm/lmdrafts.nsf/xAgentUpdateValidationMonitoring.xsp?documentId=7D35903C63DAEB54CA2582C000426C09&dojo.preventCache=1534716380650 HTTP/1.1" 200 78 "https://www.ebs.tga.gov.au/ebs/LM/LMDrafts.nsf/GenApp.xsp?documentId=7d35903c63daeb54ca2582c000426c09&action=editDocument" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/11.1.2 Safari/605.1.15" 31 "_ga=GA1.3.644697231.1517015993;
_gid=GA1.3.1541615874.1534641115; DomAuthSessId=A004127B4D088BDBD4B14B7E1BF0928B; WelcomeDialogLM=1; SessionID=9E1B7E03146C77042992C7B008ABB7DB303BC2AD" "d:/Lotus/Domino/data/ebs/lm/lmdrafts.nsf"

10.x.x.x www.blah.au - [20/Aug/2018:08:06:15 +1000] "GET /ebs/picmi/picmirepository.nsf/PICMI?OpenForm&t=PI&k=D&r=http%3A%2F%2Fwww2.ogs.state.ny.us%2Fhelp%2Furlstatusgo.html%3Furl%3Dhttp%253A%252F%252Fpedagogie.ac-toulouse.fr%252Feco-golfech%252Fspip.php%253Farticle129 HTTP/1.1" 200 53566 "https://www.apemsa.es/web/guest/analisis-de-agua/-/asset_publisher/7OQq/content/dureza?redirect=https%3A%2F%2Fwww.ebs.tga.gov.au%2Febs%2Fpicmi%2Fpicmirepository.nsf%2FPICMI%3FOpenForm%26t%3DPI%26k%3DD%26r%3Dhttp%253A%252F%252Fwww2.ogs.state.ny.us%252Fhelp%252Furlstatusgo.html%253Furl%253Dhttp%25253A%25252F%25252Fpedagogie.ac-toulouse.fr%25252Feco-golfech%25252Fspip.php%25253Farticle129" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.170 Safari/537.36,gzip(gfe)" 282 "" "d:/Lotus/Domino/data/ebs/picmi/picmirepository.nsf"

 10.x.x.x www.blah.au - [20/Aug/2018:08:06:15 +1000] "GET /ebs/picmi/picmirepository.nsf/PICMI?OpenForm&t=PI&k=D&r=http%3A%2F%2Fwww2.ogs.state.ny.us%2Fhelp%2Furlstatusgo.html%3Furl%3Dhttp%253A%252F%252Fpedagogie.ac-toulouse.fr%252Feco-golfech%252Fspip.php%253Farticle129 HTTP/1.1" 200 53566 "https://www.apemsa.es/web/guest/analisis-de-agua/-/asset_publisher/7OQq/content/dureza?redirect=https%3A%2F%2Fwww.ebs.tga.gov.au%2Febs%2Fpicmi%2Fpicmirepository.nsf%2FPICMI%3FOpenForm%26t%3DPI%26k%3DD%26r%3Dhttp%253A%252F%252Fwww2.ogs.state.ny.us%252Fhelp%252Furlstatusgo.html%253Furl%253Dhttp%25253A%25252F%25252Fpedagogie.ac-toulouse.fr%25252Feco-golfech%25252Fspip.php%25253Farticle129" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.170 Safari/537.36,gzip(gfe)" 282 "" "d:/Lotus/Domino/data/ebs/picmi/picmirepository.nsf"

The particular fields that we are after are the last 3 which represent the time to process, cookie header and translated URL.

Regex from access-extractions:

^[[nspaces:clientip]]\s++[[nspaces:ident]]\s++[[nspaces:user]]\s++[[sbstring:req_time]]\s++[[access-request]]\s++[[nspaces:status]]\s++[[nspaces:bytes]](?:\s++"(?<referer>[[bc_domain:referer_]]?+[^"]*+)"(?:\s++[[qstring:useragent]](?:\s++[[qstring:cookie]])?+)?+)?[[all:other]]

I'm assuming I need to update the last part of this "[[all:other]]" but have tried running this in GUI search box and in regex101, neither seem to be able to work with it so struggling to understand how to update correctly.

0 Karma
1 Solution

Kozanic
Path Finder
0 Karma
Get Updates on the Splunk Community!

Introducing the 2024 SplunkTrust!

Hello, Splunk Community! We are beyond thrilled to announce our newest group of SplunkTrust members!  The ...

Introducing the 2024 Splunk MVPs!

We are excited to announce the 2024 cohort of the Splunk MVP program. Splunk MVPs are passionate members of ...

Splunk Custom Visualizations App End of Life

The Splunk Custom Visualizations apps End of Life for SimpleXML will reach end of support on Dec 21, 2024, ...