Splunk Search

Custom Apache log REGEX

kubowler99
New Member

Splunk noob REGEX question.

I'm attempting to customize the REGEX for the ootb Apache extraction. I've got it working for the most part, but I'm unable to get it to parse the referer, useragent, and cookie from the logs. They all get parsed as the 'other' field.

REGEX: ^[[nspaces:clientip]]\s++[[nspaces:dummyip1]]\s++[[nspaces:ident]]\s++[[nspaces:user]]\s++[[sbstring:req_time]]\s++[[access-request]]\s++[[nspaces:status]]\s++[nspaces:bytes]?[[all:other]]

Log Sample:

208.20.251.27 205.141.201.135 - - [03/Jan/2012:09:14:59 -0600] "POST /web/member/webflow.sf HTTP/1.1" 200 7726 "https://oururl.com/web/member/loginWebflow.sf?_flowExecutionKey=_c79820C50-FDCF-6ECA-7444-A686DB586C..." "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; InfoPath.2)" "__utma=242832726.1910715443.1325603689.1325603689.1325603689.1; __utmb=242832726.1.10.1325603689; __utmc=242832726; __utmz=242832726.1325603689.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); mbox=check#true#1325603751|session#1325603690349-869173#1325605551; JSESSIONID=0000LoZUujL4bBUy5HkrMitepvr:16i215taj; BIGipServer_MemberFront_DMZ_WAS_prodmember=2798226893.17440.0000; __utmv=1.DIV%3D%3ASEG%3D; __utma=1.1819453781.1325603691.1325603691.1325603691.1; __utmb=1; __utmc=1; __utmz=1.1325603691.1.1.utmccn=(direct)|utmcsr=(direct)|utmcmd=(none); remainingTime=null"

Any assistance would be greatly appreciated.

Tags (2)
0 Karma

RubenOlsen
Path Finder

Given your example data is in the form of key=value, you probably do not need to create field extractions for these values as Splunk will do this automatically.

However, if you can amend the logging of the utm-keys to enclose the values with " (i.e. utma="242832726.1910715443.1325603689.1325603689.1325603689.1") - then you really do not need to create any explicit field extractions as this will ensure that the complete value is used.

0 Karma

lguinn2
Legend

When you set up the File or Directory input, under More Settings, using the "choose from list" option or the "manual" option to set the source type. I suggest that you use

access_combined_wcookie

This is a pre-existing Splunk sourcetype, with field extractions for Apache data. I have used access_combined a bunch, and it definitely extracts the referer and useragent fields (along with clientip, status, uri, etc.)
There are three built-in choices for Apache: access_combined_wcookie, access_combined and access_common.
Even if none of these is an exact match, you can set the sourcetype to the best fit - and then just do the additional field extractions that you need.

0 Karma
Get Updates on the Splunk Community!

Customer Experience | Join the Customer Advisory Board!

Are you ready to take your Splunk journey to the next level? 🚀 We invite you to join our elite squad ...

Observability Cloud | AWS PrivateLink Enabled for Splunk Observability Cloud

We’ve enabled AWS PrivateLink for Observability Cloud, giving you an additional inbound connection to send ...

Index This | A sphere has three, a circle has two, and a point has zero. What is it?

September 2023 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...