Splunk Search

Custom Apache log REGEX

kubowler99
New Member

Splunk noob REGEX question.

I'm attempting to customize the REGEX for the out-of-the-box Apache extraction. It works for the most part, but I can't get it to parse the referer, useragent, and cookie from the logs. All three end up in the 'other' field.

REGEX: ^[[nspaces:clientip]]\s++[[nspaces:dummyip1]]\s++[[nspaces:ident]]\s++[[nspaces:user]]\s++[[sbstring:req_time]]\s++[[access-request]]\s++[[nspaces:status]]\s++[nspaces:bytes]?[[all:other]]

Log Sample:

208.20.251.27 205.141.201.135 - - [03/Jan/2012:09:14:59 -0600] "POST /web/member/webflow.sf HTTP/1.1" 200 7726 "https://oururl.com/web/member/loginWebflow.sf?_flowExecutionKey=_c79820C50-FDCF-6ECA-7444-A686DB586C..." "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; InfoPath.2)" "__utma=242832726.1910715443.1325603689.1325603689.1325603689.1; __utmb=242832726.1.10.1325603689; __utmc=242832726; __utmz=242832726.1325603689.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); mbox=check#true#1325603751|session#1325603690349-869173#1325605551; JSESSIONID=0000LoZUujL4bBUy5HkrMitepvr:16i215taj; BIGipServer_MemberFront_DMZ_WAS_prodmember=2798226893.17440.0000; __utmv=1.DIV%3D%3ASEG%3D; __utma=1.1819453781.1325603691.1325603691.1325603691.1; __utmb=1; __utmc=1; __utmz=1.1325603691.1.1.utmccn=(direct)|utmcsr=(direct)|utmcmd=(none); remainingTime=null"

Any assistance would be greatly appreciated.


RubenOlsen
Path Finder

Given your example data is in the form of key=value, you probably do not need to create field extractions for these values as Splunk will do this automatically.

However, if you can change the logging so that the values of the utm keys are enclosed in double quotes (e.g. utma="242832726.1910715443.1325603689.1325603689.1325603689.1"), then you really do not need any explicit field extractions, as the quotes ensure that the complete value is used.
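To illustrate the key=value structure of the cookie string, here is a rough Python sketch that splits it into pairs. It does not reproduce Splunk's automatic extraction; the function name `parse_cookie` and the shortened cookie string are assumptions made for this example only.

```python
def parse_cookie(cookie):
    """Split a raw cookie string into (key, value) tuples.

    Illustrative only: this is not how Splunk extracts fields.
    A list of tuples is used because the sample repeats keys
    such as __utma.
    """
    pairs = []
    for part in cookie.split(";"):
        part = part.strip()
        if not part:
            continue
        # Split on the first '=' only, so values that themselves
        # contain '=' (like __utmz) are kept whole.
        key, sep, value = part.partition("=")
        if sep:
            pairs.append((key, value))
    return pairs


# Shortened from the cookie in the log sample above.
cookie = (
    "__utmc=242832726; "
    "__utmz=242832726.1325603689.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); "
    "remainingTime=null"
)
for key, value in parse_cookie(cookie):
    print(key, "->", value)
```

Note that the __utmz value contains further `=` signs, which is the kind of value where a delimiter-based extraction may not capture everything you expect.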


lguinn2
Legend

When you set up the File or Directory input, under More Settings, use the "choose from list" option or the "manual" option to set the source type. I suggest that you use

access_combined_wcookie

This is a pre-existing Splunk sourcetype, with field extractions for Apache data. I have used access_combined a bunch, and it definitely extracts the referer and useragent fields (along with clientip, status, uri, etc.)
There are three built-in choices for Apache: access_combined_wcookie, access_combined and access_common.
Even if none of these is an exact match, you can set the sourcetype to the best fit - and then just do the additional field extractions that you need.
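If you want to check what an additional extraction for the three trailing quoted fields would need to match, a plain-regex sketch in Python may help. This is not Splunk's built-in transform and does not use its `[[name:field]]` modular syntax. The pattern, the `parse_line` helper, and the shortened sample line are assumptions for illustration.

```python
import re

# Hypothetical pattern: combined-style log with a second IP column,
# followed by optional quoted referer, useragent, and cookie.
LOG_PATTERN = re.compile(
    r'^(?P<clientip>\S+)\s+'
    r'(?P<dummyip1>\S+)\s+'
    r'(?P<ident>\S+)\s+'
    r'(?P<user>\S+)\s+'
    r'\[(?P<req_time>[^\]]+)\]\s+'
    r'"(?P<request>[^"]*)"\s+'
    r'(?P<status>\S+)\s+'
    r'(?P<bytes>\S+)'
    # Each quoted field is optional and nested, so a line may stop
    # after bytes, after referer, or after useragent.
    r'(?:\s+"(?P<referer>[^"]*)"'
    r'(?:\s+"(?P<useragent>[^"]*)"'
    r'(?:\s+"(?P<cookie>[^"]*)")?)?)?'
    r'(?P<other>.*)$'
)


def parse_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


# Shortened version of the log sample from the question.
sample = (
    '208.20.251.27 205.141.201.135 - - [03/Jan/2012:09:14:59 -0600] '
    '"POST /web/member/webflow.sf HTTP/1.1" 200 7726 '
    '"https://oururl.com/web/member/loginWebflow.sf" '
    '"Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1)" '
    '"__utmc=242832726; remainingTime=null"'
)
fields = parse_line(sample)
for name in ("referer", "useragent", "cookie", "other"):
    print(name, "=", repr(fields[name]))
```

The point is that the three quoted strings only land in their own fields when the pattern has explicit groups for them before the catch-all. With nothing between bytes and the catch-all, everything after bytes goes to 'other', which matches what the question describes.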
