Splunk Dev

Can you help me extract fields from apache:access logs?

mrtolu6
Path Finder

Regex experts!
I need help extracting the src, http_method, uri_path, and status fields.

Below is an example log line, followed by the fields I would like to extract:

"10.10.10.22 - - [12/Oct/2012:14:22:41 -0400] "GET /etc/team/transport/tRoom?serlet=jpsSSGenerator HTTP/1.1" 200 26494"
src=10.10.10.22, http_method=GET, uri_path=/etc/team/transport/tRoom?serlet=jpsSSGenerator

These are examples of the different kinds of entries that come from my Apache access logs. I'm looking for a regex that can extract the fields from all of them. Thanks in advance for any help.

Example logs:

127.0.0.1 - - [104/Oct/2018:11:22:47 -0700] "GET /directory/directory/test?seat=ShowPage&tese=calendar.js&IP=444.444.1.444 HTTP/1.1" 304 - "htttps://testwebsite.com/m/see/yup/Union?selpt=stepReportFilter.jsp" "Mozilla/5.0 (Windows NT 6.1; Trident/7.0; rv:11.0) like Gecko"

10.10.10.10 - - [10/Oct/2018:11:22:47 -0700] "POST /nba/nfl/nhl/ufc HTTP/1.1" 200 470 "-" "Mozilla/4.0 (Windows 8.1 6.3) Java/1.2.0_181" "10.10.10.02"

dnsname..cod.blackops.com:80 10.10.10.02 - - [16/Oct/2018:11:22:22 -0700] "GET /scripts/form_registry.js HTTP/1.1" 200 2504 "htttp://10.10.10.03lnba/cruisehtml?&swf_version=ezboard052614_1&serverUrl=110.10.10.03&boardId=19-153970030&isPreview=0&update052109=1" "Mozilla/5.0 (Windows NT 6.1; Trident/2.0; rv:11.0) like Gecko"

10.10.10.02 - - [12/Oct/2018:13:22:41 -0500] "POST /yup/zillow/server.php?a=c7355 HTTP/1.1" 200 - "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/515.2 (KHTML, like Gecko) Chrome/15.0.200.200 Safari/535.2"

10.10.10.02 - - [11/Oct/2018:13:22:41 -0500] "POST /yup/zillow/server.php?a=c7355 HTTP/1.1" 200 - 

10.10.10.22 - - [12/Oct/2012:14:22:41 -0400] "GET /etc/team/transport/tRoom?serlet=jpsSSGenerator HTTP/1.1" 200 26494
0 Karma
1 Solution

sudosplunk
Motivator

Hi @mrtolu6,

Give this regex a try:

your base search | rex field=_raw "(?<src>\d+\.\d+\.\d+\.\d+).+\]\s\"(?<http_method>\w+)\s(?<uri_path>.+)\"\s(?<status>\d+)"

Tested the regex here.
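To see what this pattern captures, here is a minimal Python sketch that runs it against the sample line from the question. It is only an illustration, not part of the original answer: Python's `re` module writes named groups as `(?P<name>...)` where rex accepts `(?<name>...)`, and the names `FIRST_PATTERN`, `SAMPLE`, and `extract_first` are made up for this example.

```python
import re

# Same pattern as the rex answer, with named groups respelled for Python's `re`.
FIRST_PATTERN = re.compile(
    r'(?P<src>\d+\.\d+\.\d+\.\d+).+\]\s"(?P<http_method>\w+)\s'
    r'(?P<uri_path>.+)"\s(?P<status>\d+)'
)

# Sample line copied from the question.
SAMPLE = (
    '10.10.10.22 - - [12/Oct/2012:14:22:41 -0400] '
    '"GET /etc/team/transport/tRoom?serlet=jpsSSGenerator HTTP/1.1" 200 26494'
)


def extract_first(line):
    """Return the named groups as a dict, or None when the line does not match."""
    match = FIRST_PATTERN.search(line)
    return match.groupdict() if match else None


fields = extract_first(SAMPLE)
print(fields)
```

Because `uri_path` is captured with a greedy `.+` that runs up to the closing quote, the value also carries the trailing protocol token (` HTTP/1.1`) along with the query string.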

adonio
Ultra Champion

Why not use the pretrained sourcetype access_combined?

See here:
https://docs.splunk.com/Documentation/Splunk/7.2.0/Data/Listofpretrainedsourcetypes

0 Karma

mrtolu6
Path Finder

That worked, but it adds extra details to the uri_path field. I would also like an additional field called uri_query that holds anything after the "?", and a field called version for the HTTP/1.1 part.

For example
10.10.10.04 - - [07/Oct/23:08:30:59 -0400] "POST /OndnForm/drag_Form?images/ HTTP/1.1" 400 226 "-" "Hello, People"

uri_query=images/
version=HTTP/1.1
src=10.10.10.04
status=400
bytes=226

0 Karma

sudosplunk
Motivator

Try this:

your base search | rex field=_raw "(?<src>\d+\.\d+\.\d+\.\d+).+\]\s\"(?<http_method>\w+)\s(?<uri_path>\S+)\s(?<uri_query>\S+)\"\s(?<status>\d+)\s(?<bytes>[\d-]+)"

Updated regex: https://regex101.com/r/CpQ56P/2
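As written, the pattern above puts the protocol token in uri_query and leaves the query string inside uri_path. One possible variant that instead splits the request into uri_path, uri_query (the text after the "?"), and version is sketched below in Python. This is an assumption about what the question asks for, not the posted solution, and the names `SPLIT_PATTERN` and `extract_split` are hypothetical. For rex, the groups would be written as `(?<name>...)` and the literal double quotes escaped as `\"` inside the quoted string.

```python
import re

# Hypothetical variant: the path stops at "?" or whitespace, the query is optional,
# and the protocol token before the closing quote goes into `version`.
SPLIT_PATTERN = re.compile(
    r'(?P<src>\d+\.\d+\.\d+\.\d+).+?\]\s"(?P<http_method>\w+)\s'
    r'(?P<uri_path>[^?\s"]+)(?:\?(?P<uri_query>[^\s"]*))?\s'
    r'(?P<version>[^"]+)"\s(?P<status>\d+)\s(?P<bytes>\d+|-)'
)


def extract_split(line):
    """Return the named groups as a dict, or None when the line does not match."""
    match = SPLIT_PATTERN.search(line)
    return match.groupdict() if match else None


# Sample line copied from the follow-up question.
with_query = extract_split(
    '10.10.10.04 - - [07/Oct/23:08:30:59 -0400] '
    '"POST /OndnForm/drag_Form?images/ HTTP/1.1" 400 226 "-" "Hello, People"'
)
# Sample line from the original question; it has no query string.
without_query = extract_split(
    '10.10.10.10 - - [10/Oct/2018:11:22:47 -0700] '
    '"POST /nba/nfl/nhl/ufc HTTP/1.1" 200 470 "-"'
)
print(with_query)
print(without_query)
```

When a request has no "?", the optional group does not take part in the match, so `uri_query` comes back as `None` in Python; in Splunk the field would simply not be created for that event.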

0 Karma

mrtolu6
Path Finder

thanks for your help!

0 Karma