Can you help me extract fields from apache:access logs?

mrtolu6
Path Finder

Regex experts!
I need help extracting the src, http_method, uri_path, and status fields.

Below is an example of a log line with the fields that I would like to extract:

"10.10.10.22 - - [12/Oct/2012:14:22:41 -0400] "GET /etc/team/transport/tRoom?serlet=jpsSSGenerator HTTP/1.1" 200 26494"
src=10.10.10.22, http_method=GET, uri_path=/etc/team/transport/tRoom?serlet=jpsSSGenerator

These are examples of the different types of log lines that come from our Apache access logs. I'm looking for a regex that can extract the fields from the examples below. Thanks in advance for any help.

Example logs:

127.0.0.1 - - [104/Oct/2018:11:22:47 -0700] "GET /directory/directory/test?seat=ShowPage&tese=calendar.js&IP=444.444.1.444 HTTP/1.1" 304 - "htttps://testwebsite.com/m/see/yup/Union?selpt=stepReportFilter.jsp" "Mozilla/5.0 (Windows NT 6.1; Trident/7.0; rv:11.0) like Gecko"

10.10.10.10 - - [10/Oct/2018:11:22:47 -0700] "POST /nba/nfl/nhl/ufc HTTP/1.1" 200 470 "-" "Mozilla/4.0 (Windows 8.1 6.3) Java/1.2.0_181" "10.10.10.02"

dnsname..cod.blackops.com:80 10.10.10.02 - - [16/Oct/2018:11:22:22 -0700] "GET /scripts/form_registry.js HTTP/1.1" 200 2504 "htttp://10.10.10.03lnba/cruisehtml?&swf_version=ezboard052614_1&serverUrl=110.10.10.03&boardId=19-153970030&isPreview=0&update052109=1" "Mozilla/5.0 (Windows NT 6.1; Trident/2.0; rv:11.0) like Gecko"

10.10.10.02 - - [12/Oct/2018:13:22:41 -0500] "POST /yup/zillow/server.php?a=c7355 HTTP/1.1" 200 - "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/515.2 (KHTML, like Gecko) Chrome/15.0.200.200 Safari/535.2"

10.10.10.02 - - [11/Oct/2018:13:22:41 -0500] "POST /yup/zillow/server.php?a=c7355 HTTP/1.1" 200 - 

10.10.10.22 - - [12/Oct/2012:14:22:41 -0400] "GET /etc/team/transport/tRoom?serlet=jpsSSGenerator HTTP/1.1" 200 26494
1 Solution

sudosplunk
Motivator

Hi @mrtolu6,

Give this regex a try: your base search | rex field=_raw "(?<src>\d+\.\d+\.\d+\.\d+).+\]\s\"(?<http_method>\w+)\s(?<uri_path>.+)\"\s(?<status>\d+)"

Tested the regex here.
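
For anyone who wants to check the pattern outside Splunk, here is a minimal sketch in Python that runs it against two of the sample events from the question. It is not part of the original answer. The only change to the regex is the group syntax: Python's re module needs (?P<name>...) where rex takes (?<name>...). The helper name extract is made up for illustration.

```python
import re

# Python port of the rex pattern above. Python's re module needs
# (?P<name>...) for named groups; the rest of the pattern is unchanged.
PATTERN = re.compile(
    r'(?P<src>\d+\.\d+\.\d+\.\d+).+\]\s"(?P<http_method>\w+)\s'
    r'(?P<uri_path>.+)"\s(?P<status>\d+)'
)

# Two of the sample events from the question.
SAMPLES = [
    '10.10.10.22 - - [12/Oct/2012:14:22:41 -0400] '
    '"GET /etc/team/transport/tRoom?serlet=jpsSSGenerator HTTP/1.1" 200 26494',
    '10.10.10.10 - - [10/Oct/2018:11:22:47 -0700] "POST /nba/nfl/nhl/ufc HTTP/1.1" '
    '200 470 "-" "Mozilla/4.0 (Windows 8.1 6.3) Java/1.2.0_181" "10.10.10.02"',
]


def extract(line):
    """Return the named groups as a dict, or None if the line does not match."""
    match = PATTERN.search(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    for line in SAMPLES:
        print(extract(line))
```

On these samples src, http_method, and status come out as expected. Because (?<uri_path>.+) is greedy and only stops at the closing quote of the request, uri_path also picks up the trailing protocol, for example "/nba/nfl/nhl/ufc HTTP/1.1".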

adonio
Ultra Champion

Why not use the pre-built sourcetype access_combined?

See here:
https://docs.splunk.com/Documentation/Splunk/7.2.0/Data/Listofpretrainedsourcetypes

mrtolu6
Path Finder

That worked, but it adds extra details to the uri_path field. I would also like to create an additional field called uri_query that holds anything after the "?", and a field called version for the HTTP/1.1 part.

For example:
10.10.10.04 - - [07/Oct/23:08:30:59 -0400] "POST /OndnForm/drag_Form?images/ HTTP/1.1" 400 226 "-" "Hello, People"

uri_query=images/
version=HTTP/1.1
src=10.10.10.04
status=400
bytes=226

sudosplunk
Motivator

Try this:

your base search | rex field=_raw "(?<src>\d+\.\d+\.\d+\.\d+).+\]\s\"(?<http_method>\w+)\s(?<uri_path>\S+)\s(?<uri_query>\S+)\"\s(?<status>\d+)\s(?<bytes>[\d-]+)"

Updated regex: https://regex101.com/r/CpQ56P/2
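
A note for later readers, sketched in Python so it can be run outside Splunk. As far as I can tell, in the pattern above (?<uri_path>\S+) takes the whole request target including the query string, and the group named uri_query lands on the protocol (HTTP/1.1). The second pattern below, SPLIT, is a hypothetical alternative that is not from the thread: it stops uri_path at the "?", makes the query string optional, and adds a version group. The names POSTED, SPLIT, EVENT, and extract are made up for illustration.

```python
import re

# The pattern posted above, ported to Python's (?P<name>...) group syntax.
POSTED = re.compile(
    r'(?P<src>\d+\.\d+\.\d+\.\d+).+\]\s"(?P<http_method>\w+)\s'
    r'(?P<uri_path>\S+)\s(?P<uri_query>\S+)"\s(?P<status>\d+)\s(?P<bytes>[\d-]+)'
)

# Hypothetical alternative (not from the thread): stop uri_path at "?",
# make the query string optional, and capture the protocol as version.
SPLIT = re.compile(
    r'(?P<src>\d+\.\d+\.\d+\.\d+).+?\]\s"(?P<http_method>\w+)\s'
    r'(?P<uri_path>[^\s?"]+)(?:\?(?P<uri_query>[^\s"]*))?\s'
    r'(?P<version>HTTP/[\d.]+)"\s(?P<status>\d+)\s(?P<bytes>\d+|-)'
)

# The example event from the follow-up question.
EVENT = ('10.10.10.04 - - [07/Oct/23:08:30:59 -0400] '
         '"POST /OndnForm/drag_Form?images/ HTTP/1.1" 400 226 "-" "Hello, People"')


def extract(pattern, line):
    """Return the named groups as a dict, or None if the line does not match."""
    match = pattern.search(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    print(extract(POSTED, EVENT))  # uri_query holds the protocol here
    print(extract(SPLIT, EVENT))   # path, query, and version are separate
```

For events without a query string, the uri_query group in SPLIT simply does not match (None in Python). To try SPLIT in rex, the double quotes inside the pattern would need escaping as \" as in the searches above. PCRE generally accepts the (?P<name>...) spelling as well, but I have not run this sketch in Splunk.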

mrtolu6
Path Finder

Thanks for your help!
