Field extraction for custom waf log

b_chris21
Path Finder

Hello everyone,

I am struggling with extracting the fields of a custom WAF log file as there is no sourcetype that parses the fields correctly. My regex experience is very limited so any help would be appreciated.

The log output is:

*************************************************************************
Attack blocked, match (torro!234) detected from 1.2.3.4:55488. Time: 2021-03-28 09:09:08
Full request:
*************************************************************************
GET /waf-test-page.php?torro!234 HTTP/1.1
Host: 34.210.25.50
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:87.0) Gecko/20100101 Firefox/87.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
Cookie: PHPSESSID=e59oluljlvjkeef2ts3gphrt7g
Upgrade-Insecure-Requests: 1


*************************************************************************
Attack blocked, match (union+select) detected from 1.2.3.4:57280. Time: 2021-03-28 09:10:19
Full request:
*************************************************************************
POST /waf-test-page.php HTTP/1.1
Host: 34.210.25.50
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:87.0) Gecko/20100101 Firefox/87.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Content-Type: application/x-www-form-urlencoded
Content-Length: 40
Origin: http://34.210.25.50
Connection: keep-alive
Referer: http://34.210.25.50/waf-test-page.php
Cookie: PHPSESSID=e59oluljlvjkeef2ts3gphrt7g
Upgrade-Insecure-Requests: 1

os='+UNION+SELECT+1,2,3&php=&path=
*************************************************************************
Attack blocked, match (<script) detected from 1.2.3.4:53248. Time: 2021-03-28 09:12:38
Full request:
*************************************************************************
GET /waf-test-page.php?path=5"><script> HTTP/1.1
Host: 34.210.25.50
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:87.0) Gecko/20100101 Firefox/87.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
Cookie: PHPSESSID=e59oluljlvjkeef2ts3gphrt7g
Upgrade-Insecure-Requests: 1


*************************************************************************
Attack blocked, match (IP BLACKLISTED) detected from 1.2.3.4:56704. Time: 2021-03-28 09:19:02
Full request:
*************************************************************************
GET /waf-test-page.php?test_block_ip HTTP/1.1
Host: 34.210.25.50
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:87.0) Gecko/20100101 Firefox/87.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
Cookie: PHPSESSID=e59oluljlvjkeef2ts3gphrt7g
Upgrade-Insecure-Requests: 1


*************************************************************************

I would like to extract the following 3 fields:

alert
E.g. "Attack blocked, match (torro!234) detected from 1.2.3.4:55488. Time: 2021-03-28 09:09:08"

http.request
E.g. "GET /waf-test-page.php?torro!234 HTTP/1.1"

host.ip
E.g. "34.210.25.50"

Thank you in advance.

Chris

 


richgalloway
SplunkTrust

Start with the Add Data wizard to get the events onboarded.  I used these props.conf settings:

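[mysourcetype]
SHOULD_LINEMERGE=true
# Break events at the asterisk line that precedes "Attack blocked"; the captured group is discarded
LINE_BREAKER=(\*+[\r\n]+)Attack blocked
NO_BINARY_CHECK=true
# The timestamp follows "Time: " and is parsed with this strptime format
TIME_PREFIX=Time:\s
TIME_FORMAT=%Y-%m-%d %H:%M:%S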

Then I used the Field Extractor to produce this, although I created the regex manually rather than letting the extractor do it.

EXTRACT-src_ip,src_port,http_method,http_request,host_ip = from\s(?P<src_ip>[^:]+):(?<src_port>\d+)[\s\S]+(?<http_method>GET|POST)\s(?<http_request>.*?)Host:\s+(?<host_ip>\d+\.\d+\.\d+\.\d+)
---
If this reply helps you, an upvote would be appreciated.

b_chris21
Path Finder

Hi @richgalloway,

thanks for your detailed answer. I added the data manually, added the props.conf settings as instructed, and also added the extraction regex manually, but unfortunately the fields were not extracted. The requested fields are created, but they contain no extracted values.

Did I miss something?

Thanks for your support.

Best regards,

Chris

richgalloway
SplunkTrust

I think my answer was unclear.  You didn't need to run the field extractor yourself - just drop the settings I gave you into props.conf and restart.

The text you entered into the field extractor is a props.conf setting.  The only part that needs to be in the extractor is the regex itself (the part after the =).
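
For reference, here is a sketch of how the pieces from the earlier post could sit together in one props.conf stanza. The stanza name mysourcetype is the placeholder used above, so replace it with the sourcetype actually assigned to the WAF data. The grouping comments are an assumption about how one might organize the file, not something stated in the thread.

```ini
[mysourcetype]
# Event breaking and timestamp recognition (applied when the data is indexed)
SHOULD_LINEMERGE=true
LINE_BREAKER=(\*+[\r\n]+)Attack blocked
NO_BINARY_CHECK=true
TIME_PREFIX=Time:\s
TIME_FORMAT=%Y-%m-%d %H:%M:%S

# Search-time field extraction: in props.conf the whole line is needed,
# including the EXTRACT-... name and the "=" sign.
EXTRACT-src_ip,src_port,http_method,http_request,host_ip = from\s(?P<src_ip>[^:]+):(?<src_port>\d+)[\s\S]+(?<http_method>GET|POST)\s(?<http_request>.*?)Host:\s+(?<host_ip>\d+\.\d+\.\d+\.\d+)
```

In the Field Extractor UI, by contrast, only the part to the right of the "=" would be entered.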

---
If this reply helps you, an upvote would be appreciated.

b_chris21
Path Finder

Thanks, now it works great! Could you please also add a field extraction for the alert itself?

It should be the whole first line:

E.g.

"Attack blocked, match (torro!234) detected from 1.2.3.4:55488. Time: 2021-03-28 09:09:08"

 

Thank you in advance for your support.

Chris

richgalloway
SplunkTrust

Try this:

EXTRACT-alert = (?<attack>Attack [\s\S]+)Full
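
Note that the extraction above stores its value in a field called attack (the name inside the (?<...>) group), and because [\s\S]+ is greedy it captures everything up to the last occurrence of Full in the event, including the line break before it. If the field should be named alert and end at the first line break, a variant along these lines might work. It is an untested sketch: the stanza name mysourcetype is the placeholder from the earlier post, and the pattern assumes every event starts with "Attack blocked," as in the sample log.

```ini
[mysourcetype]
# Hypothetical variant: capture only the first line of the event into a field named "alert".
# [^\r\n]+ stops at the first line break, so the match never reaches "Full request:".
EXTRACT-alert = (?<alert>Attack blocked,[^\r\n]+)
```

Use either this line or the original one, not both, since both share the name EXTRACT-alert.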
---
If this reply helps you, an upvote would be appreciated.