Field extraction for custom WAF log

b_chris21
Communicator

Hello everyone,

I am struggling to extract the fields from a custom WAF log file, as no existing sourcetype parses them correctly. My regex experience is very limited, so any help would be appreciated.

The log output is:

*************************************************************************
Attack blocked, match (torro!234) detected from 1.2.3.4:55488. Time: 2021-03-28 09:09:08
Full request:
*************************************************************************
GET /waf-test-page.php?torro!234 HTTP/1.1
Host: 34.210.25.50
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:87.0) Gecko/20100101 Firefox/87.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
Cookie: PHPSESSID=e59oluljlvjkeef2ts3gphrt7g
Upgrade-Insecure-Requests: 1


*************************************************************************
Attack blocked, match (union+select) detected from 1.2.3.4:57280. Time: 2021-03-28 09:10:19
Full request:
*************************************************************************
POST /waf-test-page.php HTTP/1.1
Host: 34.210.25.50
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:87.0) Gecko/20100101 Firefox/87.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Content-Type: application/x-www-form-urlencoded
Content-Length: 40
Origin: http://34.210.25.50
Connection: keep-alive
Referer: http://34.210.25.50/waf-test-page.php
Cookie: PHPSESSID=e59oluljlvjkeef2ts3gphrt7g
Upgrade-Insecure-Requests: 1

os='+UNION+SELECT+1,2,3&php=&path=
*************************************************************************
Attack blocked, match (<script) detected from 1.2.3.4:53248. Time: 2021-03-28 09:12:38
Full request:
*************************************************************************
GET /waf-test-page.php?path=5"><script> HTTP/1.1
Host: 34.210.25.50
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:87.0) Gecko/20100101 Firefox/87.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
Cookie: PHPSESSID=e59oluljlvjkeef2ts3gphrt7g
Upgrade-Insecure-Requests: 1


*************************************************************************
Attack blocked, match (IP BLACKLISTED) detected from 1.2.3.4:56704. Time: 2021-03-28 09:19:02
Full request:
*************************************************************************
GET /waf-test-page.php?test_block_ip HTTP/1.1
Host: 34.210.25.50
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:87.0) Gecko/20100101 Firefox/87.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
Cookie: PHPSESSID=e59oluljlvjkeef2ts3gphrt7g
Upgrade-Insecure-Requests: 1


*************************************************************************

I would like to extract the following 3 fields:

alert
E.g. "Attack blocked, match (torro!234) detected from 1.2.3.4:55488. Time: 2021-03-28 09:09:08"

http.request
E.g. "GET /waf-test-page.php?torro!234 HTTP/1.1"

host.ip
E.g. "34.210.25.50"

Thank you in advance.

Chris

 


richgalloway
SplunkTrust

Start with the Add Data wizard to get the events onboarded.  I used these props.conf settings:

[mysourcetype]
SHOULD_LINEMERGE=true
LINE_BREAKER=(\*+[\r\n]+)Attack blocked
NO_BINARY_CHECK=true
TIME_PREFIX=Time:\s
TIME_FORMAT=%Y-%m-%d %H:%M:%S
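
For intuition, the event breaking and timestamp settings above can be mimicked outside Splunk. The following is only a rough Python sketch, not Splunk's actual pipeline: it assumes that the text matched by the first capture group of LINE_BREAKER is discarded and marks the event boundary, it ignores SHOULD_LINEMERGE entirely, and the helper names `split_events` and `event_time` are made up for illustration. The sample text is shortened from the log in the question.

```python
import re
from datetime import datetime

STARS = "*" * 73
RAW = (
    f"{STARS}\n"
    "Attack blocked, match (torro!234) detected from 1.2.3.4:55488. Time: 2021-03-28 09:09:08\n"
    "Full request:\n"
    f"{STARS}\n"
    "GET /waf-test-page.php?torro!234 HTTP/1.1\n"
    "Host: 34.210.25.50\n"
    "\n"
    f"{STARS}\n"
    "Attack blocked, match (union+select) detected from 1.2.3.4:57280. Time: 2021-03-28 09:10:19\n"
    "Full request:\n"
    f"{STARS}\n"
    "POST /waf-test-page.php HTTP/1.1\n"
    "Host: 34.210.25.50\n"
)

# Same patterns as the props.conf settings above.
LINE_BREAKER = re.compile(r"(\*+[\r\n]+)Attack blocked")
TIME_PREFIX = re.compile(r"Time:\s")
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def split_events(raw):
    """Split raw text at each LINE_BREAKER match, dropping capture group 1."""
    events, start = [], 0
    for m in LINE_BREAKER.finditer(raw):
        chunk = raw[start:m.start(1)]  # text before the discarded separator
        if chunk.strip():
            events.append(chunk)
        start = m.end(1)  # the next event begins right after the separator
    tail = raw[start:]
    if tail.strip():
        events.append(tail)
    return events


def event_time(event):
    """Parse the timestamp that follows TIME_PREFIX, or return None."""
    m = TIME_PREFIX.search(event)
    if m is None:
        return None
    stamp = event[m.end():m.end() + 19]  # "YYYY-MM-DD HH:MM:SS" is 19 characters
    return datetime.strptime(stamp, TIME_FORMAT)


events = split_events(RAW)
for e in events:
    print(event_time(e), "|", e.splitlines()[0])
```

Note that the row of asterisks after "Full request:" is not a break point, because it is not followed by "Attack blocked".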

Then I used the Field Extractor to produce this, although I created the regex manually rather than letting the extractor do it.

EXTRACT-src_ip,src_port,http_method,http_request,host_ip = from\s(?P<src_ip>[^:]+):(?<src_port>\d+)[\s\S]+(?<http_method>GET|POST)\s(?<http_request>.*?)Host:\s+(?<host_ip>\d+\.\d+\.\d+\.\d+)
---
If this reply helps you, Karma would be appreciated.
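
To sanity-check an extraction like this before putting it into props.conf, it can help to run it against one sample event outside Splunk. The sketch below uses Python's `re` module, so the pattern is an adaptation and not the exact regex from the answer: Python requires the `(?P<name>...)` form for named groups, and because `.` does not match a newline in Python by default, the request line is captured with `[^\r\n]*` followed by an explicit line break. The helper name `extract_fields` is hypothetical.

```python
import re

EVENT = (
    "Attack blocked, match (union+select) detected from 1.2.3.4:57280. Time: 2021-03-28 09:10:19\n"
    "Full request:\n"
    "*************************************************************************\n"
    "POST /waf-test-page.php HTTP/1.1\n"
    "Host: 34.210.25.50\n"
    "User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:87.0) Gecko/20100101 Firefox/87.0\n"
)

# Adapted from the EXTRACT regex in the answer (see the notes above).
FIELDS = re.compile(
    r"from\s(?P<src_ip>[^:]+):(?P<src_port>\d+)[\s\S]+?"
    r"(?P<http_method>GET|POST)\s(?P<http_request>[^\r\n]*)[\r\n]+"
    r"Host:\s+(?P<host_ip>\d+\.\d+\.\d+\.\d+)"
)


def extract_fields(event):
    """Return the named groups as a dict, or an empty dict if nothing matches."""
    m = FIELDS.search(event)
    return m.groupdict() if m else {}


fields = extract_fields(EVENT)
for name, value in fields.items():
    print(f"{name} = {value}")
```

With this adaptation, `http_request` holds the path and protocol version without the method, since the method is captured separately as `http_method`.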


b_chris21
Communicator

Hello again,

May I also ask for help extracting fields from these events from the same WAF?

Fields:

signature:  `New request from 127.0.0.1:52674`

host: should also include the DNS name (e.g. one.com:8080)

New request from 127.0.0.1:52672
GET / HTTP/1.1
Host: one.com:8080
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Firefox/91.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Connection: keep-alive
Upgrade-Insecure-Requests: 1


New request from 127.0.0.1:52672
GET /resources/bootstrap.css HTTP/1.1
Host: one.com:8080
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Firefox/91.0
Accept: text/css,*/*;q=0.1
Accept-Language: en-US,en;q=0.5
Connection: keep-alive
Referer: http://one.com:8080/


New request from 127.0.0.1:52674
GET /resources/floating-labels.css HTTP/1.1
Host: one.com:8080
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Firefox/91.0
Accept: text/css,*/*;q=0.1
Accept-Language: en-US,en;q=0.5
Connection: keep-alive
Referer: http://one.com:8080/

Many thanks in advance.

Chris
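
This follow-up has no reply in the thread. As a possible starting point only, the two requested values can be matched with one small pattern each. The sketch below tests them in Python against one of the sample events; the patterns are assumptions, not a confirmed solution, and the field name `http_host` is a hypothetical choice made to avoid confusion with Splunk's built-in `host` field.

```python
import re

EVENT = (
    "New request from 127.0.0.1:52674\n"
    "GET /resources/floating-labels.css HTTP/1.1\n"
    "Host: one.com:8080\n"
    "User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Firefox/91.0\n"
    "Accept: text/css,*/*;q=0.1\n"
)

# The whole first line: "New request from <ip>:<port>".
SIGNATURE = re.compile(
    r"^(?P<signature>New request from \d{1,3}(?:\.\d{1,3}){3}:\d+)",
    re.MULTILINE,
)
# Everything after "Host:" up to the next whitespace, so a DNS name and
# an optional port are both kept.
HTTP_HOST = re.compile(r"^Host:[ \t]+(?P<http_host>\S+)", re.MULTILINE)


def extract_request_fields(event):
    """Collect the named groups of every pattern that matches the event."""
    fields = {}
    for pattern in (SIGNATURE, HTTP_HOST):
        m = pattern.search(event)
        if m:
            fields.update(m.groupdict())
    return fields


request_fields = extract_request_fields(EVENT)
print(request_fields)
```

Keeping the two patterns separate means one field can still be extracted when the other line is missing from an event.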


b_chris21
Communicator

Hi @richgalloway,

Thanks for your detailed answer. I added the data manually, added the props.conf settings as instructed, and also added the extraction regex manually, but unfortunately the fields were not extracted. The requested fields are created, but they contain no extracted values.

Did I miss something?

Thanks for your support.

Best regards,

Chris


richgalloway
SplunkTrust

I think my answer was unclear.  You didn't need to run the field extractor yourself - just drop the settings I gave you into props.conf and restart.

The text you entered into the field extractor is a props.conf setting.  The only part that belongs in the extractor is the regex itself (the part after the =).

---
If this reply helps you, Karma would be appreciated.

b_chris21
Communicator

Thanks, now it works great! Could you please also add an extraction for the alert itself?

It should be the whole first line:

E.g.

"Attack blocked, match (torro!234) detected from 1.2.3.4:55488. Time: 2021-03-28 09:09:08"

 

Thank you in advance for your support.

Chris


richgalloway
SplunkTrust

Try this

EXTRACT-alert = (?<attack>Attack [\s\S]+)Full
---
If this reply helps you, Karma would be appreciated.
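
Two details of this extraction are worth checking against a sample event. The field name comes from the capture group, so the regex as posted produces a field called `attack`, not `alert`. Also, `[\s\S]+` runs up to the word "Full", so the captured value includes the line break at the end of the first line. The Python sketch below shows both points, with the group rewritten in the `(?P<name>...)` form that Python requires. The `ONE_LINE` variant is an assumption about what was wanted (only the first line, in a field named `alert`), not something confirmed in the thread.

```python
import re

EVENT = (
    "Attack blocked, match (torro!234) detected from 1.2.3.4:55488. Time: 2021-03-28 09:09:08\n"
    "Full request:\n"
    "*************************************************************************\n"
    "GET /waf-test-page.php?torro!234 HTTP/1.1\n"
    "Host: 34.210.25.50\n"
)

# The regex from the reply, with only the group syntax adapted for Python.
AS_POSTED = re.compile(r"(?P<attack>Attack [\s\S]+)Full")
# Hypothetical variant: stop at the end of the line and name the field "alert".
ONE_LINE = re.compile(r"(?P<alert>Attack blocked[^\r\n]*)")

posted = AS_POSTED.search(EVENT).group("attack")
one_line = ONE_LINE.search(EVENT).group("alert")

print(repr(posted))    # ends with a newline
print(repr(one_line))  # first line only
```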