
Help with regex

lemikg
Communicator

Hi,

I extracted a field with the Splunk Field Extractor. It seemed to work until I noticed that it didn't capture all messages from ModSecurity (e.g. "CSRF Attack Detected - Missing CSRF Token").

Here is a sample log message:

--f7d234hc-H--
Message: Warning. Match of "eq 1" against "&ARGS:CSRF_TOKEN" required. [file "/cut/modsecurity_crs_43_csrf_protection.conf"] [line "31"] [id "981143"] [msg "CSRF Attack Detected - Missing CSRF Token."]
Message: Failed to write to DBM file "/tmp/global": Invalid argument
Apache-Handler: perl-script
--f7d3t15d-Z--

This is the regex the app gave me:

(?s)--[0-9a-f]+-H--\n.*\[msg \"(?P<msg>[\w\s\/.]+)\"\]

Is there something wrong with it? Can it be done more efficiently?

Thanks in advance.

Cheers
Mike


dmr195
Communicator

I think it's because there's a hyphen missing inside the innermost square brackets. Try:

(?s)--[0-9a-f]+-H--\n.*\[msg \"(?P<msg>[\w\s\/.-]+)\"\]

instead. (In case it's hard to see, the difference is 8 characters from the end.)

Your previous regex was only looking for letters, numbers, underscores, whitespace, slashes and dots between the double quotes. Hence it didn't match because "CSRF Attack Detected - Missing CSRF Token" has a hyphen in the middle.
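To make the effect of the missing hyphen concrete, here is a minimal sketch in Python. It assumes Python's `re` module behaves like Splunk's regex engine for this pattern, and it tests only the `[msg "..."]` fragment of the sample line; the variable names are purely illustrative.

```python
import re

# Only the [msg "..."] fragment of the sample log line from the question.
fragment = '[msg "CSRF Attack Detected - Missing CSRF Token."]'

# Character class as generated by the Field Extractor (no hyphen allowed).
without_hyphen = re.compile(r'\[msg "(?P<msg>[\w\s/.]+)"\]')
# Same class with a literal hyphen added at the end of the brackets.
with_hyphen = re.compile(r'\[msg "(?P<msg>[\w\s/.-]+)"\]')

no_match = without_hyphen.search(fragment)
match = with_hyphen.search(fragment)

print(no_match)            # the class stops at " - ", so nothing matches
print(match.group("msg"))  # the full message text, hyphen included
```

A hyphen placed first or last inside the square brackets is taken literally, so it does not need escaping there.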


bjoernjensen
Contributor

Hi,

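here are some more things to consider:

(a) it seems that the section marker is not always a hex-coded ID between hyphens followed by an "H": in your sample the IDs contain non-hex letters ("h" in f7d234hc, "t" in f7d3t15d), and the trailing letter can be something other than "H" (e.g. "Z")
(b) you aren't getting the whole message text if it contains a hyphen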

Something like this should work:
(?s)--[0-9a-z]+-[A-Z]--\n.*\[msg \"(?P<msg>[-\w\s\/.]+)\"\]
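As a rough check of point (a), the sketch below runs both boundary patterns against the sample event from the question. It uses Python's `re` module as a stand-in for Splunk's regex engine, and it assumes the raw event has a plain newline after each line as shown in the post; `event`, `hex_only` and `relaxed` are illustrative names only.

```python
import re

# Sample event from the question, assuming "\n" line endings.
event = (
    '--f7d234hc-H--\n'
    'Message: Warning. Match of "eq 1" against "&ARGS:CSRF_TOKEN" required. '
    '[file "/cut/modsecurity_crs_43_csrf_protection.conf"] [line "31"] '
    '[id "981143"] [msg "CSRF Attack Detected - Missing CSRF Token."]\n'
    'Message: Failed to write to DBM file "/tmp/global": Invalid argument\n'
    'Apache-Handler: perl-script\n'
    '--f7d3t15d-Z--\n'
)

# Boundary restricted to hex digits and a literal "H" (hyphen fix applied).
hex_only = re.compile(r'(?s)--[0-9a-f]+-H--\n.*\[msg "(?P<msg>[-\w\s/.]+)"\]')
# Boundary relaxed to any lowercase letter or digit and any section letter.
relaxed = re.compile(r'(?s)--[0-9a-z]+-[A-Z]--\n.*\[msg "(?P<msg>[-\w\s/.]+)"\]')

hex_result = hex_only.search(event)
relaxed_result = relaxed.search(event)

print(hex_result)                   # "h" in f7d234hc is not a hex digit
print(relaxed_result.group("msg"))  # message text captured from the sample
```

On this particular sample the hex-only boundary fails because of the "h" in the ID, so the hyphen fix alone would not be enough for events like this one.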

dmr195
Communicator

I feel a little guilty that my answer was accepted here, as I missed the first required change. The regex in this answer is the one to use.


lemikg
Communicator

Thanks to you, too. I tried that as well and it worked. Have a great one.
Cheers
Mike


lemikg
Communicator

It seems that did the trick. Thank you very much.
