Help with regex

lemikg
Communicator

Hi,

I extracted a field with the Splunk Field Extractor. It seemed to work until I noticed that it doesn't capture all ModSecurity messages (e.g. "CSRF Attack Detected - Missing CSRF Token").

Here are some log messages:

--f7d234hc-H--
Message: Warning. Match of "eq 1" against "&ARGS:CSRF_TOKEN" required. [file "/cut/modsecurity_crs_43_csrf_protection.conf"] [line "31"] [id "981143"] [msg "CSRF Attack Detected - Missing CSRF Token."]
Message: Failed to write to DBM file "/tmp/global": Invalid argument
Apache-Handler: perl-script
--f7d3t15d-Z--

This is the regex the app gave me:

(?s)--[0-9a-f]+-H--\n.*\[msg \"(?P<msg>[\w\s\/.]+)\"\]

Is there something wrong with it? Can it be done more efficiently?

Thanks in advance.

Cheers
Mike


dmr195
Communicator

I think it's because there's a hyphen missing inside the innermost square brackets. Try:

(?s)--[0-9a-f]+-H--\n.*\[msg \"(?P<msg>[\w\s\/.-]+)\"\]

instead. (In case it's hard to see, the difference is 8 characters from the end.)

Your previous regex was only looking for letters, numbers, underscores, whitespace, slashes and dots between the double quotes. Hence it didn't match because "CSRF Attack Detected - Missing CSRF Token" has a hyphen in the middle.
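
To see the effect of the character class on its own, here is a minimal Python sketch. Python's `re` module accepts the same `(?P<msg>...)` named-group syntax, but Splunk uses PCRE, so treat this as an approximation rather than a Splunk test. It applies only the `[msg "..."]` tail of each regex to the message from the question; the variable names are made up for illustration.

```python
import re

# The tail of the log line from the question, containing the [msg "..."] part.
fragment = '[id "981143"] [msg "CSRF Attack Detected - Missing CSRF Token."]'

# Only the end of each regex, so the character class is tested in isolation.
without_hyphen = re.compile(r'\[msg \"(?P<msg>[\w\s\/.]+)\"\]')
with_hyphen = re.compile(r'\[msg \"(?P<msg>[\w\s\/.-]+)\"\]')

miss = without_hyphen.search(fragment)  # "-" is not in the class, so no match
hit = with_hyphen.search(fragment)      # "-" is now allowed inside the quotes

print(miss)              # None
print(hit.group("msg"))  # CSRF Attack Detected - Missing CSRF Token.
```

Placing the hyphen last (or first) inside the square brackets keeps it a literal character instead of a range operator.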

bjoernjensen
Contributor

Hi,

here are more things to be considered:

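(a) it seems that the marker line does not always consist of a hex-coded ID between hyphens followed by "H" (the ID in your sample, f7d234hc, contains an "h", which [0-9a-f] does not match)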
(b) you aren't getting the whole message text if it contains a hyphen

Something like this should work:
(?s)--[0-9a-z]+-[A-Z]--\n.*\[msg \"(?P<msg>[-\w\s\/.]+)\"\]
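
As a rough check of point (a), the sketch below runs both full patterns against the sample event exactly as it was posted. It uses Python's `re` module as a stand-in for Splunk's PCRE engine, and the names `event`, `strict` and `relaxed` are only illustrative. The posted sample may have been edited before posting, so real events could well carry purely hexadecimal IDs.

```python
import re

# Sample event from the question, joined with "\n" as in the raw log.
event = "\n".join([
    '--f7d234hc-H--',
    'Message: Warning. Match of "eq 1" against "&ARGS:CSRF_TOKEN" required. '
    '[file "/cut/modsecurity_crs_43_csrf_protection.conf"] [line "31"] '
    '[id "981143"] [msg "CSRF Attack Detected - Missing CSRF Token."]',
    'Message: Failed to write to DBM file "/tmp/global": Invalid argument',
    'Apache-Handler: perl-script',
    '--f7d3t15d-Z--',
])

# Marker restricted to hex digits and the letter "H".
strict = re.compile(r'(?s)--[0-9a-f]+-H--\n.*\[msg \"(?P<msg>[\w\s\/.-]+)\"\]')
# Marker allowing any lowercase letter or digit and any uppercase section letter.
relaxed = re.compile(r'(?s)--[0-9a-z]+-[A-Z]--\n.*\[msg \"(?P<msg>[-\w\s\/.]+)\"\]')

strict_match = strict.search(event)    # fails on the "h" in f7d234hc
relaxed_match = relaxed.search(event)

print(strict_match)                # None
print(relaxed_match.group("msg"))  # CSRF Attack Detected - Missing CSRF Token.
```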

dmr195
Communicator

I feel a little guilty that my answer was accepted here, as I missed the first required change. The regex in this answer is the one to use.

lemikg
Communicator

Thanks to you, too. I tried that as well and it worked. Have a great one.
Cheers
Mike

lemikg
Communicator

It seems that did the trick. Thank you very much.
