Help with regex

lemikg
Communicator

Hi,

I extracted a field with the Splunk Field Extractor, which seemed to work until I noticed it didn't capture all messages from ModSecurity (e.g. "CSRF Attack Detected - Missing CSRF Token").

Here is a sample log message:

--f7d234hc-H--
Message: Warning. Match of "eq 1" against "&ARGS:CSRF_TOKEN" required. [file "/cut/modsecurity_crs_43_csrf_protection.conf"] [line "31"] [id "981143"] [msg "CSRF Attack Detected - Missing CSRF Token."]
Message: Failed to write to DBM file "/tmp/global": Invalid argument
Apache-Handler: perl-script
--f7d3t15d-Z--

This is the regex the app gave me:

(?s)--[0-9a-f]+-H--\n.*\[msg \"(?P<msg>[\w\s\/.]+)\"\]

Is there something wrong with it? Can it be done more efficiently?

Thanks in advance.

Cheers
Mike

dmr195
Communicator

I think it's because there's a hyphen missing inside the innermost square brackets. Try:

(?s)--[0-9a-f]+-H--\n.*\[msg \"(?P<msg>[\w\s\/.-]+)\"\]

instead. (In case it's hard to see, the difference is 8 characters from the end.)

Your previous regex was only looking for letters, numbers, underscores, whitespace, slashes and dots between the double quotes. Hence it didn't match because "CSRF Attack Detected - Missing CSRF Token" has a hyphen in the middle.
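To make the character-class point concrete, here is a minimal sketch using Python's `re` module as a stand-in for Splunk's regex engine (that substitution is an assumption; the two engines agree on the constructs used here). It tests only the `[msg "..."]` part of the pattern against the sample line from the question, so the effect of the hyphen is isolated from the rest of the regex. The variable names are illustrative only.

```python
import re

# Sample line copied from the question.
line = (
    'Message: Warning. Match of "eq 1" against "&ARGS:CSRF_TOKEN" required. '
    '[file "/cut/modsecurity_crs_43_csrf_protection.conf"] [line "31"] '
    '[id "981143"] [msg "CSRF Attack Detected - Missing CSRF Token."]'
)

# Tail of the generated regex: the class has no hyphen.
without_hyphen = re.compile(r'\[msg \"(?P<msg>[\w\s\/.]+)\"\]')
# Same tail with a literal hyphen added at the end of the class.
with_hyphen = re.compile(r'\[msg \"(?P<msg>[\w\s\/.-]+)\"\]')

no_match = without_hyphen.search(line)  # fails at the " - " in the message
match = with_hyphen.search(line)

print(no_match)
print(match.group("msg"))
```

Placing the hyphen last (or first) inside the brackets keeps it literal, so it is not read as a range.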

bjoernjensen
Contributor

Hi,

Here are some more things to consider:

(a) it seems that the message does not always start with a hex-coded ID between hyphens followed by that "H"
(b) you aren't getting the whole message text if it contains a hyphen

Something like this should work:
(?s)--[0-9a-z]+-[A-Z]--\n.*\[msg \"(?P<msg>[-\w\s\/.]+)\"\]
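As a rough check of point (a), the sketch below runs both header variants against the sample event from the question, again using Python's `re` module as an assumed stand-in for Splunk's regex engine. The ID in the sample, `f7d234hc`, contains an `h`, which `[0-9a-f]` does not cover. The variable names are illustrative only.

```python
import re

# Sample event copied from the question.
event = (
    '--f7d234hc-H--\n'
    'Message: Warning. Match of "eq 1" against "&ARGS:CSRF_TOKEN" required. '
    '[file "/cut/modsecurity_crs_43_csrf_protection.conf"] [line "31"] '
    '[id "981143"] [msg "CSRF Attack Detected - Missing CSRF Token."]\n'
    'Message: Failed to write to DBM file "/tmp/global": Invalid argument\n'
    'Apache-Handler: perl-script\n'
    '--f7d3t15d-Z--\n'
)

# Header restricted to hex digits and the letter H.
hex_header = re.compile(
    r'(?s)--[0-9a-f]+-H--\n.*\[msg \"(?P<msg>[\w\s\/.-]+)\"\]'
)
# Relaxed header: any lowercase letter or digit in the ID, any section letter.
relaxed_header = re.compile(
    r'(?s)--[0-9a-z]+-[A-Z]--\n.*\[msg \"(?P<msg>[-\w\s\/.]+)\"\]'
)

strict_result = hex_header.search(event)
relaxed_result = relaxed_header.search(event)

print(strict_result)
print(relaxed_result.group("msg"))
```

Because `.*` is greedy under `(?s)`, an event with several `[msg "..."]` entries would yield the last one; the sample here has only one.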

dmr195
Communicator

I feel a little guilty that my answer was accepted here, as I missed the first required change. The regex in this answer is the one to use.

lemikg
Communicator

Thanks to you, too. I tried that as well and it worked. Have a great one.
Cheers
Mike

lemikg
Communicator

It seems that did the trick. Thank you very much.
