Regex extract all matched field values per event

d_T
New Member

Hi Splunk Community,

I have run into an interesting scenario where I need to write a field extraction that parses a specific part of the WinEventLog add-on data and returns the results. This is related to the Log4j vulnerability, so hopefully it has some real value.

The issue I am running into is that the regex I have built will match Java files that contain 'log4j', but it only extracts the first instance it sees in the body of the text rather than all instances of log4j files. I believe I need a way to perform a positive lookahead (or something similar) to match the results, and then continue matching on that same event before moving on.

Example Data below:

Field Value Data:
"C:\Something Something\Something Something Base\jre\bin\javaw.exe" -cp "C:\Something Something\Something Something Base\lib\patches.jar/;C:\Something Something\Something Something Base/classes;C:\Something Something\Something Something Base\lib/aopalliance-repackaged-2.5.0-b42.jar;C:\Something Something\Something Something Base\lib/slf4j-log4j12-1.7.5.jar;C:\Something Something\Something Something Base\lib/javax.annotation-api-1.2.jar;C:\Something Something\Something Something Base\lib/log4j-1.2-api-2.15.0.jar";C:\Something Something\Something Something Base//log/ff3ad640-9eb4-11eb-a0b2-1de605f6535b\mini_probe\23468" 101_input.txt

First Extraction Query Attempt:
Query - | rex field=Process_Command_Line "(?P<hasLog4>(?:([\/log4j]{6}.*?(?=;))))"
Result - /log4j-1.2-api-2.15.0.jar

The problem with the above extraction is that while it will match 'log4j' files, it only matches the first occurrence in the field value above and then moves on to the next event. I need it to read through the entire string and extract all instances of the matched regex before moving to the next event. Also, as you can see, it can miss certain types of 'log4j' files (for example slf4j-log4j12-1.7.5.jar), so I will need to clean up the regex anyway to fix that.

Second Extraction Query Attempt:
Query - | rex field=Process_Command_Line "(?P<test>C:(.*?)(?=jar|exe))"
Result - C:\Something Something\Something Something Base\jre\bin\javaw

The problem with this query is that it matches immediately with the first result in the field value and then moves on to the next event and never gets to where the 'log4j' file exists in the string. 


isoutamo
SplunkTrust
Hi,
When you want multiple matches within the same event, you must add max_match=0 to the rex command so that it matches all occurrences. See https://docs.splunk.com/Documentation/Splunk/8.2.3/SearchReference/Rex
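
As a rough illustration of max_match=0, here is a minimal sketch you can run on its own. The sample value is a shortened, hypothetical stand-in for Process_Command_Line (forward slashes only, to avoid backslash escaping), and the field name log4j_jar and the pattern itself are assumptions for illustration, not a vetted Log4j detection rule.

```
| makeresults
| eval Process_Command_Line="C:/app/lib/slf4j-log4j12-1.7.5.jar;C:/app/lib/javax.annotation-api-1.2.jar;C:/app/lib/log4j-1.2-api-2.15.0.jar"
| rex field=Process_Command_Line max_match=0 "(?P<log4j_jar>[\w.-]*log4j[\w.-]*\.jar)"
| eval log4j_count=mvcount(log4j_jar)
| table log4j_jar log4j_count
```

With max_match=0 the extracted field becomes multivalued, so log4j_jar should hold both slf4j-log4j12-1.7.5.jar and log4j-1.2-api-2.15.0.jar for this sample. The pattern uses the literal string log4j instead of the character class [\/log4j]{6}. That class matches any six characters drawn from the set /, l, o, g, 4, j, which is likely why the first attempt skips slf4j-log4j12-1.7.5.jar.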
r. Ismo