Why is my REGEX and MV_ADD=true in transforms.conf not working as expected to extract fields from Windows event logs?

fairje
Communicator

I am trying to parse the EMET (Enhanced Mitigation Experience Toolkit) logs. (Once I get this whole thing working, I plan to share it far and wide so MS will stop trying to sell you on their crappy products to monitor these same logs.) We currently have the GPO/Registry configurations written to EventCode 50, and they look similar to the following:

01/12/2016 05:00:05 PM
LogName=Application
SourceName=EMET
EventCode=50
EventType=4
Type=Information
ComputerName=host001.com
TaskCategory=%1
OpCode=Info
RecordNumber=267548
Keywords=Classic
Message=EMET settings were refreshed successfully.

EMET configuration for Application mitigations (Registry) is:
<ConfigAppmitREG>
</ConfigAppmitREG>

EMET configuration for Application mitigations (GPO) is:
<ConfigAppmitGPO>
7z.exe *\7-Zip  DEP SEHOP NullPage HeapSpray MandatoryASLR BottomUpASLR LoadLib MemProt Caller SimExecFlow StackPivot
7zFM.exe *\7-Zip  DEP SEHOP NullPage HeapSpray MandatoryASLR BottomUpASLR LoadLib MemProt Caller SimExecFlow StackPivot
7zG.exe *\7-Zip  DEP SEHOP NullPage HeapSpray MandatoryASLR BottomUpASLR LoadLib MemProt Caller SimExecFlow StackPivot
Acrobat.exe *\Adobe\Acrobat*\Acrobat  DEP SEHOP NullPage HeapSpray EAF MandatoryASLR BottomUpASLR LoadLib Caller SimExecFlow StackPivot
AcroRd32.exe *\Adobe\Reader*\Reader  DEP SEHOP NullPage HeapSpray EAF MandatoryASLR BottomUpASLR LoadLib Caller SimExecFlow StackPivot
chrome.exe *\Google\Chrome\Application  DEP NullPage HeapSpray EAF MandatoryASLR BottomUpASLR LoadLib MemProt Caller SimExecFlow StackPivot
communicator.exe *\Microsoft Lync  DEP SEHOP NullPage HeapSpray EAF MandatoryASLR BottomUpASLR LoadLib MemProt Caller SimExecFlow StackPivot
EXCEL.EXE *\OFFICE1*  DEP SEHOP NullPage HeapSpray EAF MandatoryASLR BottomUpASLR LoadLib MemProt Caller SimExecFlow StackPivot
firefox.exe *\Mozilla Firefox  DEP SEHOP NullPage HeapSpray EAF MandatoryASLR BottomUpASLR LoadLib MemProt Caller SimExecFlow StackPivot
...snip more apps...
</ConfigAppmitGPO>

There are a couple of other events generated by EMET under EventCode 50, but this is the important one because it tells you where certain settings come from (registry keys or GPO) and how your hosts are configured (in case you have different configs in your environment for different reasons).

Now here is the nightmare: how to write the REGEX in transforms.conf to parse all this information out. To start with, I toyed around with the rex search command and had success pulling out all the application names like so:

| rex max_match=0 field=Message "(?m)^(?<App_Name>.*\.[exeEXE]{3})"

I am relying on the fact that the Message field is already extracted by Splunk, since the Windows TA is installed with its general key=value extractions. That mostly gives me everything after Message= in the logs. The regex above works (especially with max_match=0, i.e. unlimited) to pull out all the app names in a single event. When I put the same regex in transforms.conf, it all falls apart and doesn't work, with no apparent reason why:

[emet_event50_app_from_Message]
SOURCE_KEY = Message
REGEX = (?m)^(?<App_Name>.*\.[exeEXE]{3})
MV_ADD = true

Essentially, MV_ADD should make it pull all the matches, not just the first one. Instead, the result I get is a re-grab of the entire message data, e.g.:

EMET settings were refreshed successfully. EMET configuration for Application mitigations (Registry) is: <ConfigAppmitREG> </ConfigAppmitREG> EMET configuration for Application mitigations (GPO) is: <ConfigAppmitGPO> 7z.exe *\7-Zip DEP SEHOP NullPage HeapSpray MandatoryASLR BottomUpASLR LoadLib MemProt Caller SimExecFlow StackPivot 7zFM.exe *\7-Zip DEP SEHOP NullPage HeapSpray MandatoryASLR BottomUpASLR LoadLib MemProt Caller... and so on

I have never really worked with a multiline event from transforms.conf before, so I am not sure what I am missing here. Other Splunk Answers posts seem to indicate that the above should be right, but it just isn't working.
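For anyone who wants to sanity-check the pattern itself outside Splunk, here is a minimal sketch in Python. It assumes the Message lines are separated by plain "\n" characters and uses a shortened stand-in for the sample event. Python's re module is not the regex engine Splunk uses, and it spells named groups as (?P<name>...), so this only checks the pattern logic, not how transforms.conf treats it.

```python
import re

# Shortened stand-in for the Message field of the sample event.
# Assumption: lines are separated by plain "\n" characters.
message = "\n".join([
    "EMET settings were refreshed successfully.",
    "",
    "EMET configuration for Application mitigations (GPO) is:",
    "<ConfigAppmitGPO>",
    r"7z.exe *\7-Zip  DEP SEHOP NullPage HeapSpray",
    r"EXCEL.EXE *\OFFICE1*  DEP SEHOP NullPage HeapSpray EAF",
    r"firefox.exe *\Mozilla Firefox  DEP SEHOP NullPage HeapSpray EAF",
    "</ConfigAppmitGPO>",
])

# Same pattern as the rex command, with Python's named-group spelling.
# (?m) makes ^ match at the start of every line, not just the string.
pattern = re.compile(r"(?m)^(?P<App_Name>.*\.[exeEXE]{3})")

# finditer plays the role of max_match=0: collect every match.
app_names = [m.group("App_Name") for m in pattern.finditer(message)]
print(app_names)  # ['7z.exe', 'EXCEL.EXE', 'firefox.exe']
```

If this behaves as expected, the pattern is not the problem, which points at how the search-time extraction handles the Message field or the (?m) flag.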

Thanks for the assist!

1 Solution

fairje
Communicator

So I have worked around the issue with the following:

[emet_event50_app_from_Message]
SOURCE_KEY = Message
REGEX = \n(?<App_Name>.*\.[exeEXE]{3})\s\S
MV_ADD = true

Clearly the newline character is there, because this totally works for the logs, but it doesn't want to accept the (?m) option at the front, which would let you switch to the caret "^" character. This is frustrating because I have used the (?m) option in other logs.

As has been suggested, it may have something to do with the way Splunk extracts the "Message" field. I haven't tried an extraction in transforms.conf using _raw; maybe that would also be a solution.

A note about the above regex: I have to use .* for this to work correctly, since some application names contain whitespace between words. As long as you anchor to the newline and stop at "exe" or "EXE", that should be sufficient for grabbing this data from EMET logs.
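Here is a rough sketch of what the newline-anchored workaround matches, again in Python rather than Splunk, so treat it as an illustration of the pattern only. The "My Tool.exe" line is a hypothetical entry that is not in the real logs; it is there just to show a name containing a space. Named groups use Python's (?P<name>...) spelling.

```python
import re

# Shortened stand-in for the GPO section of the Message field.
# Assumption: lines are separated by plain "\n" characters.
message = "\n".join([
    "EMET configuration for Application mitigations (GPO) is:",
    "<ConfigAppmitGPO>",
    r"7z.exe *\7-Zip  DEP SEHOP NullPage",
    r"EXCEL.EXE *\OFFICE1*  DEP SEHOP NullPage",
    # Hypothetical entry, added only to show a name with whitespace in it.
    r"My Tool.exe *\Vendor  DEP SEHOP",
    "</ConfigAppmitGPO>",
])

# Workaround pattern: anchor on a literal newline instead of (?m)^,
# then require whitespace plus a non-space character after the extension.
workaround = re.compile(r"\n(?P<App_Name>.*\.[exeEXE]{3})\s\S")

# With a single capture group, findall returns the captured strings,
# which mirrors what MV_ADD = true is meant to collect.
names = workaround.findall(message)
print(names)  # ['7z.exe', 'EXCEL.EXE', 'My Tool.exe']
```

One thing to be aware of: [exeEXE]{3} is a character class, so it accepts any three characters drawn from e, x, E and X (".xex" would match too). If that ever causes false hits, \.(?:exe|EXE) is a stricter option, though I have not needed it for these logs.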

Thank you gcato for the assistance on getting to the bottom of this. Your responses were appreciated!
