Why is my REGEX and MV_ADD=true in transforms.conf not working as expected to extract fields from Windows event logs?

Communicator

I am trying to parse the EMET (Enhanced Mitigation Experience Toolkit) logs (note: once I get this whole thing working, I plan to share it far and wide so MS will stop trying to sell you on their crappy products to monitor these same logs). In any case, our GPO/Registry configurations are currently logged under EventCode 50, and the events look similar to this:

01/12/2016 05:00:05 PM
LogName=Application
SourceName=EMET
EventCode=50
EventType=4
Type=Information
ComputerName=host001.com
TaskCategory=%1
OpCode=Info
RecordNumber=267548
Keywords=Classic
Message=EMET settings were refreshed successfully.

EMET configuration for Application mitigations (Registry) is:
<ConfigAppmitREG>
</ConfigAppmitREG>

EMET configuration for Application mitigations (GPO) is:
<ConfigAppmitGPO>
7z.exe *\7-Zip  DEP SEHOP NullPage HeapSpray MandatoryASLR BottomUpASLR LoadLib MemProt Caller SimExecFlow StackPivot
7zFM.exe *\7-Zip  DEP SEHOP NullPage HeapSpray MandatoryASLR BottomUpASLR LoadLib MemProt Caller SimExecFlow StackPivot
7zG.exe *\7-Zip  DEP SEHOP NullPage HeapSpray MandatoryASLR BottomUpASLR LoadLib MemProt Caller SimExecFlow StackPivot
Acrobat.exe *\Adobe\Acrobat*\Acrobat  DEP SEHOP NullPage HeapSpray EAF MandatoryASLR BottomUpASLR LoadLib Caller SimExecFlow StackPivot
AcroRd32.exe *\Adobe\Reader*\Reader  DEP SEHOP NullPage HeapSpray EAF MandatoryASLR BottomUpASLR LoadLib Caller SimExecFlow StackPivot
chrome.exe *\Google\Chrome\Application  DEP NullPage HeapSpray EAF MandatoryASLR BottomUpASLR LoadLib MemProt Caller SimExecFlow StackPivot
communicator.exe *\Microsoft Lync  DEP SEHOP NullPage HeapSpray EAF MandatoryASLR BottomUpASLR LoadLib MemProt Caller SimExecFlow StackPivot
EXCEL.EXE *\OFFICE1*  DEP SEHOP NullPage HeapSpray EAF MandatoryASLR BottomUpASLR LoadLib MemProt Caller SimExecFlow StackPivot
firefox.exe *\Mozilla Firefox  DEP SEHOP NullPage HeapSpray EAF MandatoryASLR BottomUpASLR LoadLib MemProt Caller SimExecFlow StackPivot
...snip more apps...
</ConfigAppmitGPO>

There are a couple other events generated from EMET on Event 50, but this is the important one because it tells you how you are getting certain settings (registry keys or GPO) and it also tells you what your hosts are configured as (in case you have different configs in your environment for different reasons).

Now here is the nightmare: how to write the REGEX in transforms.conf to parse all this information out. To start with, I was toying around with the rex search command and succeeded in pulling out all the application names like this:

| rex max_match=0 field=Message "(?m)^(?<App_Name>.*\.[exeEXE]{3})"
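
For anyone who wants to reproduce what the rex command above does outside Splunk, here is a minimal sketch in Python. It assumes the Message value still contains its newlines. The `message` string is a shortened, hypothetical stand-in for the event, and Python's `re` module is only an approximation of the regex engine Splunk uses. Note that Python writes named groups as (?P<name>...) rather than (?<name>...).

```python
import re

# Hypothetical, shortened stand-in for the Message field, newlines intact.
message = (
    "EMET settings were refreshed successfully.\n"
    "\n"
    "EMET configuration for Application mitigations (GPO) is:\n"
    "<ConfigAppmitGPO>\n"
    "7z.exe *\\7-Zip  DEP SEHOP NullPage\n"
    "EXCEL.EXE *\\OFFICE1*  DEP SEHOP NullPage\n"
    "firefox.exe *\\Mozilla Firefox  DEP SEHOP NullPage\n"
    "</ConfigAppmitGPO>\n"
)

# Same pattern as the rex command, with Python's named-group spelling.
pattern = re.compile(r"(?m)^(?P<App_Name>.*\.[exeEXE]{3})")

# findall returns every match, similar in spirit to max_match=0.
app_names = pattern.findall(message)
print(app_names)  # -> ['7z.exe', 'EXCEL.EXE', 'firefox.exe']
```

With (?m), ^ anchors at the start of every line, so each application line contributes one value.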

I am relying on the fact that the Message field is already extracted by Splunk, since the Windows TA is installed with its general key=value extractions. Mostly that gives me everything after Message= in the logs. The regex above actually works (especially with max_match=0, meaning unlimited) to pull out all the app names in a single event. When I put the same regex in transforms.conf, it all falls apart and doesn't work, with no apparent reason why.

[emet_event50_app_from_Message]
SOURCE_KEY = Message
REGEX = (?m)^(?<App_Name>.*\.[exeEXE]{3})
MV_ADD = true

Essentially, MV_ADD should make it pull all the matches, not just the first one. Instead, the result I get is a regrab of the entire message data, e.g.:

EMET settings were refreshed successfully. EMET configuration for Application mitigations (Registry) is: <ConfigAppmitREG> </ConfigAppmitREG> EMET configuration for Application mitigations (GPO) is: <ConfigAppmitGPO> 7z.exe *\7-Zip DEP SEHOP NullPage HeapSpray MandatoryASLR BottomUpASLR LoadLib MemProt Caller SimExecFlow StackPivot 7zFM.exe *\7-Zip DEP SEHOP NullPage HeapSpray MandatoryASLR BottomUpASLR LoadLib MemProt Caller... and so on

I have never really tried working with a multiline event in Splunk from the transforms file before, so I am not sure what I am missing here. And reading other Splunk Answers seems to indicate that the above should be right, but it just isn't working.

Thanks for the assist!

Re: Why is my REGEX and MV_ADD=true in transforms.conf not working as expected to extract fields from Windows event logs?

Contributor

Hi fairje,

It would appear that the newlines in your Message field are no longer there (i.e. it is not multiline anymore, but one long string) so your regex no longer works. Or at least the logic no longer works. It matches from the Message string beginning to the last .exe it finds and that's what you see returned.
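
The behaviour described above can be sketched in Python. This assumes the newlines really were replaced by spaces before the regex ran, which is a guess about what Splunk is doing, not something confirmed here. The `flat` string is a shortened, hypothetical version of the Message value.

```python
import re

# Hypothetical flattened Message value: newlines replaced by spaces.
flat = (
    "EMET settings were refreshed successfully. "
    "<ConfigAppmitGPO> "
    "7z.exe *\\7-Zip DEP SEHOP "
    "firefox.exe *\\Mozilla Firefox DEP SEHOP "
    "</ConfigAppmitGPO>"
)

# The original pattern, using Python's (?P<name>...) group spelling.
pattern = re.compile(r"(?m)^(?P<App_Name>.*\.[exeEXE]{3})")

# With no newlines left, ^ only matches at position 0 and the greedy .*
# runs to the last ".exe" in the string, so there is a single long match.
matches = pattern.findall(flat)
print(len(matches))
print(matches[0])
```

The single value starts at "EMET settings" and ends at "firefox.exe", which resembles the "regrab of the entire message" reported in the question.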

To fix this, you need a different REGEX. This is a perfect place to use regex lookahead syntax, (?=...). Try the following REGEX, which should find all .exe files in the string (assuming no whitespace in file names):

REGEX = \s(?<App_Name>\w*(?=\.[exeEXE]{3}( |\z))\.[exeEXE]{3})

I tried this at regex101 and it works on your example data. You can find it here if you want to check what the regex syntax means: https://regex101.com/r/hO9iD8/2
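
As a rough offline check, the same pattern can be tried in Python. Two things here are assumptions: the `flat` sample is a shortened, hypothetical single-line Message value, and \z is written as \Z because that is the end-of-string anchor in most Python versions.

```python
import re

# Hypothetical flattened sample; "Foxit Reader.exe" has a space in its name.
flat = (
    "<ConfigAppmitGPO> "
    "7z.exe *\\7-Zip DEP SEHOP "
    "EXCEL.EXE *\\OFFICE1* DEP SEHOP "
    "Foxit Reader.exe *\\Foxit Reader DEP SEHOP "
    "</ConfigAppmitGPO>"
)

# Lookahead version of the pattern: (?P<name>...) and \Z are the Python
# spellings of (?<name>...) and \z.
pattern = re.compile(
    r"\s(?P<App_Name>\w*(?=\.[exeEXE]{3}( |\Z))\.[exeEXE]{3})"
)

# finditer plus group() is used because the pattern has two capture groups.
app_names = [m.group("App_Name") for m in pattern.finditer(flat)]
print(app_names)  # -> ['7z.exe', 'EXCEL.EXE', 'Reader.exe']
```

The last value shows the stated limitation: a file name containing whitespace is only captured from its final word onward.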

This is also a great regex resource if you get stuck: http://www.rexegg.com/regex-lookarounds.html

Hope this helps.

Re: Why is my REGEX and MV_ADD=true in transforms.conf not working as expected to extract fields from Windows event logs?

Contributor

Hi fairje,

Did you manage to get the extraction working okay? It would be good to know if the answer worked so it may be useful for other users.

Re: Why is my REGEX and MV_ADD=true in transforms.conf not working as expected to extract fields from Windows event logs?

Communicator

Sorry for the delay getting back. I am reloading my configuration now and will post back when I get more.

I'm confused though why this doesn't read the newline character... I might try what I did in another REGEX on the same logs as well, which looked like this:

REGEX = (?:\nEMET configuration for |\nEMET )(?<EMETEvent50Type>(?:\w+ status|\w+ Trust|\w+)) (?:is|mitigations)

Note that this REGEX does work on these exact same logs. From the log example I provided above, it would extract:

EMETEvent50Type = "Application"
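
A small Python sketch of that extraction, for comparison. It assumes the regex runs against raw text that still has its newlines; `raw` is a shortened, hypothetical copy of the sample event, and the named group uses Python's (?P<name>...) spelling.

```python
import re

# Hypothetical, shortened raw event with newlines preserved.
raw = (
    "Message=EMET settings were refreshed successfully.\n"
    "\n"
    "EMET configuration for Application mitigations (Registry) is:\n"
    "<ConfigAppmitREG>\n"
    "</ConfigAppmitREG>\n"
)

pattern = re.compile(
    r"(?:\nEMET configuration for |\nEMET )"
    r"(?P<EMETEvent50Type>(?:\w+ status|\w+ Trust|\w+)) (?:is|mitigations)"
)

# The literal \n in the pattern only matches if the newline is present.
match = pattern.search(raw)
event_type = match.group("EMETEvent50Type") if match else None
print(event_type)  # -> Application
```

Because this pattern contains a literal \n, it can only match when the newline is still present in whatever text the regex is applied to.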

It's strange that (?m) doesn't work here, when I use it in another transform on a different file. I think I have also used either the (?m) or the (?s) option on a different Windows event log before... ::confused::

Re: Why is my REGEX and MV_ADD=true in transforms.conf not working as expected to extract fields from Windows event logs?

Contributor

Hmmm... does this one use the Message field as SOURCE_KEY though, or the _raw data (the default)? I guess it comes back to how the Message field is auto-extracted by Splunk. If it's something like (?ms)Message=(?<Message>.+) then the dot also matches newlines, and Message may end up as a single-line field. Though the rex search command example in your question indicates newlines are in the Message field.

How does the Message field appear when you table it, i.e. ...search ... | table Message?

Maybe something like this would cover both options:

REGEX = (?:^|\s)(?<App_Name>\w*(?=\.[exeEXE]{3})\.[exeEXE]{3})(?: |\z)+

Also try it without (?m) at the start.
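
To see whether this pattern copes with both forms of the field, here is a minimal Python sketch. The two sample strings are hypothetical (one with newlines, one flattened to spaces), the `extract` helper is made up for illustration, and \z is written as \Z for Python.

```python
import re

# Same three hypothetical GPO lines, once multiline and once flattened.
multiline = (
    "7z.exe *\\7-Zip  DEP\n"
    "7zFM.exe *\\7-Zip  DEP\n"
    "EXCEL.EXE *\\OFFICE1*  DEP"
)
flat = multiline.replace("\n", " ")

pattern = re.compile(
    r"(?:^|\s)(?P<App_Name>\w*(?=\.[exeEXE]{3})\.[exeEXE]{3})(?: |\Z)+"
)

def extract(text):
    """Return every App_Name value found in text (illustrative helper)."""
    return [m.group("App_Name") for m in pattern.finditer(text)]

from_multiline = extract(multiline)
from_flat = extract(flat)
print(from_multiline)
print(from_flat)
```

In this sketch both calls return the same three names, since \s accepts either a space or a newline in front of the file name.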

Re: Why is my REGEX and MV_ADD=true in transforms.conf not working as expected to extract fields from Windows event logs?

Communicator

By the way, both ... | table _raw and ... | table Message return the same unformatted text with all newline characters stripped. So unfortunately that doesn't tell me anything about what is going on in the background...

Re: Why is my REGEX and MV_ADD=true in transforms.conf not working as expected to extract fields from Windows event logs?

Communicator

Hrmmm, doesn't seem to work as expected. I now only have one extraction which is "Reader.exe" from the events 😞

I'm going to try changing it up from a \s to a \n (for new line) and see if that works since it has worked elsewhere in other events.

Re: Why is my REGEX and MV_ADD=true in transforms.conf not working as expected to extract fields from Windows event logs?

Contributor

Is "Reader.exe" the first or last field? Maybe MV_ADD = true wasn't picked up correctly.

Re: Why is my REGEX and MV_ADD=true in transforms.conf not working as expected to extract fields from Windows event logs?

Communicator

Neither, it looks like it was picking up on this line:

Foxit Reader.exe *\Foxit Reader  DEP SEHOP NullPage HeapSpray EAF MandatoryASLR BottomUpASLR LoadLib MemProt Caller SimExecFlow StackPivot

My guess is that, by default, \s does not match whitespace that is a newline character, and since the regex you provided relies on \s without (?m) in front, it wasn't going to match on the newline. So the only thing that matches that regex is the one application that has a space in its name.
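
One way to test that guess outside Splunk is a quick Python check. This is only a sketch: Python's `re` stands in for Splunk's engine, the `lines` sample is hypothetical, and it assumes the separator between entries really is a plain newline character, which the thread has not confirmed.

```python
import re

# In Python's re, \s is a whitespace class that includes "\n",
# with or without the (?m) flag. PCRE documents \s the same way.
newline_is_whitespace = re.fullmatch(r"\s", "\n") is not None

# Hypothetical sample: a header line, then two entries split by newlines.
lines = (
    "<ConfigAppmitGPO>\n"
    "chrome.exe *\\Google\\Chrome\\Application  DEP\n"
    "Foxit Reader.exe *\\Foxit Reader  DEP"
)

# The earlier lookahead pattern, with Python spellings and no (?m).
pattern = re.compile(
    r"\s(?P<App_Name>\w*(?=\.[exeEXE]{3}( |\Z))\.[exeEXE]{3})"
)
found = [m.group("App_Name") for m in pattern.finditer(lines)]
print(newline_is_whitespace)
print(found)
```

In this sketch "chrome.exe" is found even though only a newline precedes it, so if Splunk behaves differently the cause may lie in what the separator characters actually are rather than in \s itself.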

Re: Why is my REGEX and MV_ADD=true in transforms.conf not working as expected to extract fields from Windows event logs?

Motivator

Hi fairje,

Try this regex:

(?i)(?<App_Name>\S+(?:\.exe))
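
A short Python sketch of what this pattern picks up, under the assumption that the field is a single flattened line. The `flat` sample is hypothetical, and the named group is written with Python's (?P<name>...) spelling.

```python
import re

# Hypothetical flattened sample with mixed-case extensions.
flat = (
    "7z.exe *\\7-Zip DEP "
    "EXCEL.EXE *\\OFFICE1* DEP "
    "Foxit Reader.exe *\\Foxit Reader DEP"
)

# (?i) makes ".exe" case-insensitive; \S+ takes the non-space run before it.
pattern = re.compile(r"(?i)(?P<App_Name>\S+(?:\.exe))")

app_names = [m.group("App_Name") for m in pattern.finditer(flat)]
print(app_names)  # -> ['7z.exe', 'EXCEL.EXE', 'Reader.exe']
```

Like the lookahead version, \S+ stops at whitespace, so a name such as "Foxit Reader.exe" comes back as "Reader.exe".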