I am trying to parse the EMET (Enhanced Mitigation Experience Toolkit) logs (note: when I get this whole thing working, I plan to share it far and wide so MS will stop trying to sell you on their crappy products to monitor these same logs). In any case, we currently have the GPO/Registry configurations logged under EventCode 50, and they look something like this:
01/12/2016 05:00:05 PM
LogName=Application
SourceName=EMET
EventCode=50
EventType=4
Type=Information
ComputerName=host001.com
TaskCategory=%1
OpCode=Info
RecordNumber=267548
Keywords=Classic
Message=EMET settings were refreshed successfully.
EMET configuration for Application mitigations (Registry) is:
<ConfigAppmitREG>
</ConfigAppmitREG>
EMET configuration for Application mitigations (GPO) is:
<ConfigAppmitGPO>
7z.exe *\7-Zip DEP SEHOP NullPage HeapSpray MandatoryASLR BottomUpASLR LoadLib MemProt Caller SimExecFlow StackPivot
7zFM.exe *\7-Zip DEP SEHOP NullPage HeapSpray MandatoryASLR BottomUpASLR LoadLib MemProt Caller SimExecFlow StackPivot
7zG.exe *\7-Zip DEP SEHOP NullPage HeapSpray MandatoryASLR BottomUpASLR LoadLib MemProt Caller SimExecFlow StackPivot
Acrobat.exe *\Adobe\Acrobat*\Acrobat DEP SEHOP NullPage HeapSpray EAF MandatoryASLR BottomUpASLR LoadLib Caller SimExecFlow StackPivot
AcroRd32.exe *\Adobe\Reader*\Reader DEP SEHOP NullPage HeapSpray EAF MandatoryASLR BottomUpASLR LoadLib Caller SimExecFlow StackPivot
chrome.exe *\Google\Chrome\Application DEP NullPage HeapSpray EAF MandatoryASLR BottomUpASLR LoadLib MemProt Caller SimExecFlow StackPivot
communicator.exe *\Microsoft Lync DEP SEHOP NullPage HeapSpray EAF MandatoryASLR BottomUpASLR LoadLib MemProt Caller SimExecFlow StackPivot
EXCEL.EXE *\OFFICE1* DEP SEHOP NullPage HeapSpray EAF MandatoryASLR BottomUpASLR LoadLib MemProt Caller SimExecFlow StackPivot
firefox.exe *\Mozilla Firefox DEP SEHOP NullPage HeapSpray EAF MandatoryASLR BottomUpASLR LoadLib MemProt Caller SimExecFlow StackPivot
...snip more apps...
</ConfigAppmitGPO>
There are a couple other events generated from EMET on Event 50, but this is the important one because it tells you how you are getting certain settings (registry keys or GPO) and it also tells you what your hosts are configured as (in case you have different configs in your environment for different reasons).
Now here is the nightmare: how to write the REGEX statement in transforms to parse all this information out. To start with, I was toying around with the rex search command and had success pulling out all the application names like so:
| rex max_match=0 field=Message "(?m)^(?<App_Name>.*\.[exeEXE]{3})"
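For readers without a Splunk instance handy, the behaviour of that rex call can be approximated in Python. This is only a sketch under assumptions: Python's re module stands in for Splunk's PCRE engine, named groups are written (?P<name>...) instead of (?<name>...), the sample Message is abbreviated from the event above, and the "Foxit Reader.exe" line is borrowed from a later comment in this thread. re.findall plays the role of max_match=0.

```python
import re

# Abbreviated stand-in for the Message field, with real newlines (assumption).
MESSAGE = "\n".join([
    "EMET settings were refreshed successfully.",
    "EMET configuration for Application mitigations (GPO) is:",
    "<ConfigAppmitGPO>",
    r"7z.exe *\7-Zip DEP SEHOP NullPage",
    r"Foxit Reader.exe *\Foxit Reader DEP SEHOP NullPage",
    r"EXCEL.EXE *\OFFICE1* DEP SEHOP NullPage",
    "</ConfigAppmitGPO>",
])

# Same pattern as the rex call, rewritten in Python's named-group syntax.
APP_NAME = re.compile(r"(?m)^(?P<App_Name>.*\.[exeEXE]{3})")

# findall returns every match, similar in spirit to max_match=0.
app_names = APP_NAME.findall(MESSAGE)
print(app_names)
```

With real newlines in the text, (?m) lets ^ anchor at every line start, so each application line contributes one value, including names that contain a space.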
I am relying on the fact that the Message field is already extracted by Splunk, thanks to the Windows TA and its general "=" extractions. Mostly that gives me everything after Message= in the logs. The regex above actually works to pull out all the app names in a single event (especially with max_match at 0, i.e. unlimited). When I put the same regex in transforms.conf, it falls apart and doesn't work, with no apparent reason why:
[emet_event50_app_from_Message]
SOURCE_KEY = Message
REGEX = (?m)^(?<App_Name>.*\.[exeEXE]{3})
MV_ADD = true
Essentially, MV_ADD should make it pull all the matches, not just the first one. Instead, the result I get is a regrab of the entire Message data, e.g.:
EMET settings were refreshed successfully. EMET configuration for Application mitigations (Registry) is: <ConfigAppmitREG> </ConfigAppmitREG> EMET configuration for Application mitigations (GPO) is: <ConfigAppmitGPO> 7z.exe *\7-Zip DEP SEHOP NullPage HeapSpray MandatoryASLR BottomUpASLR LoadLib MemProt Caller SimExecFlow StackPivot 7zFM.exe *\7-Zip DEP SEHOP NullPage HeapSpray MandatoryASLR BottomUpASLR LoadLib MemProt Caller... and so on
I have never really tried working with a multiline event in Splunk from the transforms file before, so I am not sure what I am missing here. And reading other Splunk Answers seems to indicate that the above should be right, but it just isn't working.
Thanks for the assist!
So I have worked around the issue with the following:
[emet_event50_app_from_Message]
SOURCE_KEY = Message
REGEX = \n(?<App_Name>.*\.[exeEXE]{3})\s\S
MV_ADD = true
Clearly the newline character is there, because this works for the logs, but it doesn't want to accept the (?m) option at the front, which would let you switch to using the caret "^" character. This is frustrating because in other logs I have used the (?m) option.
As has been suggested, it may have something to do with the way Splunk is extracting the "Message" field. I haven't tried an extraction in the transforms using _raw; maybe that would also be a solution.
A note about the above regex: I have to use the * quantifier (.*) for this to work correctly, since some application names contain whitespace along with words. As long as you anchor to the newline and stop when it finds "exe" or "EXE", that should be sufficient for grabbing this data from EMET logs.
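The workaround pattern can be tried outside Splunk with a short Python sketch. The assumptions: Python's re module stands in for Splunk's regex engine, the named group uses Python's (?P<name>...) spelling, and the sample Message is an abbreviated, hypothetical version of the event in the question.

```python
import re

# Abbreviated stand-in for the Message field, with real newlines (assumption).
MESSAGE = "\n".join([
    "EMET settings were refreshed successfully.",
    "EMET configuration for Application mitigations (GPO) is:",
    "<ConfigAppmitGPO>",
    r"7z.exe *\7-Zip DEP SEHOP NullPage",
    r"Foxit Reader.exe *\Foxit Reader DEP SEHOP NullPage",
    r"EXCEL.EXE *\OFFICE1* DEP SEHOP NullPage",
    "</ConfigAppmitGPO>",
])

# Anchor on a literal newline instead of (?m)^, as in the workaround stanza.
WORKAROUND = re.compile(r"\n(?P<App_Name>.*\.[exeEXE]{3})\s\S")

# Collecting every match mimics what MV_ADD = true is meant to do.
app_names = [m.group("App_Name") for m in WORKAROUND.finditer(MESSAGE)]
print(app_names)
```

One design note: [exeEXE]{3} is a character class repeated three times, so it would also accept suffixes such as ".xxe". A case-insensitive literal such as (?i:\.exe) is stricter, if that ever matters for your data.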
Thank you gcato for the assistance on getting to the bottom of this. Your responses were appreciated!
Good result fairje. It is strange and I wonder if there is a bug here.
I found an old comment by "itinney" here: https://answers.splunk.com/answers/38753/regex-for-multiline-events.html
He indicates that using (?m) seems to behave like using (?sm), i.e. (?s) gets turned on if (?m) is used. Note, I've not proved this, but it would be strange behaviour, as it defeats the purpose of using (?m), which is to cause ^ and $ to match the beginning/end of each line (not only the beginning/end of the string). Something to watch out for anyway.
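To see what that would look like in practice, here is a small Python comparison of (?m) and (?sm) on the pattern from the question. It is a sketch under assumptions: Python's re module is a stand-in for Splunk's engine, the sample text is abbreviated, and nothing here proves that Splunk actually enables (?s). It only shows that a dot-matches-newline mode would produce the same "whole message" symptom.

```python
import re

# Abbreviated stand-in for the Message field, with real newlines (assumption).
MESSAGE = "\n".join([
    "EMET settings were refreshed successfully.",
    "EMET configuration for Application mitigations (GPO) is:",
    "<ConfigAppmitGPO>",
    r"7z.exe *\7-Zip DEP SEHOP NullPage",
    r"EXCEL.EXE *\OFFICE1* DEP SEHOP NullPage",
    "</ConfigAppmitGPO>",
])

# (?m) only: "." stops at newlines, so each line is matched on its own.
multiline_only = re.findall(r"(?m)^(.*\.[exeEXE]{3})", MESSAGE)

# (?sm): "." also matches newlines, so the greedy .* runs to the last ".EXE".
multiline_dotall = re.findall(r"(?sm)^(.*\.[exeEXE]{3})", MESSAGE)

print(multiline_only)
print(len(multiline_dotall), repr(multiline_dotall[0][:30]))
```

In the (?sm) case there is a single value that starts at "EMET settings..." and ends at the last executable name, which resembles the output shown in the question.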
Hi fairje,
Try this regex,
(?i)(?<App_Name>\S+(?:\.exe))
Hi fairje,
It would appear that the newlines in your Message field are no longer there (i.e. it is not multiline anymore, but one long string), so your regex no longer works; or at least its logic no longer works. It matches from the beginning of the Message string to the last .exe it finds, and that's what you see returned.
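That explanation can be illustrated with a short Python sketch. The assumptions: Python's re module stands in for Splunk's engine, the sample text is abbreviated, and replacing newlines with spaces is only a guess at what a flattened Message field would look like.

```python
import re

# Abbreviated sample with real newlines (assumption).
MULTILINE = "\n".join([
    "EMET settings were refreshed successfully.",
    "<ConfigAppmitGPO>",
    r"7z.exe *\7-Zip DEP SEHOP NullPage",
    r"EXCEL.EXE *\OFFICE1* DEP SEHOP NullPage",
    "</ConfigAppmitGPO>",
])

# Hypothetical flattened field: every newline replaced by a space.
FLAT = MULTILINE.replace("\n", " ")

PATTERN = re.compile(r"(?m)^(?P<App_Name>.*\.[exeEXE]{3})")

# With one long line, ^ matches only at position 0 and .* is greedy,
# so the single match runs from the start to the last ".EXE".
flat_matches = PATTERN.findall(FLAT)
print(flat_matches)
```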
To fix this you need to use a different REGEX. This is a perfect place to use regex's lookahead (?=...) syntax. Try using the following REGEX, which should find all .exe files in the string (assuming no whitespace in file names).
REGEX = \s(?<App_Name>\w*(?=\.[exeEXE]{3}( |\z))\.[exeEXE]{3})
I tried this at regex101 and it works on your example data. You can find it here if you want to check what the regex syntax means: https://regex101.com/r/hO9iD8/2
This is also a great regex resource if you get stuck: http://www.rexegg.com/regex-lookarounds.html
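A quick way to try the lookahead pattern locally is a Python sketch like the one below. Assumptions: Python's re module stands in for PCRE, \z is written as \Z (Python's end-of-string anchor), the named group uses (?P<name>...), and the flattened sample string is hypothetical. The "Foxit Reader.exe" entry is included to show what the "no whitespace in file names" caveat means in practice.

```python
import re

# Hypothetical flattened Message: one long string, no newlines.
FLAT = (
    "EMET settings were refreshed successfully. "
    "EMET configuration for Application mitigations (GPO) is: <ConfigAppmitGPO> "
    r"7z.exe *\7-Zip DEP SEHOP NullPage "
    r"Foxit Reader.exe *\Foxit Reader DEP SEHOP NullPage "
    r"EXCEL.EXE *\OFFICE1* DEP SEHOP NullPage "
    "</ConfigAppmitGPO>"
)

# Lookahead pattern from the answer, adapted to Python syntax.
LOOKAHEAD = re.compile(r"\s(?P<App_Name>\w*(?=\.[exeEXE]{3}( |\Z))\.[exeEXE]{3})")

# finditer + group() is used because the pattern has a second capture group.
app_names = [m.group("App_Name") for m in LOOKAHEAD.finditer(FLAT)]
print(app_names)
```

Because \w* cannot cross a space, a name such as "Foxit Reader.exe" is captured only as "Reader.exe".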
Hope this helps.
Is "Reader.exe" the first or last field? Maybe MV_ADD = true wasn't picked up correctly.
Neither, it looks like it was picking up on this line:
Foxit Reader.exe *\Foxit Reader DEP SEHOP NullPage HeapSpray EAF MandatoryASLR BottomUpASLR LoadLib MemProt Caller SimExecFlow StackPivot
My guess is that, by default, \s does not match the newline character, and since the regex you provided relies on \s without (?m) in front, it wasn't going to match at the newlines. So the only thing that matches that regex is the one application that has a space in its name.
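As a cross-check on that guess, here is what a standard regex engine does. Python's re module is used only as a stand-in (an assumption) for Splunk's PCRE engine. In Python, \s does match a newline, and (?m) only changes where ^ and $ anchor. The sketch does not explain why only "Reader.exe" was extracted in Splunk; it just suggests that \s without (?m) may not be the whole story.

```python
import re

# \s is a whitespace class; in Python's re it includes "\n".
newline_is_whitespace = re.fullmatch(r"\s", "\n") is not None

# (?m) affects the anchors ^ and $, not the \s class.
starts_without_m = re.findall(r"^\w+", "one\ntwo")
starts_with_m = re.findall(r"(?m)^\w+", "one\ntwo")

print(newline_is_whitespace, starts_without_m, starts_with_m)
```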
Hrmmm, doesn't seem to work as expected. I now only have one extraction which is "Reader.exe" from the events 😞
I'm going to try changing it up from a \s to a \n (for new line) and see if that works since it has worked elsewhere in other events.
Hi fairje,
Did you manage to get the extraction working okay? It would be good to know if the answer worked so it may be useful for other users.
Sorry for the delay getting back. I am reloading my configuration now and will post back when I get more.
I'm confused though why this doesn't read the newline character... I might try what I did in another REGEX on the same logs as well, which looked like this:
REGEX = (?:\nEMET configuration for |\nEMET )(?<EMETEvent50Type>(?:\w+ status|\w+ Trust|\w+)) (?:is|mitigations)
Note that this REGEX does work on these exact same logs. For the log example I provided above, it would extract:
EMETEvent50Type = "Application"
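The same extraction can be reproduced with a small Python sketch. Assumptions: Python's re module stands in for Splunk's engine, the named group uses (?P<name>...), and the sample Message is an abbreviated copy of the event in the question with real newlines.

```python
import re

# Abbreviated stand-in for the Message field, with real newlines (assumption).
MESSAGE = "\n".join([
    "EMET settings were refreshed successfully.",
    "EMET configuration for Application mitigations (Registry) is:",
    "<ConfigAppmitREG>",
    "</ConfigAppmitREG>",
    "EMET configuration for Application mitigations (GPO) is:",
])

# Pattern from the comment above; it anchors on a literal \n, not on (?m)^.
EVENT_TYPE = re.compile(
    r"(?:\nEMET configuration for |\nEMET )"
    r"(?P<EMETEvent50Type>(?:\w+ status|\w+ Trust|\w+)) (?:is|mitigations)"
)

match = EVENT_TYPE.search(MESSAGE)
event_type = match.group("EMETEvent50Type") if match else None
print(event_type)
```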
It's strange that (?m) doesn't work here, when I use it in another transforms stanza on a different file. And I think I have used either the (?m) or the (?s) option on a different Windows event log before... ::confused::
Hmmm... does this one use the Message field as SOURCE_KEY, though, or _raw data (the default)? I guess it comes back to how the Message field is auto-extracted by Splunk. If it's something like (?ms)Message=(?<Message>.+) then newlines become dots and Message is a single-line field. Though the rex search command example in your question indicates that newlines are in the Message field. How does the Message field appear when you table it, i.e. ...search ... | table Message
Maybe something like this would cover both options:
REGEX = (?:^|\s)(?<App_Name>\w*(?=\.[exeEXE]{3})\.[exeEXE]{3})(?: |\z)+
Try it without (?m) at the start also
By the way, both ... | table _raw and ... | table Message return the same unformatted text, with any newline characters stripped away. So unfortunately that doesn't tell me anything about what is going on in the background...