I have the following stanzas deployed to lightweight forwarders running Windows:
[WinEventLog:Security]
TRANSFORMS-clean = windows-evtlog-sec-clean

[windows-evtlog-sec-clean]
REGEX = ^(?ims)(.*[\r\n]+)[\r\n]+(This event |Note: ).+
FORMAT = $1
DEST_KEY = _raw
to get rid of all of that lovely "This event is logged when..." text that the Microsoft APIs like to throw on every event. Works fairly well on lightweight forwarders.
When I'm getting data from the Splunk Universal Forwarders, though, these stanzas are obviously ignored, so I added the same stanzas to my indexer and expected them to get picked up there. Not so much. Is it not possible to rewrite the _raw data collected and sent from a universal forwarder?
The Universal Forwarder has no Python and does not parse events.
All event transformation has to occur on the indexer (or on a heavy forwarder, if you have one).
Please move your props and transforms to the indexer, and all should be fine.
Sorry, I missed this part.
So the issue may be the regex failing. Did you test it in Splunk search, on the sourcetype, with the rex command?
Here is another possibility:
use sed in props instead of a regex in transforms
SEDCMD-cleanwindows = s/[\r\n]+(This event |Note: ).+//g
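If you want to sanity-check that substitution outside Splunk first, here is a minimal Python sketch. It is only an approximation: Splunk uses PCRE, not Python's re module, and the sample event, the strip_boilerplate name and the dotall switch are made up for illustration. It shows that without a "dot matches newline" flag the .+ may stop at the end of the first boilerplate line, so a multi-line explanation would be only partly removed.

```python
import re

# Python approximation of the SEDCMD substitution (Splunk itself uses PCRE).
SED_PATTERN = r"[\r\n]+(This event |Note: ).+"
PLAIN = re.compile(SED_PATTERN)
DOTALL = re.compile(SED_PATTERN, re.DOTALL)  # like prefixing (?s)

# Hypothetical sample event, not taken from a real Windows log.
SAMPLE = (
    "An account was successfully logged on.\n"
    "Account Name: alice\n"
    "\n"
    "This event is generated when a logon session is created.\n"
    "It is generated on the computer that was accessed."
)

def strip_boilerplate(raw, dotall=True):
    """Remove the trailing explanation text from one event."""
    pattern = DOTALL if dotall else PLAIN
    return pattern.sub("", raw)

if __name__ == "__main__":
    # Without DOTALL only the first boilerplate line is removed.
    print(repr(strip_boilerplate(SAMPLE, dotall=False)))
    # With DOTALL everything from the boilerplate onward is removed.
    print(repr(strip_boilerplate(SAMPLE, dotall=True)))
```

If your events show the same behavior in Splunk, adding (?s) to the front of the sed pattern is one thing to try, but test it on real data first.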
What you are trying to do is valid, but it also did not work in my sandbox. Starting from square one, on the understanding that you just want to remove the comment line from the event, I tested your regex against generic data in a few regex tools. As posted, it did not capture the desired data in various sample events.
The following works:
[windows-evtlog-sec-clean]
REGEX = ^(?ims)(.*[\r\n]+)?(?:(?:This event|Note\:).*$)
FORMAT = $1
DEST_KEY = _raw
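For anyone who wants to try that transform regex on sample data before touching the indexer, here is a rough Python sketch of what REGEX plus FORMAT = $1 would do to _raw. Assumptions: Python's re stands in for Splunk's PCRE, the (?ims) flags are passed as arguments (recent Python versions reject inline global flags that are not at the very start of the pattern), and rewrite_raw and the sample event are invented for the example.

```python
import re

# Same pattern as the transform, with (?ims) expressed as flag arguments.
CLEAN_RE = re.compile(
    r"^(.*[\r\n]+)?(?:(?:This event|Note\:).*$)",
    re.IGNORECASE | re.MULTILINE | re.DOTALL,
)

def rewrite_raw(raw):
    """Mimic FORMAT = $1 with DEST_KEY = _raw for a single event."""
    match = CLEAN_RE.search(raw)
    if match is None:
        # No boilerplate found: the event is left untouched.
        return raw
    # Group 1 is everything before the boilerplate (empty if nothing precedes it).
    return match.group(1) or ""

# Hypothetical sample event, not taken from a real Windows log.
EVENT = (
    "An account was successfully logged on.\n"
    "Account Name: alice\n"
    "\n"
    "This event is generated when a logon session is created.\n"
    "It is generated on the computer that was accessed."
)

if __name__ == "__main__":
    print(repr(rewrite_raw(EVENT)))
```

Note that the captured group keeps the line breaks that precede the boilerplate, so the rewritten event ends with trailing newlines in this sketch.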
In retrospect, this is an expensive operation. You are asking the Splunk indexer to interpret each event from the Windows Security event log and rewrite it. Test it, and be sure to examine how the regex and the rewrite affect your indexer's performance. If you find that this hurts performance, it may be better off-loaded to a Light Forwarder (as opposed to a UF), which carries its own set of trade-offs.
Thanks... I've actually been doing this for years on each Windows server running a lightweight forwarder and already determined that the performance impact is negligible. I'm hoping to switch over to the universal forwarder across the board (as much as possible), and this is one of the few sticking points for me. The indexers are over-scaled for our deployment, so I'm not anticipating performance issues from giving them this job, if I can get it to work.
The REGEX works fine, it's the markdown in Splunk Answers that had a problem. I updated my original post with the working REGEX...
Turns out the issue I had was with a bad line break in my props.conf above the pasted WinEventLog:Security stanza. Splunk stopped parsing the conf file after that, apparently. When I corrected that, my original solution worked.
Wish I could accept both answers since both provided (more or less) accurate info... I accepted yannK's answer, though, since SEDCMD seems more deliberately designed for modifying _raw pre-indexing. I also added the appropriate \ in the search pattern for carriage return and newline ( [\r\n]+ ).