Hi community,
The following mode=sed regex works as expected in search, but when I put it in system/local/props.conf on the indexers it fails to trim the events. This is how I tested it with | makeresults:
| makeresults
| eval _raw="<Event xmlns='http://schemas.microsoft.com/win/2004/08/events/event'><System><Provider Name='Microsoft-Windows-Security-Auditing' Guid='{54849625-5478-4994-a5ba-3e3bxxxxxx}'/><EventID>4627</EventID><Version>0</Version><Level>0</Level><Task>12554</Task><Opcode>0</Opcode><Keywords>0x8020000000000000</Keywords><TimeCreated SystemTime='2024-11-27T11:27:45.6695363Z'/><EventRecordID>2177113</EventRecordID><Correlation ActivityID='{01491b93-40a4-0002-6926-4901a440db01}'/><Execution ProcessID='1196' ThreadID='1312'/><Channel>Security</Channel><Computer>Computer1</Computer><Security/></System><EventData><Data Name='SubjectUserSid'>S-1-5-18</Data><Data Name='SubjectUserName'>CXXXXXX</Data><Data Name='SubjectDomainName'>CXXXXXXXX</Data><Data Name='SubjectLogonId'>0x3e7</Data><Data Name='TargetUserSid'>S-1-5-18</Data><Data Name='TargetUserName'>SYSTEM</Data><Data Name='TargetDomainName'>NT AUTHORITY</Data><Data Name='TargetLogonId'>0x3e7</Data><Data Name='LogonType'>5</Data><Data Name='EventIdx'>1</Data><Data Name='EventCountTotal'>1</Data><Data Name='GroupMembership'>
%{S-1-5-32-544}
%{S-1-1-0}
%{S-1-5-11}
%{S-1-16-16384}</Data></EventData></Event>"
| rex mode=sed "s/(?s).*<Event[^>]*>.*?<EventID>4627<\/EventID>.*?<TimeCreated SystemTime='([^']*)'.*?<Computer>([^<]*)<\/Computer>.*?<Data Name='SubjectUserName'>([^<]*)<\/Data>.*?<Data Name='SubjectDomainName'>([^<]*)<\/Data>.*?<Data Name='TargetUserName'>([^<]*)<\/Data>.*?<Data Name='TargetDomainName'>([^<]*)<\/Data>.*?<Data Name='LogonType'>([^<]*)<\/Data>.*?<\/Event>.*/EventID:4627 TimeCreated:\\1 Computer:\\2 SubjectUserName:\\3 SubjectDomainName:\\4 TargetUserName:\\5 TargetDomainName:\\6 LogonType:\\7/g"
----------------------------------
[XmlWinEventLog: Security]
SEDCMD-reduce_4627 = s/(?s).*<Event[^>]*>.*?<EventID>4627<\/EventID>.*?<TimeCreated SystemTime='([^']*)'.*?<Computer>([^<]*)<\/Computer>.*?<Data Name='SubjectUserName'>([^<]*)<\/Data>.*?<Data Name='SubjectDomainName'>([^<]*)<\/Data>.*?<Data Name='TargetUserName'>([^<]*)<\/Data>.*?<Data Name='TargetDomainName'>([^<]*)<\/Data>.*?<Data Name='LogonType'>([^<]*)<\/Data>.*?<\/Event>.*/EventID:4627 TimeCreated:\1 Computer:\2 SubjectUserName:\3 SubjectDomainName:\4 TargetUserName:\5 TargetDomainName:\6 LogonType:\7/g
Can anyone help me identify where the problem is, please?
Thank you.
Did you use renderXml=true in inputs.conf?
You need the Splunk TA for Windows installed on the indexers to view Windows events in XML format.
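For reference, a minimal sketch of what that input might look like on the Universal Forwarder. The stanza and the renderXml setting are standard inputs.conf items; the app path in the comment is only an assumed example.

```ini
# Assumed location: $SPLUNK_HOME/etc/apps/<your_inputs_app>/local/inputs.conf on the UF
[WinEventLog://Security]
disabled = 0
# Forward events as XML rather than the classic multi-line text rendering
renderXml = true
```

If this is not set, the events are not XML at all, and a regex written against <Event>...</Event> would never match at index time.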
In addition to what @PickleRick wrote, what problem are you trying to solve? Splunk is quite capable of parsing XML logs, so why are you trying to re-format them? Why not use a transform to extract the fields directly instead of the interim SEDCMD step?
I suppose it's another attempt at reducing the size of the logs while maintaining the events as such but cutting the unnecessary parts from them.
We don't know the whole picture, but from the partial info I can guess that, assuming the regex and the substitution pattern are OK, there are two obvious things which might be wrong.
1) You put your settings on the wrong Splunk component (i.e. you're trying to put them on the indexer when your data is going through a HF earlier) and/or
2) You're binding the SEDCMD to the wrong stanza. I see that you're using XmlWinEventLog: Security. This is a long-gone naming convention that hasn't been in use for several years. Now all Windows events are of sourcetype XmlWinEventLog, and the source field differentiates between the originating event logs.
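To illustrate point 2, here is a sketch of the two stanza forms that match the current naming. It assumes the events really do arrive with sourcetype XmlWinEventLog and source XmlWinEventLog:Security, which is worth confirming in search first. The SEDCMD value is a placeholder for the expression from the original post.

```ini
# props.conf on the first full Splunk instance that parses the data (indexer or HF)

# Option A: bind to the sourcetype (applies to every XML Windows event log)
[XmlWinEventLog]
SEDCMD-reduce_4627 = <s/.../.../g expression from the original post>

# Option B: bind to the source, so only the Security log is affected
[source::XmlWinEventLog:Security]
SEDCMD-reduce_4627 = <s/.../.../g expression from the original post>
```

Only one of the two is needed. Option B is the narrower choice, since the regex never has to run against events from other channels.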
As a side note - it's a good practice to avoid writing to system/local - use apps to group your settings so that you can later easily manage them, overwrite, inherit and so on.
Hi @PickleRick,
Thank you for the clarification, and yes, you are correct: I am addressing the same issue.
Here's the updated response that reflects the correct sequence of events:
1. Component Placement
The Universal Forwarder (UF) is responsible only for collecting and forwarding data and does not perform parsing or transformations. SEDCMD settings in props.conf must therefore be applied on the indexers, where parsing occurs. Since there are no Heavy Forwarders in the architecture, the indexers were the correct location for these configurations.
2. Stanza Naming and Testing
I confirm that the XmlWinEventLog: Security stanza was the correct choice for this configuration. Each SEDCMD was tested separately in this stanza:
The first SEDCMD partially worked, applying some transformations but not entirely meeting the expected output.
The second SEDCMD, tested independently, caused Event ID 4627 to stop being indexed altogether.
These results confirm that XmlWinEventLog: Security is the appropriate naming convention, as the configuration was correctly recognised and applied. Additionally, I tested other stanzas, including WinEventLog: Security, and none worked as intended, further validating that XmlWinEventLog: Security is the correct stanza to use.
3. Configuration Location
For quick validation during testing, the configurations were initially placed in system/local. For production deployment, they have been moved into dedicated apps, ensuring better organisation, ease of updates, and compliance with Splunk’s best practices.
4. Regex Validation
Both SEDCMD regex directives were validated using | makeresults with the raw event data. The partial success of the first and the indexing failure of the second suggest that either the regex logic itself or environmental factors need adjustment for consistent application in production.
I hope this clears up any concerns and confirms the steps taken during testing and deployment. Let me know if there’s anything else you’d like me to elaborate on to help resolve the issue.
Best regards,
Dan
1. 👍
2. Honestly, that's surprising. Normally the events are ingested as either WinEventLog or XmlWinEventLog. See https://docs.splunk.com/Documentation/AddOns/released/Windows/SourcetypesandCIMdatamodelinfo
As far as I know, the naming with the channel name in the sourcetype was used in old versions of TA_windows. It has been deprecated for ages now, and TA_windows rewrites it to the normalized version.
Anyway, there is one more thing worth taking into consideration: you're rewriting your event data into a completely different format, so the normal TA_windows extractions won't work. You might recast the events into another sourcetype, but then you'd have to adjust all the CIM mappings and such to make this sourcetype work properly with things like ES.
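If you did go down the recasting route, an index-time sourcetype rewrite could look roughly like the sketch below. The transform name and the target sourcetype winevent_4627_trimmed are hypothetical, and the REGEX is written to match both the original XML and the rewritten text, since I'm assuming (not asserting) that SEDCMD may already have run by the time the transform sees the event.

```ini
# props.conf (same instance that parses the data)
[source::XmlWinEventLog:Security]
TRANSFORMS-recast_4627 = recast_4627_sourcetype

# transforms.conf
[recast_4627_sourcetype]
# Match either <EventID>4627 (original XML) or EventID:4627 (after the SEDCMD rewrite)
REGEX = (?:<EventID>|EventID:)4627
# Hypothetical sourcetype name; pick whatever fits your naming scheme
FORMAT = sourcetype::winevent_4627_trimmed
DEST_KEY = MetaData:Sourcetype
```

The new sourcetype would then need its own search-time extractions, tags and eventtypes before CIM-based content could use it.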
Honestly, I'd go for preprocessing this with some external tool before ingestion and try to retain the original format while cutting "unnecessary" data.