
Sysmon Add-on for Linux unable to extract Data attribute from XML

att35
Builder

Hi,

We are testing the Splunk Add-on for Sysmon for Linux to ingest Sysmon data from Linux systems. Data ingestion and the majority of the extractions are working fine, except for the Data elements.

 

<Data Name="FieldName">

 

It appears that Splunk completely skips over this.

We have Sysmon for Windows working as well, and the same attribute gets extracted just fine. The data format for Sysmon on Linux vs. Windows is identical, and so are the transform stanzas in the TAs. The only difference I could see is that the field name is enclosed in single quotes for Windows and in double quotes for Linux. Could this be causing the regex in the TA to not work for Data? Including some examples here.

Sample Data from Linux Sysmon

 

<Event><System><Provider Name="Linux-Sysmon" Guid="{ff032593-a8d3-4f13-b0d6-01fc615a0f97}"/><EventID>3</EventID><Version>5</Version><Level>4</Level><Task>3</Task><Opcode>0</Opcode><Keywords>0x8000000000000000</Keywords><TimeCreated SystemTime="2023-11-13T13:34:45.693615000Z"/><EventRecordID>140108</EventRecordID><Correlation/><Execution ProcessID="24493" ThreadID="24493"/><Channel>Linux-Sysmon/Operational</Channel><Computer>computername</Computer><Security UserId="0"/></System><EventData><Data Name="RuleName">-</Data><Data Name="UtcTime">2023-11-13 13:34:45.697</Data><Data Name="ProcessGuid">{ba131d2e-2a52-6550-285f-207366550000}</Data><Data Name="ProcessId">64284</Data><Data Name="Image">/opt/splunkforwarder/bin/splunkd</Data><Data Name="User">root</Data><Data Name="Protocol">tcp</Data><Data Name="Initiated">true</Data><Data Name="SourceIsIpv6">false</Data><Data Name="SourceIp">x.x.x.x</Data><Data Name="SourceHostname">-</Data><Data Name="SourcePort">60164</Data><Data Name="SourcePortName">-</Data><Data Name="DestinationIsIpv6">false</Data><Data Name="DestinationIp">x.x.x.x</Data><Data Name="DestinationHostname">-</Data><Data Name="DestinationPort">8089</Data><Data Name="DestinationPortName">-</Data></EventData></Event>

 

Sample data from Windows Sysmon

 

<Event xmlns='http://schemas.microsoft.com/win/2004/08/events/event'><System><Provider Name='Microsoft-Windows-Sysmon' Guid='{5770385f-c22a-43e0-bf4c-06f5698ffbd9}'/><EventID>3</EventID><Version>5</Version><Level>4</Level><Task>3</Task><Opcode>0</Opcode><Keywords>0x8000000000000000</Keywords><TimeCreated SystemTime='2023-11-13T13:26:31.064124600Z'/><EventRecordID>1571173614</EventRecordID><Correlation/><Execution ProcessID='2988' ThreadID='5720'/><Channel>Microsoft-Windows-Sysmon/Operational</Channel><Computer>computername</Computer><Security UserID='S-1-5-18'/></System><EventData><Data Name='RuleName'>-</Data><Data Name='UtcTime'>2023-11-13 13:26:13.591</Data><Data Name='ProcessGuid'>{f4558f15-1db6-654f-8400-000000007a00}</Data><Data Name='ProcessId'>4320</Data><Data Name='Image'>C:\..\..\image.exe</Data><Data Name='User'>NT AUTHORITY\SYSTEM</Data><Data Name='Protocol'>tcp</Data><Data Name='Initiated'>true</Data><Data Name='SourceIsIpv6'>false</Data><Data Name='SourceIp'>127.0.0.1</Data><Data Name='SourceHostname'>computername</Data><Data Name='SourcePort'>64049</Data><Data Name='SourcePortName'>-</Data><Data Name='DestinationIsIpv6'>false</Data><Data Name='DestinationIp'>127.0.0.1</Data><Data Name='DestinationHostname'>computername</Data><Data Name='DestinationPort'>4932</Data><Data Name='DestinationPortName'>-</Data></EventData></Event>

 

Transforms on both sides are also identical, except for the single vs. double quotes.

 

Linux

[sysmon-data]
REGEX = <Data Name="(.*?)">(.*?)</Data>
FORMAT = $1::$2

Windows

[sysmon-data]
REGEX = <Data Name='(.*?)'>(.*?)</Data>
FORMAT = $1::$2

 

Any clues on what could be causing Splunk to not extract the Data attribute for Linux? Transforms for other elements such as Computer and Keywords are working fine; it just skips the Data part completely.

Thanks,


ta1
Explorer

Hey, not sure if you managed to figure this out?

 

I'm having the same issue and I can't get it to work despite trying a few different approaches.

 

Thanks. 


PickleRick
SplunkTrust

The regex seems ok. See https://regex101.com/r/4NN0K1/1
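For anyone who wants to reproduce that check locally, here is a minimal sketch that runs both REGEX values from the post against a shortened copy of the Linux sample. It uses Python's `re` module rather than the PCRE engine Splunk uses, so it only confirms the pattern itself, not how the add-on applies it. The `extract` helper and the variable names are made up for illustration.

```python
import re

# The two REGEX values quoted in the original post.
LINUX_REGEX = re.compile(r'<Data Name="(.*?)">(.*?)</Data>')
WINDOWS_REGEX = re.compile(r"<Data Name='(.*?)'>(.*?)</Data>")

# Shortened excerpt of the Linux sample event (double-quoted attributes).
linux_event = (
    '<EventData><Data Name="RuleName">-</Data>'
    '<Data Name="Image">/opt/splunkforwarder/bin/splunkd</Data>'
    '<Data Name="DestinationPort">8089</Data></EventData>'
)


def extract(regex, event):
    """Hypothetical helper: mimic FORMAT = $1::$2 by building a name -> value dict."""
    return dict(regex.findall(event))


linux_fields = extract(LINUX_REGEX, linux_event)
cross_fields = extract(WINDOWS_REGEX, linux_event)

print(linux_fields)   # fields found by the double-quote pattern
print(cross_fields)   # fields found by the single-quote pattern on Linux data
```

The double-quote pattern picks up all three fields, while the single-quote pattern finds nothing in the Linux event. So if the Windows version of the stanza were the one actually in effect, the Data fields would be skipped in the way described.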

The question is, since I'm not familiar with the inner workings of the add-on: are those transforms really both defined under the same name? If so, they won't both work at the same time; one will override the other in the effective config according to the precedence rules.
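To illustrate what "one overrides the other" means, here is a deliberately simplified sketch of two files that both define `[sysmon-data]`. It does not implement Splunk's real precedence rules: the layer order is simply assumed as input, and the app paths are hypothetical placeholders.

```python
def effective_stanza(layers):
    """Merge same-named stanzas; later (higher-precedence) layers win per setting.

    `layers` is a list of (source, settings) pairs ordered from lowest to
    highest precedence. The ordering is an assumption supplied by the caller,
    not something this sketch derives from Splunk's rules.
    """
    merged = {}
    for _source, settings in layers:
        merged.update(settings)
    return merged


# Two hypothetical apps that both ship a [sysmon-data] stanza.
layers = [
    ("hypothetical_app_a/default/transforms.conf",
     {"REGEX": "<Data Name='(.*?)'>(.*?)</Data>", "FORMAT": "$1::$2"}),
    ("hypothetical_app_b/default/transforms.conf",
     {"REGEX": '<Data Name="(.*?)">(.*?)</Data>', "FORMAT": "$1::$2"}),
]

effective = effective_stanza(layers)
print(effective["REGEX"])  # only one REGEX survives for the stanza name
```

Only one REGEX survives under the shared stanza name, so events using the other quoting style would not match. On a real instance, something like `splunk btool transforms list sysmon-data --debug` should show which file the effective settings come from.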
