Getting Data In

Extract and Transform Custom Event

jagadeeshm
Contributor

I have events with the following format -

[Thread-2505_GOOGLE_INT_20170424155901301f9e61-1493049600619-NSRLM_2_1_RTL_39088504_2_R_PCLN,PCLN] 2017-04-24 12:00:02 : T:0.047 secs S:XXXX-SSSS-SSSSSS A:Availability M:INT_20170424155901301f9e61-1493049600619-NSRLM_2_1_RTL_39088504_2_R_1 CMD: [<?xml version="1.0" ?><AvailabilityRequestV2 xmlns="http://xml.google.com" siteid="1470249" apikey="SFGSDGSDFSDFGSFG" async="false" waittime="5"><Type>4</Type><Id>460573</Id><Radius>0</Radius><Latitude>0.0</Latitude><Longitude>0.0</Longitude><CheckIn>2017-10-01</CheckIn><CheckOut>2017-10-17</CheckOut><Rooms>1</Rooms><Adults>2</Adults><Children>0</Children><Language>en-us</Language><Currency>000</Currency></AvailabilityRequestV2>]

Notice couple of things - event starts with a thread name in square brackets, followed by date/time, followed by an unwanted ":", followed by several key/value pairs separated by a ":", and finally at the end you have content in XML inside square brackets.

I want to extract/transform this into the following so it is easy to search and run spath to extract fields from xml -

 Thread = Thread-2505_GOOGLE_INT_20170424155901301f9e61-1493049600619-NSRLM_2_1_RTL_39088504_2_R_PCLN,PCLN
DateTime = 2017-04-24 12:00:02
ResponseTime = 0.047 secs 
S = XXXX-SSSS-SSSSSS 
A = Availability 
M = INT_20170424155901301f9e61-1493049600619-NSRLM_2_1_RTL_39088504_2_R_1 
Action = CMD
XML = <?xml version="1.0" ?><AvailabilityRequestV2 xmlns="http://xml.google.com" siteid="1470249" apikey="SFGSDGSDFSDFGSFG" async="false" waittime="5"><Type>4</Type><Id>460573</Id><Radius>0</Radius><Latitude>0.0</Latitude><Longitude>0.0</Longitude><CheckIn>2017-10-01</CheckIn><CheckOut>2017-10-17</CheckOut><Rooms>1</Rooms><Adults>2</Adults><Children>0</Children><Language>en-us</Language><Currency>000</Currency></AvailabilityRequestV2>

Notice that the last key CMD is put into a new key "Action" and value is put inside XML. I was able to do some clean up like removing square brackets, removing extra ":" etc with the following SED Command -

SEDCMD-remove-xml-header = s/\<\?xml[^\>]*\>//g
SEDCMD-remove-square-brackets = s/\[|\]//g
SEDCMD-remove-colon = s/ : / /g
SECCMD-reame-threadname = s/Thread-/Thread:/1

How do I achieve the rest? Is a key value transformer?

0 Karma
1 Solution

beatus
Communicator

Here's a regex that should work for you:
\[(?<Thread>.*?)\]\s(?<TimeStamp>(\d{4})-(\d{2})-(\d{2}) (\d{2}):(\d{2}):(\d{2})?)\s:\s(?:T:(?< ResponseTime>\S+\s+\S+))?(.)*(?<Action>(\w{3})):\s\[(?<XML>.*?)\]

This should handle the timeouts either existing in the log or not and only create the "timeout" field when they are there. Hope this helps!

View solution in original post

0 Karma

beatus
Communicator

Here's a regex that should work for you:
\[(?<Thread>.*?)\]\s(?<TimeStamp>(\d{4})-(\d{2})-(\d{2}) (\d{2}):(\d{2}):(\d{2})?)\s:\s(?:T:(?< ResponseTime>\S+\s+\S+))?(.)*(?<Action>(\w{3})):\s\[(?<XML>.*?)\]

This should handle the timeouts either existing in the log or not and only create the "timeout" field when they are there. Hope this helps!

0 Karma

jagadeeshm
Contributor

Thanks beatus 🙂

0 Karma

jagadeeshm
Contributor

Any splunk transformers 🙂 around?

0 Karma
Got questions? Get answers!

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Meet up IRL or virtually!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Get Updates on the Splunk Community!

Why Splunk Customers Should Attend Cisco Live 2026 Las Vegas

Why Splunk Customers Should Attend Cisco Live 2026 Las Vegas     Cisco Live 2026 is almost here, and this ...

What Is the Name of the USB Key Inserted by Bob Smith? (BOTS Hint, Not the Answer)

Hello Splunkers,   So you searched, “what is the name of the usb key inserted by bob smith?”  Not gonna lie… ...

Automating Threat Operations and Threat Hunting with Recorded Future

    Automating Threat Operations and Threat Hunting with Recorded Future June 29, 2026 | Register   Is your ...