How do I break events for an XML file like this?

cflam
Splunk Employee

Hi All,

I am working on this file and have tried different ways to break it into events, but no luck so far.

Any ideas would be much appreciated. I guess I need to start with (\<record\>) and SHOULD_LINEMERGE=false, but that has not worked either.

<?xml version='1.0' encoding='UTF-8'?>
<dataset>
<record><account_number>3557480924296572</account_number><amount>$585884.31</amount><date>3/26/2018</date><image_link>http://dummyimage.com/211x203.bmp/cc0000/ffffff</image_link></record><record><account_number>201965548718494</account_number><amount>$298972.83</amount><date>9/29/2017</date><image_link>http://dummyimage.com/199x142.bmp/5fa2dd/ffffff</image_link></record><record><account_number>5610279203509287</account_number><amount>$713197.00</amount><date>4/22/2018</date><image_link>http://dummyimage.com/108x170.bmp/cc0000/ffffff</image_link></record><record><account_number>3574547488776493</account_number><amount>$647684.04</amount><date>5/24/2017</date><image_link>http://dummyimage.com/228x183.bmp/5fa2dd/ffffff</image_link></record><record><account_number>4041374164494249</account_number><amount>$445816.82</amount><date>8/12/2017</date><image_link>http://dummyimage.com/222x199.png/dddddd/000000</image_link></record><record><account_number>5368744203165194</account_number><amount>$255741.87</amount><date>6/1/2017</date><image_link>http://dummyimage.com/193x177.jpg/dddddd/000000</image_link></record><record><account_number>3578475626751390</account_number><amount>$128890.58</amount><date>11/30/2017</date><image_link>http://dummyimage.com/225x155.png/cc0000/ffffff</image_link></record><record><account_number>5355532156581029</account_number><amount>$889026.01</amount><date>3/21/2018</date><image_link>http://dummyimage.com/155x205.png/ff4444/ffffff</image_link></record><record><account_number>4405300869804812</account_number><amount>$476922.41</amount><date>8/4/2017</date><image_link>http://dummyimage.com/189x195.png/ff4444/ffffff</image_link></record><record><account_number>5602239487218948</account_number><amount>$116672.03</amount><date>12/25/2017</date><image_link>http://dummyimage.com/125x211.jpg/cc0000/ffffff</image_link></record><record><account_number>5048378346595252</account_number><amount>$294451.60</amount><date>12/22/2017</date><image_link>http://dummy
image.com/146x183.bmp/dddddd/000000</image_link></record><record><account_number>3535747088584853</account_number><amount>$549026.69</amount><date>10/26/2017</date><image_link>http://dummyimage.com/214x166.bmp/ff4444/ffffff</image_link></record><record><account_number>201985433980538</account_number><amount>$399186.15</amount><date>3/24/2018</date><image_link>http://dummyimage.com/207x238.png/ff4444/ffffff</image_link></record>
1 Solution

xpac
SplunkTrust

Hey, I'd try this:

LINE_BREAKER = <record>
SHOULD_LINEMERGE = False

According to the props.conf doc, "If no capturing group is part of the match, the linebreaker will assume that the linebreak is a zero-length break immediately preceding the match." It should therefore just start a new line (an event, in this case) in front of every <record>.
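
To make the "zero-length break immediately preceding the match" idea concrete, here is a rough Python sketch. It is only a regex illustration, not Splunk's line-breaking engine, and the helper name break_before is made up for this example. The sample is the first two records from the question, trimmed to two fields each.

```python
import re

# First two records from the question, trimmed to two fields each.
sample = (
    "<?xml version='1.0' encoding='UTF-8'?>\n"
    "<dataset>\n"
    "<record><account_number>3557480924296572</account_number>"
    "<amount>$585884.31</amount></record>"
    "<record><account_number>201965548718494</account_number>"
    "<amount>$298972.83</amount></record>"
)


def break_before(pattern, text):
    """Split text at a zero-length boundary just before each match of pattern.

    Hypothetical helper: the lookahead consumes nothing, so no characters
    are dropped at the break points. Needs Python 3.7+ for empty-match splits.
    """
    return [chunk for chunk in re.split(f"(?={pattern})", text) if chunk]


events = break_before("<record>", sample)

for number, event in enumerate(events):
    print(number, repr(event[:40]))
```

In this sketch every <record> starts its own chunk, and the XML declaration plus the <dataset> line end up together in the first chunk.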

I'd also think about a SEDCMD to remove the starting XML declaration and <dataset> tag, so they don't clutter your index. Something along the lines of ^\s<\?xml.*[\r\n]*\s*<dataset>[\r\n]*\s* should work, but I didn't test that.
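
A quick way to sanity-check that regex outside Splunk is a small Python sketch like the one below. It is not Splunk itself, and it rests on two assumptions: the leading \s was meant to be \s* (the pasted sample has no whitespace before <?xml), and the header arrives at the front of the first event. In props.conf the pattern would go inside SEDCMD's sed-style s/regex/replacement/ form, with an empty replacement.

```python
import re

# Start of the file from the question: header lines, then the first record
# (record trimmed to one field for brevity).
raw = (
    "<?xml version='1.0' encoding='UTF-8'?>\n"
    "<dataset>\n"
    "<record><account_number>3557480924296572</account_number></record>"
)

# Regex exactly as posted: requires one whitespace character before "<?xml".
AS_POSTED = r"^\s<\?xml.*[\r\n]*\s*<dataset>[\r\n]*\s*"

# Assumed intent: optional leading whitespace (\s* instead of \s).
ASSUMED = r"^\s*<\?xml.*[\r\n]*\s*<dataset>[\r\n]*\s*"

# Replace the header with nothing, like an empty sed replacement would.
cleaned = re.sub(ASSUMED, "", raw)

print(repr(cleaned))
print("as-posted pattern matches:", re.search(AS_POSTED, raw) is not None)
```

On this sample the as-posted pattern does not match because nothing precedes <?xml, while the \s* variant strips everything up to the first <record>.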


amitm05
Builder

Can you try:
BREAK_ONLY_BEFORE = (.*)

Basically you are trying to break the event at the identification of this tag only and nothing else. I suppose this should do it.
Please let me know if that works.
