How to do event break for XML file like this?

cflam
Splunk Employee

Hi All,

I am working on this file and have tried different ways to break it into events, but with no luck so far.

Any ideas would be much appreciated. I guess I need to start with (\<record\>) and SHOULD_LINEMERGE=false, but that has not worked either.

<?xml version='1.0' encoding='UTF-8'?>
<dataset>
<record><account_number>3557480924296572</account_number><amount>$585884.31</amount><date>3/26/2018</date><image_link>http://dummyimage.com/211x203.bmp/cc0000/ffffff</image_link></record><record><account_number>201965548718494</account_number><amount>$298972.83</amount><date>9/29/2017</date><image_link>http://dummyimage.com/199x142.bmp/5fa2dd/ffffff</image_link></record><record><account_number>5610279203509287</account_number><amount>$713197.00</amount><date>4/22/2018</date><image_link>http://dummyimage.com/108x170.bmp/cc0000/ffffff</image_link></record><record><account_number>3574547488776493</account_number><amount>$647684.04</amount><date>5/24/2017</date><image_link>http://dummyimage.com/228x183.bmp/5fa2dd/ffffff</image_link></record><record><account_number>4041374164494249</account_number><amount>$445816.82</amount><date>8/12/2017</date><image_link>http://dummyimage.com/222x199.png/dddddd/000000</image_link></record><record><account_number>5368744203165194</account_number><amount>$255741.87</amount><date>6/1/2017</date><image_link>http://dummyimage.com/193x177.jpg/dddddd/000000</image_link></record><record><account_number>3578475626751390</account_number><amount>$128890.58</amount><date>11/30/2017</date><image_link>http://dummyimage.com/225x155.png/cc0000/ffffff</image_link></record><record><account_number>5355532156581029</account_number><amount>$889026.01</amount><date>3/21/2018</date><image_link>http://dummyimage.com/155x205.png/ff4444/ffffff</image_link></record><record><account_number>4405300869804812</account_number><amount>$476922.41</amount><date>8/4/2017</date><image_link>http://dummyimage.com/189x195.png/ff4444/ffffff</image_link></record><record><account_number>5602239487218948</account_number><amount>$116672.03</amount><date>12/25/2017</date><image_link>http://dummyimage.com/125x211.jpg/cc0000/ffffff</image_link></record><record><account_number>5048378346595252</account_number><amount>$294451.60</amount><date>12/22/2017</date><image_link>http://dummy
image.com/146x183.bmp/dddddd/000000</image_link></record><record><account_number>3535747088584853</account_number><amount>$549026.69</amount><date>10/26/2017</date><image_link>http://dummyimage.com/214x166.bmp/ff4444/ffffff</image_link></record><record><account_number>201985433980538</account_number><amount>$399186.15</amount><date>3/24/2018</date><image_link>http://dummyimage.com/207x238.png/ff4444/ffffff</image_link></record>

xpac
SplunkTrust

Hey, I'd try this:

LINE_BREAKER = <record>
SHOULD_LINEMERGE = False

According to the props.conf doc, "If no capturing group is part of the match, the linebreaker will assume that the linebreak is a zero-length break immediately preceding the match." Therefore it should just start a new line (a new event, in this case) in front of every <record>.
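To get a feel for what a zero-length break in front of every <record> would produce, here is a rough Python simulation. It is not how Splunk implements LINE_BREAKER; it only mimics the behaviour described in the quote by splitting on a lookahead. The shortened sample string and the function name break_events are made up for illustration.

```python
import re

# Shortened stand-in for the file in the question (values are placeholders).
SAMPLE = (
    "<?xml version='1.0' encoding='UTF-8'?>\n"
    "<dataset>\n"
    "<record><account_number>1</account_number></record>"
    "<record><account_number>2</account_number></record>"
)

def break_events(raw, pattern="<record>"):
    """Split raw text immediately before each match of pattern.

    Mimics a zero-length break preceding the match: nothing is
    discarded, and every chunk after the first starts with the match.
    """
    # The lookahead matches the position before the pattern without
    # consuming it. Splitting on an empty match needs Python 3.7+.
    chunks = re.split("(?=" + pattern + ")", raw)
    return [chunk for chunk in chunks if chunk]

events = break_events(SAMPLE)
for event in events:
    print(repr(event))
```

In this simulation the first chunk holds only the XML declaration and the <dataset> line, and each following chunk is one complete <record>...</record> element.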

I'd also think about a SEDCMD to remove the leading XML declaration and <dataset> tag, so they don't clutter your index. Something along the lines of ^\s*<\?xml.*[\r\n]*\s*<dataset>[\r\n]*\s* should work, but I didn't test that.
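A regex like this can be sanity-checked outside Splunk before it goes into props.conf. The Python sketch below assumes the leading whitespace is optional (\s* rather than a single \s), since the sample file starts directly with <?xml. The header string is a shortened placeholder and the function name strip_header is hypothetical. In props.conf the pattern would sit inside a sed-style s/regex/replacement/ expression on a SEDCMD-<class> line, which this sketch does not reproduce.

```python
import re

# Regex from the suggestion, using \s* so that a file beginning
# directly with <?xml (no leading whitespace) still matches.
HEADER_RE = re.compile(r"^\s*<\?xml.*[\r\n]*\s*<dataset>[\r\n]*\s*")

def strip_header(event):
    """Remove the XML declaration and opening <dataset> tag from the start."""
    return HEADER_RE.sub("", event, count=1)

first_event = "<?xml version='1.0' encoding='UTF-8'?>\n<dataset>\n"
record_event = "<record><amount>$1.00</amount></record>"

print(repr(strip_header(first_event)))                 # header only
print(repr(strip_header(first_event + record_event)))  # header plus a record
print(repr(strip_header(record_event)))                # no header present
```

Events that do not begin with the header are returned unchanged, so the substitution only affects the start of the file.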


amitm05
Builder

Can you try:
BREAK_ONLY_BEFORE = (.*)

Basically, you are trying to break events only where the <record> tag is found and nowhere else, so I think this should do it.
Please let me know if that works.
