Dashboards & Visualizations

How to do event break for XML file like this?

cflam
Splunk Employee
Splunk Employee

Hi All,

I am now working on this file and tried different ways to do event break but still no luck.

Any idea will be much appreciated. I guess need to start with (\<record\>) with SHOULD_LINEMERGE=false but still no luck..

<?xml version='1.0' encoding='UTF-8'?>
<dataset>
<record><account_number>3557480924296572</account_number><amount>$585884.31</amount><date>3/26/2018</date><image_link>http://dummyimage.com/211x203.bmp/cc0000/ffffff</image_link></record><record><account_number>201965548718494</account_number><amount>$298972.83</amount><date>9/29/2017</date><image_link>http://dummyimage.com/199x142.bmp/5fa2dd/ffffff</image_link></record><record><account_number>5610279203509287</account_number><amount>$713197.00</amount><date>4/22/2018</date><image_link>http://dummyimage.com/108x170.bmp/cc0000/ffffff</image_link></record><record><account_number>3574547488776493</account_number><amount>$647684.04</amount><date>5/24/2017</date><image_link>http://dummyimage.com/228x183.bmp/5fa2dd/ffffff</image_link></record><record><account_number>4041374164494249</account_number><amount>$445816.82</amount><date>8/12/2017</date><image_link>http://dummyimage.com/222x199.png/dddddd/000000</image_link></record><record><account_number>5368744203165194</account_number><amount>$255741.87</amount><date>6/1/2017</date><image_link>http://dummyimage.com/193x177.jpg/dddddd/000000</image_link></record><record><account_number>3578475626751390</account_number><amount>$128890.58</amount><date>11/30/2017</date><image_link>http://dummyimage.com/225x155.png/cc0000/ffffff</image_link></record><record><account_number>5355532156581029</account_number><amount>$889026.01</amount><date>3/21/2018</date><image_link>http://dummyimage.com/155x205.png/ff4444/ffffff</image_link></record><record><account_number>4405300869804812</account_number><amount>$476922.41</amount><date>8/4/2017</date><image_link>http://dummyimage.com/189x195.png/ff4444/ffffff</image_link></record><record><account_number>5602239487218948</account_number><amount>$116672.03</amount><date>12/25/2017</date><image_link>http://dummyimage.com/125x211.jpg/cc0000/ffffff</image_link></record><record><account_number>5048378346595252</account_number><amount>$294451.60</amount><date>12/22/2017</date><image_link>http://dummyimage.com/146x183.bmp/dddddd/000000</image_link></record><record><account_number>3535747088584853</account_number><amount>$549026.69</amount><date>10/26/2017</date><image_link>http://dummyimage.com/214x166.bmp/ff4444/ffffff</image_link></record><record><account_number>201985433980538</account_number><amount>$399186.15</amount><date>3/24/2018</date><image_link>http://dummyimage.com/207x238.png/ff4444/ffffff</image_link></record>
0 Karma
1 Solution

xpac
SplunkTrust
SplunkTrust

Hey, I'd try this:

LINE_BREAKER = <record>
SHOULD_LINEMERGE = False

According to the props.conf doc,If no capturing group is part of the match, the linebreaker will assume that the linebreak is a zero-length break immediately preceding the match., therefore it should just start a new line (event in this case) in front of every <record>

I'd also think about an SEDCMD to remove the starting , so it doesn't clutter your index. Something along the lines of ^\s<\?xml.*[\r\n]*\s*<dataset>[\r\n]*\s* should work, but I didn't test that.

View solution in original post

0 Karma

amitm05
Builder

Can you try :
BREAK_ONLY_BEFORE = (.*)

Basically you are trying to break the event at the identification of this tag only and nothing else. I suppose this should do it.
Please let me know if that works.

xpac
SplunkTrust
SplunkTrust

Hey, I'd try this:

LINE_BREAKER = <record>
SHOULD_LINEMERGE = False

According to the props.conf doc,If no capturing group is part of the match, the linebreaker will assume that the linebreak is a zero-length break immediately preceding the match., therefore it should just start a new line (event in this case) in front of every <record>

I'd also think about an SEDCMD to remove the starting , so it doesn't clutter your index. Something along the lines of ^\s<\?xml.*[\r\n]*\s*<dataset>[\r\n]*\s* should work, but I didn't test that.

0 Karma
Get Updates on the Splunk Community!

Detecting Remote Code Executions With the Splunk Threat Research Team

WATCH NOWRemote code execution (RCE) vulnerabilities pose a significant risk to organizations. If exploited, ...

Enter the Splunk Community Dashboard Challenge for Your Chance to Win!

The Splunk Community Dashboard Challenge is underway! This is your chance to showcase your skills in creating ...

.conf24 | Session Scheduler is Live!!

.conf24 is happening June 11 - 14 in Las Vegas, and we are thrilled to announce that the conference catalog ...