Dashboards & Visualizations

Parse XML file with repeating events and indexed as one event

i2sheri
Communicator

I am indexing an XML file which has the structure below with many other tags in, before, and after <a> and <b>

I need -> | eval diffInSec= end-start | table status start end diffInSec for each c below and for all events.

I'm indexing the whole file as a single event, but I want each status as one record from all events. Is it possible?

<?xml?>
<a>
    <b><c><c>

        <c>
            <status status="P" start="DT" end="DT"></status>
        </c>
        <c>
            <status status="P" start="DT" end="DT"></status>
        </c>
        <c>
            <status status="P" start="DT" end="DT"></status>
        </c>
        <c>
            <status status="P" start="DT" end="DT"></status>
        </c>

    </c></c></b>
</a>
0 Karma
1 Solution

somesoni2
Revered Legend

If you've KV_MODE=xml in props.conf (on search head), then Splunk should automatically extract fields (mostly multivalued) from your xml. Search for your data in Verbose mode and you should see fields (on left field sidebar) may be like this

a.b.c.status{@status}
a.b.c.status(@start}
a.b.c.status(@end}

Once you get these fields, use mvzip/mvexpand/rex to get these as separate row and get your table. May be like this (sample, adjust based on your fields)

your base search | table a.b.c.status* | rename a.b.c.status{@*} as * | eval temp=mvzip(status,mvzip(start,end,"#"),"#") | table temp | mvexpand temp | rex field=temp "(?<status>.*)#(?<start>.*)#(?<end>.*)#" | field - temp 

Above should give you a table with status, start and end. Then based on your timestamp format, your can calculate the different.

View solution in original post

somesoni2
Revered Legend

If you've KV_MODE=xml in props.conf (on search head), then Splunk should automatically extract fields (mostly multivalued) from your xml. Search for your data in Verbose mode and you should see fields (on left field sidebar) may be like this

a.b.c.status{@status}
a.b.c.status(@start}
a.b.c.status(@end}

Once you get these fields, use mvzip/mvexpand/rex to get these as separate row and get your table. May be like this (sample, adjust based on your fields)

your base search | table a.b.c.status* | rename a.b.c.status{@*} as * | eval temp=mvzip(status,mvzip(start,end,"#"),"#") | table temp | mvexpand temp | rex field=temp "(?<status>.*)#(?<start>.*)#(?<end>.*)#" | field - temp 

Above should give you a table with status, start and end. Then based on your timestamp format, your can calculate the different.

i2sheri
Communicator

which is better of the two below

host=h index=i sourcetype=xml
| eval start=strptime('a.b.c.status{@starttime}',"%Y%m%d %H:%M:%S.%3Q")
| eval end=strptime('a.b.c.status{@endtime}',"%Y%m%d %H:%M:%S.%3Q")
| eval temp=mvzip('a.b.c.status{@status}', mvzip(end, start, "#"), "#") 
| table temp | mvexpand temp | rex field=temp "(?<status>.*)#(?<end>.*)#(?<start>.*)" 
| eval diffInSec = end-start
| fields status diffInSec end start

OR

host=h index=i sourcetype=xml
| table a.b.c.status* | rename a.b.c.status{@*} as *
| eval temp=mvzip(status,mvzip(starttime,endtime,"#"),"#") 
| table temp | mvexpand temp | rex field=temp "(?<status>.*)#(?<starttime>.*)#(?<endtime>.*)"
| eval start=strptime(starttime,"%Y%m%d %H:%M:%S.%3Q")
| eval end=strptime(endtime,"%Y%m%d %H:%M:%S.%3Q")
| eval diffInSec=end-start | fields status diffInSec end start
0 Karma

i2sheri
Communicator

rex does not work. pasted one of the temp entries below.

PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS#20150921 06:00:15.379 20150921 06:00:15.395 20150921 06:00:15.408 20150921 06:00:15.415 20150921 06:00:15.433 20150921 06:00:15.445 20150921 06:00:15.452 20150921 06:00:15.457 20150921 06:00:15.485 20150921 06:00:15.494 20150921 06:00:15.499 20150921 06:00:15.513 20150921 06:00:15.523 20150921 06:00:15.556 20150921 06:00:15.570#20150921 06:00:15.385 20150921 06:00:15.405 20150921 06:00:15.413 20150921 06:00:15.419 20150921 06:00:15.437 20150921 06:00:15.450 20150921 06:00:15.455 20150921 06:00:15.473 20150921 06:00:15.492 20150921 06:00:15.497 20150921 06:00:15.509 20150921 06:00:15.520 20150921 06:00:15.527 20150921 06:00:15.568 20150921 06:00:15.572 
0 Karma

i2sheri
Communicator

I removed |xmlkv from base search and it works with two changes

 your base search | table a.b.c.status* | rename a.b.c.status{@*} as * | eval temp=mvzip(status,mvzip(start,end,"#"),"#") | table temp | mvexpand temp | rex field=temp "(?<status>.*)#(?<start>.*)#(?<end>.*)" | fields - temp 
0 Karma

richgalloway
SplunkTrust
SplunkTrust

It's probably not the best way, but you could do it with rex. That will give you three multi-valued fields. Combine them with mvzip then split the result into separate events and should have what you want.

... | rex max_match=0 "status=\"(?P<status>[^\"]*)\" start=\"(?P<start>[^\"]*)\" end=\"(?P<end>[^\"]*)\"" | eval fields = mvzip(mvzip(status,start),end) | mvexpand fields | ...
---
If this reply helps you, Karma would be appreciated.
0 Karma
Get Updates on the Splunk Community!

Exporting Splunk Apps

Join us on Monday, October 21 at 11 am PT | 2 pm ET!With the app export functionality, app developers and ...

Cisco Use Cases, ITSI Best Practices, and More New Articles from Splunk Lantern

Splunk Lantern is a Splunk customer success center that provides advice from Splunk experts on valuable data ...

Build Your First SPL2 App!

Watch the recording now!.Do you want to SPL™, too? SPL2, Splunk's next-generation data search and preparation ...