All Apps and Add-ons

How to parse ShopperTrak XML payload obtained via the REST API Modular Input?

Yorokobi
SplunkTrust
SplunkTrust

Hey folks,

Any suggestions for how to parse the XML payload from ShopperTrak's REST endpoint obtained via the REST API Modular Input?

Link to full payload: https://gist.github.com/yorokobi/2b8f1004be5b6142eaf05a1670bb1b8c (I tried enabling soft wrapping, but it didn't take.)

Pretty (and smaller) sample (use the gist for your own testing):

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<sites>
    <site storeID="1319">
        <traffic exits="13" enters="25" startTime="201608251030"/>
        <traffic exits="15" enters="13" startTime="201608251045"/>
        <traffic exits="16" enters="11" startTime="201608251100"/>
    </site>
    <site storeID="1455">
        <traffic exits="14" enters="26" startTime="201608251030"/>
        <traffic exits="30" enters="21" startTime="201608251045"/>
        <traffic exits="19" enters="17" startTime="201608251100"/>
    </site>
    <site storeID="2161">
        <traffic exits="32" enters="29" startTime="201608251330"/>
        <traffic exits="33" enters="35" startTime="201608251345"/>
        <traffic code="01" exits="" enters="" startTime="201608251400"/>
    </site>
    <site storeID="0486">
        <traffic exits="16" enters="17" startTime="201608251130"/>
        <traffic exits="13" enters="17" startTime="201608251145"/>
        <traffic exits="30" enters="21" startTime="201608251200"/>
    </site>
    <site storeID="2503">
        <traffic exits="3" enters="2" startTime="201608251230"/>
        <traffic exits="1" enters="4" startTime="201608251245"/>
        <traffic exits="4" enters="0" startTime="201608251300"/>
    </site>
    <site storeID="2162">
        <traffic exits="14" enters="18" startTime="201608251330"/>
        <traffic exits="19" enters="22" startTime="201608251345"/>
        <traffic exits="18" enters="12" startTime="201608251400"/>
    </site>
    <site storeID="2561">
        <traffic exits="5" enters="5" startTime="201608251330"/>
        <traffic exits="8" enters="8" startTime="201608251345"/>
        <traffic exits="3" enters="3" startTime="201608251400"/>
    </site>
    <site storeID="1647">
        <traffic exits="16" enters="12" startTime="201608251230"/>
        <traffic exits="8" enters="9" startTime="201608251245"/>
        <traffic code="02" exits="" enters="" startTime="201608251300"/>
    </site>
    <site storeID="0436">
        <traffic exits="20" enters="11" startTime="201608251230"/>
        <traffic exits="21" enters="20" startTime="201608251245"/>
        <traffic exits="22" enters="23" startTime="201608251300"/>
    </site>
    <site storeID="1751">
        <traffic exits="16" enters="9" startTime="201608251330"/>
        <traffic exits="15" enters="5" startTime="201608251345"/>
        <traffic exits="7" enters="12" startTime="201608251400"/>
    </site>
    <site storeID="1438">
        <traffic exits="18" enters="13" startTime="201608251330"/>
        <traffic exits="14" enters="11" startTime="201608251345"/>
        <traffic exits="12" enters="10" startTime="201608251400"/>
    </site>
    <site storeID="2029">
        <traffic exits="15" enters="19" startTime="201608251030"/>
        <traffic exits="23" enters="15" startTime="201608251045"/>
        <traffic exits="16" enters="18" startTime="201608251100"/>
    </site>
    <site storeID="0471">
        <traffic exits="19" enters="16" startTime="201608251230"/>
        <traffic exits="25" enters="15" startTime="201608251245"/>
        <traffic code="01" exits="" enters="" startTime="201608251300"/>
    </site>
</xml>

I have played with the following props.conf entries with pretty much zero success (not necessarily all at the same time--I haven't kept a running record).

SHOULD_LINEMERGE=false
BREAK_ONLY_BEFORE=<site\s
KV_MODE=xml
LINE_BREAKER=(<\/site>)

My attempts at using |xpath or |rex max_match=0 have likewise yielded less than stellar results.

What I need is a method of correctly tying storeID with exits, enters, startTime, and code when it shows up.

Thanks in advance. 😄

0 Karma

s2_splunk
Splunk Employee
Splunk Employee

Try this:

[shooperTrak]
KV_MODE = xml
MAX_TIMESTAMP_LOOKAHEAD = 128
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = true
TIME_FORMAT = %Y%m%d%H%M
TIME_PREFIX = \sstartTime=\"
disabled = false
BREAK_ONLY_BEFORE = <site storeID

This should give you one event for each "site" element. You can configure props/transforms to get rid of the unneeded lines (XML declaration and "sites" envelope) if you want to.

0 Karma

Yorokobi
SplunkTrust
SplunkTrust

That breaks events by storeID nicely but now I need a way to get enters/exits/code/startTime by storeID.

New event:

<site storeID="0310"><traffic exits="3" enters="7" startTime="201608250700"/><traffic exits="4" enters="6" startTime="201608250715"/><traffic exits="4" enters="4" startTime="201608250730"/><traffic exits="3" enters="6" startTime="201608250745"/>

I tried again with |xpath "//site/traffic/@enters/" outfield=enters but wasn't able to get an 'enters' field and | stats first(site.traffic.@enters) BY site.@storeID site.traffic.@startTime produces one result instead of four.

Ideally, the end result should look like:

startTime     |  storeID  |  enters  |  exits  |  code  |\
---------------------------------------------------------\
201608250700  |     0133  |       7  |      3  |        |\
201608250715  |     0133  |       6  |      4  |        |\
201608250730  |     0133  |       6  |      3  |        |\
0 Karma

s2_splunk
Splunk Employee
Splunk Employee

UPDATE: This should work for you:

yoursearch | stats list(site.traffic{@startTime}) as Time, list(site.traffic{@enters}) as Enters, list(site.traffic{@exits}) as Exits, list(site.traffic{@code}) as Code by site{@storeID} | rename "site{@storeID}" as storeID | mvexpand Time | mvexpand Enters | mvexpand Exits | table Time storeID Enters Exits Code

I didn't have much time to spend on this, but you will hopefully get the idea from this:

<yoursearch> | stats list(site.traffic{@enters}) as Enters, list(site.traffic{@exits}) as Exits by site{@storeID}

It needs to be refined to add your startTime and you can convert the multivalue field into individual table entries, if need be.

I hope this gets you started.

0 Karma

Yorokobi
SplunkTrust
SplunkTrust

I suspect what would work best is a custom response handler for the REST TA.

0 Karma

s2_splunk
Splunk Employee
Splunk Employee

That's an alternative, if what I provided doesn't work for you (it worked for me with your sample data).

0 Karma
.conf21 CFS Extended through 5/20!

Don't miss your chance
to share your Splunk
wisdom in-person or
virtually at .conf21!

Call for Speakers has
been extended through
Thursday, 5/20!