XML challenge: create and fill INDEPENDENT Splunk fields for repeating nested XML structures

Hi,

I am trying to find a solution to an easy-sounding problem: I have an XML input file that contains billing data.
For each <Invoice>, there can be several contained <Item> and <SubItem> tags that belong to the same invoice.
Each <Item> can in turn contain different <SubItem> tags.

Here is an anonymized example of one <Invoice>. The XML source file starts with other data (other tags), and then many <Invoice> elements can follow:

//some other xml structures
//lots of <Invoice> elements ...

<Invoice>
…
<someothertags and infos>
...
  <ItemsInfo>
    <Item id="bla1">
      <AccountID>acc1</AccountID>
      <AccountName>megasecret</AccountName>
      <AccountType>SUBSCRIPTION</AccountType>
      <Type>/item/cxn/subscription/moving_out</Type>
      <Name>Subscription moving-out fees</Name>
      <SessionCount>8</SessionCount>
      <NetAmount>0.00000</NetAmount>
    </Item>
    <Item id="bla2">
      <AccountID>acc2</AccountID>
      <AccountName>topsecret</AccountName>
      <AccountType>SUBSCRIPTION</AccountType>
      <Type>/item/cxn/subscription/moving_in</Type>
      <Name>Subscription moving-in fees</Name>
      <SessionCount>8</SessionCount>
      <NetAmount>0.00000</NetAmount>
    </Item>
    <Item id="bla2">
      <AccountID>acc31</AccountID>
      <AccountName>extremelysecret</AccountName>
      <AccountType>CONTRACT</AccountType>
      <Type>/item/cxn/services/cycle</Type>
      <Name>anotherName</Name>
      <SessionCount/>
      <NetAmount>150.00000</NetAmount>
      <SubItem>
        <Name>someSLA</Name>
        <Quantity>1</Quantity>
        <NetAmount>150.00000</NetAmount>
        <GLID>secret1</GLID>
        <Unit>1</Unit>
        <UnitPrice>150.00000</UnitPrice>
        <CycleStartTS>1556668800</CycleStartTS>
        <CycleEndTS>1559347199</CycleEndTS>
      </SubItem>
    </Item>
    <Item id="bla3">
      <AccountID>acc44</AccountID>
      <AccountName>anothersecret</AccountName>
      <AccountType>CONTRACT</AccountType>
      <Type>/item/cxn/apn/cycle</Type>
      <Name>APN account cycle fees censored</Name>
      <SessionCount/>
      <NetAmount>150.00000</NetAmount>
      <SubItem>
        <Name>topsecretwhatever</Name>
        <Quantity>1</Quantity>
        <NetAmount>150.00000</NetAmount>
        <GLID>secretID1</GLID>
        <Unit>1</Unit>
        <UnitPrice>150.00000</UnitPrice>
        <CycleStartTS>1556668800</CycleStartTS>
        <CycleEndTS>1559347199</CycleEndTS>
      </SubItem>
    </Item>
  </ItemsInfo>

…
</Invoice>

Now, for each of the <Invoice>..</Invoice> blocks, I want to create one event that contains all the information from the tags inside that invoice. I achieved this using:

[ xml-breakbefore-Invoice ]
BREAK_ONLY_BEFORE=<Invoice>
CHARSET=UTF-8
DATETIME_CONFIG=CURRENT
KV_MODE=xml
MAX_EVENTS=99999
NO_BINARY_CHECK=true
SHOULD_LINEMERGE=true
TRUNCATE=0
category=Custom
description=xml-breakbefore-Invoice
disabled=false
pulldown_type=true

Now, the challenge is: Splunk seems to simply concatenate the values of sub-tag fields into single fields. For <NetAmount>, for example, I am getting the attached result in Splunk: it seems to just insert spaces between the values found in the items/subitems -> fields.

But I want to have them in separate fields, e.g. per <Item>, because they belong to different items, and there are many other sub-tags of <Item> that should not be merged together either. Please note: <NetAmount> can show up at different levels, e.g. within <Item> and within <SubItem>; these should not get mixed!

E.g., for the NetAmount example above, I want to have something like the following (note: the number of items/subitems can vary between invoices; when an invoice has fewer than the maximum, the remaining fields can be empty):

Invoice.ItemsInfo.Item1.NetAmount = 0
Invoice.ItemsInfo.Item2.NetAmount = 0
Invoice.ItemsInfo.Item3.NetAmount = 150
Invoice.ItemsInfo.Item4.NetAmount = 150
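
To make the target layout concrete, here is a minimal Python sketch (run outside Splunk, standard library only) that numbers each <Item> and <SubItem> by position and builds field names in the style shown above. The function name flatten_invoice, the SubItemN naming, and the shortened sample are assumptions for illustration only; this is not Splunk configuration and says nothing about how Splunk extracts fields internally.

```python
import xml.etree.ElementTree as ET


def flatten_invoice(invoice_xml: str) -> dict:
    """Map each leaf under <ItemsInfo> to a position-indexed field name.

    Hypothetical helper: values at Item level and SubItem level get
    different names, so they can never be mixed.
    """
    root = ET.fromstring(invoice_xml)  # root is the <Invoice> element
    fields = {}
    for i, item in enumerate(root.findall("./ItemsInfo/Item"), start=1):
        prefix = f"Invoice.ItemsInfo.Item{i}"
        sub_index = 0
        for child in item:
            if child.tag == "SubItem":
                sub_index += 1
                for leaf in child:
                    name = f"{prefix}.SubItem{sub_index}.{leaf.tag}"
                    fields[name] = (leaf.text or "").strip()
            else:
                # Empty tags such as <SessionCount/> become empty strings.
                fields[f"{prefix}.{child.tag}"] = (child.text or "").strip()
    return fields


# Shortened version of the anonymized invoice above.
SAMPLE = """<Invoice>
  <ItemsInfo>
    <Item id="bla1">
      <NetAmount>0.00000</NetAmount>
    </Item>
    <Item id="bla2">
      <NetAmount>150.00000</NetAmount>
      <SubItem>
        <NetAmount>150.00000</NetAmount>
      </SubItem>
    </Item>
  </ItemsInfo>
</Invoice>"""

for name, value in flatten_invoice(SAMPLE).items():
    print(f"{name} = {value}")
```

With this naming, the NetAmount of an Item and the NetAmount of its SubItem end up under different keys, which is the separation asked for here.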

Is there a simple way to achieve this by adjusting the [ xml-breakbefore-Invoice ] configuration above so that I get the fields I want (i.e. I do not want crazy regexes/evals that work on the "intermediate" results above)? Or is this too complicated for the XML import configuration, so that some (ugly) XSLT preprocessing etc. would need to be done outside Splunk?
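
For the "outside Splunk" route, a rough sketch of what such preprocessing might look like in Python instead of XSLT: it turns one <Invoice> into one JSON record per <Item>, with that Item's SubItems kept nested underneath it. The function name items_as_events and the keys item_id and SubItems are made up for this example. A real version would presumably also copy an invoice-level identifier into each record, but those tags are elided in the sample above, so they are left out here.

```python
import json
import xml.etree.ElementTree as ET


def items_as_events(invoice_xml: str) -> list:
    """Return one JSON string per <Item> (hypothetical preprocessing step)."""
    root = ET.fromstring(invoice_xml)  # root is the <Invoice> element
    events = []
    for item in root.findall("./ItemsInfo/Item"):
        # "item_id" and "SubItems" are illustrative key names.
        record = {"item_id": item.get("id"), "SubItems": []}
        for child in item:
            if child.tag == "SubItem":
                record["SubItems"].append(
                    {leaf.tag: (leaf.text or "").strip() for leaf in child}
                )
            else:
                record[child.tag] = (child.text or "").strip()
        events.append(json.dumps(record))
    return events


# Shortened version of the anonymized invoice above.
SAMPLE = """<Invoice>
  <ItemsInfo>
    <Item id="bla1">
      <AccountID>acc1</AccountID>
      <NetAmount>0.00000</NetAmount>
    </Item>
    <Item id="bla3">
      <AccountID>acc44</AccountID>
      <NetAmount>150.00000</NetAmount>
      <SubItem>
        <Name>topsecretwhatever</Name>
        <NetAmount>150.00000</NetAmount>
      </SubItem>
    </Item>
  </ItemsInfo>
</Invoice>"""

for line in items_as_events(SAMPLE):
    print(line)
```

Each printed line could then be treated as its own event, so values from different items are never in the same record to begin with.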

Best Regards
Florian

[Attached screenshot: the merged NetAmount values as shown in Splunk]
