Splunk Search

Transforms on multiple indexes of XML sources with nested, duplicate keys

humbertocastro
New Member

I would like to get a single report by combining data from 3 different data sources. However, I am running into a problem where in one of my sources there is a duplicate field name (pool_name) and I cannot figure out how to split it up. The following is my search

index=Masking_View OR
index=Device OR
index=Association_Info
| rename Masking_View.View_Info.stor_grpname as sg_name
| rename Masking_View.View_Info.Device.dev_name as dev_name
| rename Masking_View.View_Info.Totals.total_dev_cap_mb as total_dev_cap_mb
| rename Association_Info.sg_name as sg_name
| rename Association_Info.policy_name as policy_name
| rename Device.dev_name as dev_name
| rename Device.pool.pool_name as pool_name
| rename Device.pool.pool_alloc_percent as pool_alloc_percent
| rename Device.pool.alloc_tracks_mb as alloc_tracks_mb
| fields sg_name, dev_name, total_dev_cap_mb, policy_name, pool_name, pool_alloc_percent, alloc_tracks_mb
| transaction sg_name, dev_name
| search mail*
| stats avg(pool_alloc_percent) sum(alloc_tracks_mb) first(total_dev_cap_mb) by sg_name, policy_name, pool_name

My sources are XML logs that have entries with the following structure:

Masking_View.log

<Masking_View>
  <View_Info>
    <stor_grpname>mail_SG</stor_grpname>
    <Device>
      <dev_name>0B01</dev_name>
    </Device>
    <Totals>
      <total_dev_cap_mb>1024003</total_dev_cap_mb>
    </Totals>
  </View_Info>
</Masking_View>

Device.log

<Device>
    <dev_name>0B01</dev_name>
    <total_tracks_mb>1024003</total_tracks_mb>
    <written_tracks_mb>414798</written_tracks_mb>
    <total_tracks_gb>1000.0</total_tracks_gb>
    <written_tracks_gb>405.1</written_tracks_gb>
    <pool_written_percent>41</pool_written_percent>
    <pool>
        <pool_name>VP_FC_01</pool_name>
        <alloc_tracks>0</alloc_tracks>
        <alloc_tracks_mb>0</alloc_tracks_mb>
        <alloc_tracks_gb>0.0</alloc_tracks_gb>
        <pool_alloc_percent>0</pool_alloc_percent>
    </pool>
    <pool>
        <pool_name>VP_SATA_01</pool_name>
        <alloc_tracks>6636960</alloc_tracks>
        <alloc_tracks_mb>414810</alloc_tracks_mb>
        <alloc_tracks_gb>405.1</alloc_tracks_gb>
        <pool_alloc_percent>41</pool_alloc_percent>
    </pool>
</Device>

Association_Info.log

<Association_Info>
    <sg_name>mail_SG</sg_name>
    <policy_name>Tier2</policy_name>
</Association_Info>

My props.conf looks like:

[Masking_View]
TRUNCATE = 0
MAX_EVENTS = 1000000
SHOULD_LINEMERGE = true
KV_MODE = xml
BREAK_ONLY_BEFORE = ^\s*<Masking_View>
MUST_BREAK_AFTER = /Masking_View>

[Device]
TRUNCATE = 0
SHOULD_LINEMERGE = true
KV_MODE = xml
BREAK_ONLY_BEFORE = ^\s*<Device>
MUST_BREAK_AFTER = /Device>

[Association_Info]
TRUNCATE = 0
SHOULD_LINEMERGE = true
KV_MODE = xml
BREAK_ONLY_BEFORE = ^\s*<Association_Info>
MUST_BREAK_AFTER = /Association_Info>

I would like to have a result showing something like:

sg_name   policy_name   pool_name   avg(pool_alloc_percent) sum(alloc_tracks_mb)    first(total_dev_cap_mb)
mail_SG   Tier2         VP_FC_01    0                       0                       0
mail_SG   Tier2         VP_SATA_01  20.500000               414810                  1024003

Instead I get this:

sg_name   policy_name   pool_name   avg(pool_alloc_percent) sum(alloc_tracks_mb)    first(total_dev_cap_mb)
mail_SG   Tier2         VP_FC_01    20.500000               414810                  1024003
mail_SG   Tier2         VP_SATA_01  20.500000               414810                  1024003

Do you have any idea how to get the various pool_name sums and averages to group properly?

0 Karma

dart
Splunk Employee
Splunk Employee

Try the eval function mvzip and then mvexpand to get one event per pool.

0 Karma

humbertocastro
New Member

I could not get the mvzip to work quite right but I did solve this a different way... probably not the best method, but it gets me the results I need. If anyone is interested I was able to create the 3 sets of fields by doing the following.

First create a field that is semicolon delimited then use a regex to get the fields I want. The comment system is not allowing me to post the code. So I guess that is left as an exercise for the reader.

0 Karma
Take the 2021 Splunk Career Survey

Help us learn about how Splunk has
impacted your career by taking the 2021 Splunk Career Survey.

Earn $50 in Amazon cash!