
How to read and filter (complete) XML hierarchy?

bartzr
New Member

We are currently evaluating Splunk.
Whether licenses will be purchased depends on the results of a feasibility study.

There is one important use case that we have not been able to get working.
We have a hierarchical XML file.

Simplified example:

<hierarchy1>
  <param1>1</param1>
  <param2>2</param2>
  <hierarchy2>
    <param3>3</param3>
    <param4>4</param4>
    <hierarchy3>
      <param5>5</param5>
      <param6>6</param6>
    </hierarchy3>
    <hierarchy3>
      <param5>7</param5>
      <param6>8</param6>
    </hierarchy3>
    <hierarchy3>
      <param5>9</param5>
      <param6>10</param6>
    </hierarchy3>
  </hierarchy2>
</hierarchy1>

We need to read the complete hierarchy and be able to select and filter on all params.

Example query: give me all params (param1, param2, param3, param4, param5, param6) where param1=1 and (param5=7 OR param5=5).

Ideally the query would be backed by indexes, so that the extraction work does not have to happen at query run time.
Can this be achieved with Splunk?
If yes, how?

Thanks in advance
Ronny


gregbo
Communicator

Did you ever get an answer to this? I'm having a similar problem.


somesoni2
Revered Legend

You would need to set up the sourcetype definition to parse the data as XML. See this answer for more details:

https://answers.splunk.com/answers/187195/how-to-add-and-parse-xml-data-in-splunk.html

To see how the data will look in Splunk after ingestion, run this search on your evaluation Splunk instance:

| gentimes start=-1 | eval _raw=" <hierarchy1>
   <param1>1</param1>
   <param2>2</param2>
   <hierarchy2>
     <param3>3</param3>
     <param4>4</param4>
       <hierarchy3>
         <param5>5</param5>
         <param6>6</param6>
       </hierarchy3>
       <hierarchy3>
         <param5>7</param5>
         <param6>8</param6>
       </hierarchy3>
       <hierarchy3>
         <param5>9</param5>
         <param6>10</param6>
       </hierarchy3>
   </hierarchy2>
 </hierarchy1>" | table _raw | spath

bartzr
New Member

We followed up on your second proposal.
In order to better show what we need we extended the example slightly to:

| gentimes start=-1 | eval _raw=" <hierarchy1>
        <param1>1</param1>
        <param2>2</param2>
        <hierarchy2>
          <param3>3</param3>
          <param4>4</param4>
            <hierarchy3>
              <param5>5</param5>
              <param6>6</param6>
            </hierarchy3>
            <hierarchy3>
              <param5>7</param5>
              <param6>8</param6>
            </hierarchy3>
            <hierarchy3>
              <param5>9</param5>
              <param6>10</param6>
            </hierarchy3>
        </hierarchy2>
        <hierarchy2>
          <param3>a</param3>
          <param4>b</param4>
            <hierarchy3>
              <param5>c</param5>
              <param6>d</param6>
            </hierarchy3>
            <hierarchy3>
              <param5>e</param5>
              <param6>f</param6>
            </hierarchy3>
            <hierarchy3>
              <param5>g</param5>
              <param6>h</param6>
            </hierarchy3>
        </hierarchy2>
      </hierarchy1>" | table _raw | spath

When we run this search we get a table that contains all the individual values.
However, the relation between the hierarchy levels seems to be lost.

There seems to be no way to link
hierarchy1.hierarchy2.hierarchy3.param5 with value 'g' to hierarchy1.hierarchy2.param3 with value 'a'.

This actually is the issue that we are struggling with.
Maybe just a small issue.

What would a query look like that asks for hierarchy1.hierarchy2.hierarchy3.param5 = 'g', where the result includes all metadata (the information provided in the hierarchy levels above) and excludes elements with hierarchy1.hierarchy2.hierarchy3.param5 != 'g'?

Thanks
Ronny
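
One way to keep each hierarchy3 block tied to its parent levels is to expand the hierarchy one level at a time with spath and mvexpand, so that every result row carries the values of its ancestors. The search below is a minimal sketch under assumptions, not a verified solution. It uses a shortened version of the sample above, and the output field names (h2, h3, param1 to param6) are chosen only for illustration. It also assumes that spath returns the inner XML of each hierarchy2 and hierarchy3 element as one value of a multivalue field, which mvexpand then splits into separate rows.

```
| gentimes start=-1
| eval _raw="<hierarchy1>
  <param1>1</param1>
  <param2>2</param2>
  <hierarchy2>
    <param3>3</param3>
    <param4>4</param4>
    <hierarchy3><param5>5</param5><param6>6</param6></hierarchy3>
    <hierarchy3><param5>7</param5><param6>8</param6></hierarchy3>
  </hierarchy2>
  <hierarchy2>
    <param3>a</param3>
    <param4>b</param4>
    <hierarchy3><param5>e</param5><param6>f</param6></hierarchy3>
    <hierarchy3><param5>g</param5><param6>h</param6></hierarchy3>
  </hierarchy2>
</hierarchy1>"
| spath path=hierarchy1.param1 output=param1
| spath path=hierarchy1.param2 output=param2
| spath path=hierarchy1.hierarchy2 output=h2
| mvexpand h2
| spath input=h2 path=param3 output=param3
| spath input=h2 path=param4 output=param4
| spath input=h2 path=hierarchy3 output=h3
| mvexpand h3
| spath input=h3 path=param5 output=param5
| spath input=h3 path=param6 output=param6
| search param5="g"
| table param1 param2 param3 param4 param5 param6
```

The intent is one row per hierarchy3 element, each with the param values of its enclosing hierarchy2 and hierarchy1, so the final search clause only keeps rows whose own param5 matches. Replacing that clause with `search param1=1 (param5=5 OR param5=7)` would express the example query from the original question. All of this runs at search time, and mvexpand can run into memory limits on very large events.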


bartzr
New Member

Thanks for your really fast response!

The example that you referred to worked for me, except for the following:
What would a search string look like that returns all entries with comment="Happy birthday" without showing the data for comment="Good pic!"?

Result without filter:
photo_id, title, format, owner_id, owner, comment
"123", "Birthday", "jpg", "1111", "Jason", "Good pic!"
"123", "Birthday", "jpg", "1111", "Jason", "Happy birthday"

Result with the filter applied as described above:
photo_id, title, format, owner_id, owner, comment
"123", "Birthday", "jpg", "1111", "Jason", "Happy birthday"

--> one line less, all metadata still available

If I understand correctly, applying the sourcetype configuration described in your link gives me one "photo event" per photo tag.
Splunk then lets me filter on a per-event basis.
So when I filter on comment = "Happy birthday", I get all the data of every event that contains that comment. Unfortunately this includes the comment "Good pic!" as well.
When I create smaller chunks, let's say one event per comment, the comments are separated and I can filter properly on comment content, but I lose all metadata (such as photo_id, owner and so on).

Did I make my point clear?
If not, let me know and I'll try to rephrase.

Thanks
Ronny
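
A possible way around this trade-off is to keep one event per photo and split the comments into separate result rows at search time. The sketch below is only an illustration under assumptions: it fakes a single already-extracted photo event with makeresults and eval, using the field names from the table above, and it assumes that comment arrives as a multivalue field on that event.

```
| makeresults
| eval photo_id="123", title="Birthday", format="jpg", owner_id="1111", owner="Jason"
| eval comment=split("Good pic!|Happy birthday", "|")
| mvexpand comment
| search comment="Happy birthday"
| table photo_id title format owner_id owner comment
```

mvexpand copies the single-value fields (photo_id, owner and so on) onto every row it creates, so filtering after the expansion drops the unwanted comments without losing the photo-level metadata.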
