Building for the Splunk Platform

How to parse Splunk XML response data in python

sunilpanda023
Path Finder

I am particularly interested in extracting the dispatchState (present in line 28) and few other interesting metrics

<s:key name="dispatchState">DONE</s:key>


<?xml version="1.0" encoding="UTF-8"?>
<!--This is to override browser formatting; see server.conf[httpServer] to disable. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .-->
<?xml-stylesheet type="text/xml" href="/static/atom.xsl"?>
<entry xmlns="http://www.w3.org/2005/Atom" xmlns:s="http://dev.splunk.com/ns/rest" xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/">
  <title>index=sample_idx | stats count</title>
  <id>https://hostname.com.au:8089/services/search/jobs/1528783903.136065</id>
  <updated>2018-06-12T16:14:02.544+10:00</updated>
  <link href="/services/search/jobs/1528783903.136065" rel="alternate"/>
  <published>2018-06-12T16:11:43.000+10:00</published>
  <link href="/services/search/jobs/1528783903.136065/search.log" rel="search.log"/>
  <link href="/services/search/jobs/1528783903.136065/events" rel="events"/>
  <link href="/services/search/jobs/1528783903.136065/results" rel="results"/>
  <link href="/services/search/jobs/1528783903.136065/results_preview" rel="results_preview"/>
  <link href="/services/search/jobs/1528783903.136065/timeline" rel="timeline"/>
  <link href="/services/search/jobs/1528783903.136065/summary" rel="summary"/>
  <link href="/services/search/jobs/1528783903.136065/control" rel="control"/>
  <author>
    <name>rest_poc</name>
  </author>
  <content type="text/xml">
    <s:dict>
      <s:key name="canSummarize">0</s:key>
      <s:key name="cursorTime">2038-01-19T14:14:07.000+11:00</s:key>
      <s:key name="defaultSaveTTL">604800</s:key>
      <s:key name="defaultTTL">300</s:key>
      <s:key name="delegate"></s:key>
      <s:key name="diskUsage">65536</s:key>
      <s:key name="dispatchState">DONE</s:key>
      <s:key name="doneProgress">1.00000</s:key>
      <s:key name="dropCount">0</s:key>
      <s:key name="earliestTime">1970-01-01T10:00:00.000+10:00</s:key>
      <s:key name="eventAvailableCount">0</s:key>
      <s:key name="eventCount">0</s:key>
      <s:key name="eventFieldCount">0</s:key>
      <s:key name="eventIsStreaming">1</s:key>
      <s:key name="eventIsTruncated">1</s:key>
      <s:key name="eventSearch"></s:key>
      <s:key name="eventSorting">desc</s:key>
      <s:key name="isBatchModeSearch">0</s:key>
      <s:key name="isDone">1</s:key>
      <s:key name="isEventsPreviewEnabled">0</s:key>
      <s:key name="isFailed">0</s:key>
      <s:key name="isFinalized">0</s:key>
      <s:key name="isPaused">0</s:key>
      <s:key name="isPreviewEnabled">0</s:key>
      <s:key name="isRealTimeSearch">0</s:key>
      <s:key name="isRemoteTimeline">0</s:key>
      <s:key name="isSaved">0</s:key>
      <s:key name="isSavedSearch">0</s:key>
      <s:key name="isTimeCursored">0</s:key>
      <s:key name="isZombie">0</s:key>
      <s:key name="keywords"></s:key>
      <s:key name="label"></s:key>
      <s:key name="normalizedSearch"></s:key>
      <s:key name="numPreviews">0</s:key>
      <s:key name="optimizedSearch">index=sample_idx | stats count</s:key>
      <s:key name="pid">9035</s:key>
      <s:key name="pid">9035</s:key>
      <s:key name="priority">5</s:key>
      <s:key name="provenance"></s:key>
      <s:key name="remoteSearch"></s:key>
      <s:key name="reportSearch">index=sample_idx | stats count</s:key>
      <s:key name="resultCount">5</s:key>
      <s:key name="resultIsStreaming">0</s:key>
      <s:key name="resultPreviewCount">5</s:key>
      <s:key name="runDuration">0.015</s:key>
      <s:key name="sampleRatio">1</s:key>
      <s:key name="sampleSeed">0</s:key>
      <s:key name="scanCount">0</s:key>
      <s:key name="searchCanBeEventType">0</s:key>
      <s:key name="searchTotalBucketsCount">0</s:key>
      <s:key name="searchTotalEliminatedBucketsCount">0</s:key>
      <s:key name="sid">1528783903.136065</s:key>
      <s:key name="statusBuckets">0</s:key>
      <s:key name="ttl">300</s:key>
      <s:key name="performance">
        <s:dict>
          <s:key name="command.head">
            <s:dict>
              <s:key name="duration_secs">0.001</s:key>
              <s:key name="invocations">1</s:key>
              <s:key name="input_count">35</s:key>
              <s:key name="output_count">5</s:key>
            </s:dict>
          </s:key>
          <s:key name="command.inputlookup">
            <s:dict>
              <s:key name="duration_secs">0.001</s:key>
              <s:key name="invocations">1</s:key>
              <s:key name="input_count">0</s:key>
              <s:key name="output_count">172</s:key>
            </s:dict>
          </s:key>
          <s:key name="command.stats">
            <s:dict>
              <s:key name="duration_secs">0.001</s:key>
              <s:key name="invocations">1</s:key>
              <s:key name="input_count">0</s:key>
              <s:key name="output_count">35</s:key>
            </s:dict>
          </s:key>
          <s:key name="dispatch.check_disk_usage">
            <s:dict>
              <s:key name="duration_secs">0.001</s:key>
              <s:key name="invocations">1</s:key>
            </s:dict>
          </s:key>
          <s:key name="dispatch.createdSearchResultInfrastructure">
            <s:dict>
              <s:key name="duration_secs">0.001</s:key>
              <s:key name="invocations">1</s:key>
            </s:dict>
          </s:key>
          <s:key name="dispatch.evaluate">
            <s:dict>
              <s:key name="duration_secs">0.001</s:key>
              <s:key name="invocations">1</s:key>
            </s:dict>
          </s:key>
          <s:key name="dispatch.evaluate.head">
            <s:dict>
              <s:key name="duration_secs">0.001</s:key>
              <s:key name="invocations">1</s:key>
            </s:dict>
          </s:key>
          <s:key name="dispatch.evaluate.inputlookup">
            <s:dict>
              <s:key name="duration_secs">0.001</s:key>
              <s:key name="invocations">1</s:key>
            </s:dict>
          </s:key>
          <s:key name="dispatch.evaluate.stats">
            <s:dict>
              <s:key name="duration_secs">0.001</s:key>
              <s:key name="invocations">1</s:key>
            </s:dict>
          </s:key>
          <s:key name="dispatch.optimize.FinalEval">
            <s:dict>
              <s:key name="duration_secs">0.001</s:key>
              <s:key name="invocations">1</s:key>
            </s:dict>
          </s:key>
          <s:key name="dispatch.optimize.matchReportAcceleration">
            <s:dict>
              <s:key name="duration_secs">0.004</s:key>
              <s:key name="invocations">1</s:key>
            </s:dict>
          </s:key>
          <s:key name="dispatch.optimize.optimization">
            <s:dict>
              <s:key name="duration_secs">0.006</s:key>
              <s:key name="invocations">1</s:key>
            </s:dict>
          </s:key>
          <s:key name="dispatch.optimize.reparse">
            <s:dict>
              <s:key name="duration_secs">0.001</s:key>
              <s:key name="invocations">1</s:key>
            </s:dict>
          </s:key>
          <s:key name="dispatch.optimize.toJson">
            <s:dict>
              <s:key name="duration_secs">0.001</s:key>
              <s:key name="invocations">1</s:key>
            </s:dict>
          </s:key>
          <s:key name="dispatch.optimize.toSpl">
            <s:dict>
              <s:key name="duration_secs">0.001</s:key>
              <s:key name="invocations">1</s:key>
            </s:dict>
          </s:key>
          <s:key name="dispatch.writeStatus">
            <s:dict>
              <s:key name="duration_secs">0.007</s:key>
              <s:key name="invocations">7</s:key>
            </s:dict>
          </s:key>
          <s:key name="startup.configuration">
            <s:dict>
              <s:key name="duration_secs">0.089</s:key>
              <s:key name="invocations">1</s:key>
            </s:dict>
          </s:key>
          <s:key name="startup.handoff">
            <s:dict>
              <s:key name="duration_secs">0.003</s:key>
              <s:key name="invocations">1</s:key>
            </s:dict>
          </s:key>
        </s:dict>
      </s:key>
      <s:key name="fieldMetadataStatic">
        <s:dict>
          <s:key name="Description">
            <s:dict>
              <s:key name="type">unknown</s:key>
              <s:key name="groupby_rank">0</s:key>
            </s:dict>
          </s:key>
        </s:dict>
      </s:key>
      <s:key name="fieldMetadataResults">
        <s:dict>
          <s:key name="Description">
            <s:dict>
              <s:key name="type">unknown</s:key>
              <s:key name="groupby_rank">0</s:key>
            </s:dict>
          </s:key>
        </s:dict>
      </s:key>
      <s:key name="messages">
        <s:dict/>
      </s:key>
      <s:key name="request">
        <s:dict>
          <s:key name="search">index=sample_idx | stats count</s:key>
        </s:dict>
      </s:key>
      <s:key name="runtime">
        <s:dict>
          <s:key name="auto_cancel">0</s:key>
          <s:key name="auto_pause">0</s:key>
        </s:dict>
      </s:key>
      <s:key name="eai:acl">
        <s:dict>
          <s:key name="perms">
            <s:dict>
              <s:key name="read">
                <s:list>
                  <s:item>rest_poc</s:item>
                </s:list>
              </s:key>
              <s:key name="write">
                <s:list>
                  <s:item>rest_poc</s:item>
                </s:list>
              </s:key>
            </s:dict>
          </s:key>
          <s:key name="owner">rest_poc</s:key>
          <s:key name="modifiable">1</s:key>
          <s:key name="sharing">global</s:key>
          <s:key name="app">search</s:key>
          <s:key name="can_write">1</s:key>
          <s:key name="ttl">300</s:key>
        </s:dict>
      </s:key>
      <s:key name="searchProviders">
        <s:list/>
      </s:key>
    </s:dict>
  </content>
</entry>
0 Karma
1 Solution

poete
Builder

OK, so it looks more like a python problem than a Splunk one.

So I created a file (from the content of you post) with the xml content.

The below code displays the value of the node with name=dispatchState. You will need to adapt it, by changing the parse call with parsreString, I think.

#!/usr/bin/python
# -*- coding: utf-8 -*-

import xml.dom.minidom

dom=xml.dom.minidom.parse("test.xml")

keys=dom.getElementsByTagName('s:key')

for n in keys:
    if n.getAttribute('name') == 'dispatchState':
        print n.childNodes[0].nodeValue

View solution in original post

manikyasandeepg
Explorer

Thanks for this post. I tried for my scenario and it works. I was looking for extracting disabled status from the alerts and I used parseString instead of parse and "response.text" is the POST response.

Example:

response=request.post(http://xxx.x.x.x:8089/servicesNS/nobody/search/saved/searches/$ALERT$/disable)

import xml.dom.minidom
dom=xml.dom.minidom.parseString(response.text)
keys=dom.getElementsByTagName('s:key')
for n in keys:
if n.getAttribute('name') == 'disabled':
   Status=print(n.childNodes[0].nodeValue)
0 Karma

poete
Builder

OK, so it looks more like a python problem than a Splunk one.

So I created a file (from the content of you post) with the xml content.

The below code displays the value of the node with name=dispatchState. You will need to adapt it, by changing the parse call with parsreString, I think.

#!/usr/bin/python
# -*- coding: utf-8 -*-

import xml.dom.minidom

dom=xml.dom.minidom.parse("test.xml")

keys=dom.getElementsByTagName('s:key')

for n in keys:
    if n.getAttribute('name') == 'dispatchState':
        print n.childNodes[0].nodeValue

poete
Builder

Hi,

are you getting data out of Splunk and rying to read it in python?

Or trying to get data in?

0 Karma

sunilpanda023
Path Finder

Yes, getting data out of Splunk and read it in python, and I have to wait till the dispatch state is DONE before I could get results of that specific job sid.
I am also interested in few other metrics to extract.

I am trying to parse the xml using xml.dom minidom and lxml.etree packages but not successful as there is some Atom.xsl styling is used.

Any help is appreciated.

Thanks,
Sunil Panda

0 Karma
Get Updates on the Splunk Community!

Splunk Training for All: Meet Aspiring Cybersecurity Analyst, Marc Alicea

Splunk Education believes in the value of training and certification in today’s rapidly-changing data-driven ...

The Splunk Success Framework: Your Guide to Successful Splunk Implementations

Splunk Lantern is a customer success center that provides advice from Splunk experts on valuable data ...

Investigate Security and Threat Detection with VirusTotal and Splunk Integration

As security threats and their complexities surge, security analysts deal with increased challenges and ...