Dashboards & Visualizations

python SDK parse xml output

EricPartington
Communicator

I am attempting to parse the results from a search using the python sdk.
I can successfully run a search against splunk and return the results via xml output.

#!c:/Python26/python.exe -u
import splunk.client as client
import sys, datetime
from pprint import pprint
HOST = "1.2.3.4"
PORT = 8090
USERNAME = "admin"
PASSWORD = "aaaaaaaa"

service = client.connect(
   host=HOST,
   port=PORT,
   username=USERNAME,
   password=PASSWORD)
oldtime = datetime.datetime.now()
search = 'search  index="firewall_rules" table=rule disable=no | dedup name,host |table    host,name,rulegroup,appcode,_time | head 2' 
job = service.jobs.create(search, exec_mode="blocking")
#job = service.jobs.create(search)

while True: 
  stats = job.read('isDone') 
  if stats['isDone'] == str(1): 
    break 
content = str(job.results(output_mode='xml'))
print "Output: %s" % content

Output: <?xml version='1.0' encoding='UTF-8'?>
<results preview='0'>
<meta>
<fieldOrder>
<field>host</field>
<field>name</field>
<field>rulegroup</field>
<field>appcode</field>
<field>_time</field>
</fieldOrder>
</meta>
    <result offset='0'>
            <field k='host'>
                    <value><text>FW1</text></value>
            </field>
            <field k='name'>
                    <value><text>&apos;Deny All&apos;</text></value>
            </field>
            <field k='rulegroup'>
                    <value><text>&apos;&apos;</text></value>
            </field>
            <field k='_time'>
                    <value><text>2011-12-03T03:24:02.000-05:00</text></value>
            </field>
    </result>
    <result offset='1'>
            <field k='host'>
                    <value><text>FW2</text></value>
            </field>
            <field k='name'>
                    <value><text>WEB1</text></value>
            </field>
            <field k='rulegroup'>
                    <value><text>WEB1</text></value>
            </field>
            <field k='appcode'>
                    <value><text>ABC123</text></value>
            </field>
            <field k='_time'>
                    <value><text>2011-12-03T03:24:02.000-05:00</text></value>
            </field>
    </result>
</results>
Elapsed Time: 0:00:04.500000

I have attempted to parse the xml output into a structure that I can use to insert into another system (mysql). I have tried xmlminidom and etree without success, is there an easier way to access the values with python?

Tags (4)
0 Karma
1 Solution

EricPartington
Communicator

Here is my latest attempt at using the sdk, python.. SplunkDev helped out a great deal with this.

#!c:/Python26/python.exe -u
import splunk.client as client
import splunk.results as results
import sys, datetime
from pprint import pprint

HOST = "localhost"
PORT = 8090
USERNAME = "admin"
PASSWORD = "abc123"

service = client.connect(
    host=HOST,
    port=PORT,
    username=USERNAME,
    password=PASSWORD)
# ----------------------------------------
oldtime = datetime.datetime.now()

search = 'search  index="sidewinder_rules" table=rule disable=no earliest=-8d| dedup name,cluster |table cluster,name,rulegroup,sourcetype,appcode,_time' 

job = service.jobs.create(search, exec_mode="blocking", max_count=5000)
job_results = job.results(count=0, output_mode="xml")
reader = results.ResultsReader(job_results)

num_results = 0;
for kind, result in reader:
    if kind == results.RESULT:
        host = result.get("cluster", 0)
        rulegroup = result.get("rulegroup", 0)
        name = result.get("name", 0)
        sourcetype = result.get("sourcetype", 0)
        appcode = result.get("appcode", 0)
        pprint(host)

newtime = datetime.datetime.now()
print "Elapsed Time: %s" % (newtime - oldtime)

Still have to make some improvements as suggested by splunkDev but it functions for my needs.
It also gets more than the default 100 rows when returning results (count=0) and sets a limit of 5000 rows.

View solution in original post

to914868
New Member

You can change the output mode to csv which will parse it for you.

import splunklib.client as client
import splunklib.results as results
import csv
service = client.connect(
 host=HOST,
 port=PORT,
 username=un,
 password=pwd
query= """search {+enter your query here}​"""

results_kwargs = {
 "earliest_time": "-30min",
# or "earliest_time": datetime.datetime(2015, 6, 29).isoformat()
 "latest_time": "now",
 "search_mode": "normal",
 "output_mode": "csv"
}
oneshotsearch_results = service.jobs.oneshot(query, **results_kwargs)
f=open('myresults.csv', 'w')
f.write(oneshotsearch_results.read())
f.close()​
0 Karma

EricPartington
Communicator

Here is my latest attempt at using the sdk, python.. SplunkDev helped out a great deal with this.

#!c:/Python26/python.exe -u
import splunk.client as client
import splunk.results as results
import sys, datetime
from pprint import pprint

HOST = "localhost"
PORT = 8090
USERNAME = "admin"
PASSWORD = "abc123"

service = client.connect(
    host=HOST,
    port=PORT,
    username=USERNAME,
    password=PASSWORD)
# ----------------------------------------
oldtime = datetime.datetime.now()

search = 'search  index="sidewinder_rules" table=rule disable=no earliest=-8d| dedup name,cluster |table cluster,name,rulegroup,sourcetype,appcode,_time' 

job = service.jobs.create(search, exec_mode="blocking", max_count=5000)
job_results = job.results(count=0, output_mode="xml")
reader = results.ResultsReader(job_results)

num_results = 0;
for kind, result in reader:
    if kind == results.RESULT:
        host = result.get("cluster", 0)
        rulegroup = result.get("rulegroup", 0)
        name = result.get("name", 0)
        sourcetype = result.get("sourcetype", 0)
        appcode = result.get("appcode", 0)
        pprint(host)

newtime = datetime.datetime.now()
print "Elapsed Time: %s" % (newtime - oldtime)

Still have to make some improvements as suggested by splunkDev but it functions for my needs.
It also gets more than the default 100 rows when returning results (count=0) and sets a limit of 5000 rows.

rroberts
Splunk Employee
Splunk Employee

Great! http://dev.splunk.com is cool.

0 Karma

rroberts
Splunk Employee
Splunk Employee

There is an example here:
http://wiki.splunk.com/Dev:PySearch

You might find useful using xmlDoc

Get Updates on the Splunk Community!

What's New in Splunk Enterprise 9.4: Features to Power Your Digital Resilience

Hey Splunky People! We are excited to share the latest updates in Splunk Enterprise 9.4. In this release we ...

Take Your Breath Away with Splunk Risk-Based Alerting (RBA)

WATCH NOW!The Splunk Guide to Risk-Based Alerting is here to empower your SOC like never before. Join Haylee ...

SignalFlow: What? Why? How?

What is SignalFlow? Splunk Observability Cloud’s analytics engine, SignalFlow, opens up a world of in-depth ...