Splunk's response time is quite slow when I use the lookup script shown below. The web service that the script calls is fast, most of the time under 1 ms per request. What should I look at to improve performance in Splunk?
According to the job inspector, "dispatch.preview" takes most of the time:
Job inspector results:
Duration (seconds)  Component                         Invocations  Input count  Output count
0.003               command.addinfo                   3            9,352        9,352
0.003               command.fields                    3            9,352        9,352
0.005               command.lookup                    1            3,240        3,240
0.07                command.prestats                  3            9,352        3,240
0.327               command.remotetl                  3            9,352        9,352
0.899               command.search                    3            -            9,352
0.569               command.search.kv                 1            -            -
0.243               command.search.typer              3            9,352        9,352
0.067               command.search.rawdata            1            -            -
0.009               command.search.calcfields         1            9,352        9,352
0.006               command.search.filter             1            -            -
0.006               command.search.index              3            -            -
0.002               command.search.tags               3            9,352        9,352
0.001               command.search.fieldalias         1            9,352        9,352
0.001               command.search.lookups            1            9,352        9,352
0.136               command.stats                     1            3,240        3,240
0.048               dispatch.createProviderQueue      1            -            -
0.169               dispatch.evaluate                 1            -            -
0.091               dispatch.evaluate.lookup          1            -            -
0.077               dispatch.evaluate.search          1            -            -
0.001               dispatch.evaluate.stats           1            -            -
1.293               dispatch.fetch                    4            -            -
**461.84            dispatch.preview                  1            -            -**
0.013               dispatch.process_remote_timeline  1            110,422      9,352
1.292               dispatch.stream.local             3            -            -
0.022               dispatch.timeline                 4            -            -
Splunk Query:
earliest=-1d@d latest=@d index="si_usage" | stats sum(count) by entity | lookup entityid_lookup entity as entity OUTPUT entity_title as entity_title
Script:
import csv
import logging
import sys
import time
import urllib
import urllib2

from lxml import etree


def lookup(entity):
    """Return the title for an entity from the web service, or a placeholder."""
    try:
        # Quote the entity so special characters cannot break the query string.
        url = ("http://fqdn/rex/select/?q=" + urllib.quote(entity) +
               "&version=2.2&start=0&rows=10&indent=on&fl=title&application=agent.007")
        f = urllib2.urlopen(url)
        try:
            doc = etree.XML(f.read())
        finally:
            f.close()
        # xpath() returns a list of matches; keep only the first title.
        titles = doc.xpath("//str[@name='title']/text()")
        if not titles:
            return "Entity Title Not Found"
        return titles[0]
    except Exception:
        # Log the failure instead of hiding it, and leave the field empty.
        logging.exception('Lookup failed for entity %s', entity)
        return ''


def main():
    logging.basicConfig(filename='lookup.log', level=logging.INFO,
                        format='%(asctime)s %(message)s')
    start_time = time.time()
    logging.info('Script started')

    # The names of the input and output fields arrive as arguments.
    entity = sys.argv[1]
    entity_title = sys.argv[2]

    r = csv.reader(sys.stdin)
    w = None
    header = []
    first = True

    for line in r:
        if first:
            # The first row is the CSV header; both fields must be present.
            header = line
            if entity not in header or entity_title not in header:
                print "entity and entity_title fields must exist in CSV data"
                sys.exit(0)
            csv.writer(sys.stdout).writerow(header)
            w = csv.DictWriter(sys.stdout, header)
            first = False
            continue

        # Map header names to values, padding short rows with empty strings.
        result = {}
        for i, name in enumerate(header):
            result[name] = line[i] if i < len(line) else ''

        if len(result[entity]) and len(result[entity_title]):
            # Both fields are already filled in; pass the row through.
            w.writerow(result)
        elif len(result[entity]):
            # Only the entity is known; ask the web service for its title.
            result[entity_title] = lookup(result[entity])
            w.writerow(result)

    end_time = time.time() - start_time
    logging.info('Script finished in %s seconds', end_time)


if __name__ == '__main__':
    main()
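
To double-check the "under 1 ms per request" figure from inside the script rather than from the web service side, a timing wrapper along these lines could be used. This is only a sketch under assumptions: `TimedLookup` and `fake_lookup` are hypothetical names that are not part of the script above, and `fake_lookup` stands in for the real HTTP call so the example runs without network access.

```python
import time


class TimedLookup(object):
    """Wrap a lookup callable and record how long each call takes."""

    def __init__(self, func):
        self.func = func
        self.calls = 0
        self.total = 0.0
        self.slowest = 0.0

    def __call__(self, entity):
        started = time.time()
        try:
            return self.func(entity)
        finally:
            # Record the timing even if the wrapped call raises.
            elapsed = time.time() - started
            self.calls += 1
            self.total += elapsed
            self.slowest = max(self.slowest, elapsed)

    def summary(self):
        average = self.total / self.calls if self.calls else 0.0
        return {"calls": self.calls, "total": self.total,
                "average": average, "slowest": self.slowest}


def fake_lookup(entity):
    # Hypothetical stand-in for the real HTTP lookup (assumption).
    return "Title for " + entity


timed = TimedLookup(fake_lookup)
titles = [timed(e) for e in ["a1", "b2", "c3"]]
stats = timed.summary()
```

Wrapping the real `lookup` function the same way and logging `summary()` at the end of `main()` would show how many requests a single invocation makes and how the total compares with the duration reported by the job inspector.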