Some background....
I want to have an alert which triggers when the results of two independent searches both have data:
Search 1 = a simple search which will be contained within the alert/saved search
Search 2 = triggered with the results of Search 1, using a custom search command written in Python.
The script will perform the secondary search and merge its results into the output of Search 1.
Results
When running this search from the Splunk web console:
sourcetype="holdingRegisters" SPLUNK=StationStatusCoil | dedup station | stationstartcheck __EXECUTE__
the logfiles show that the script is run twice by Splunk for some reason.
So, to my question:
Why is it running twice when performing a search in Splunk?
Note the script is just a skeleton of what I plan to do, but the main components are there. It works, but for now it just fills the original data with junk; I plan to fill it with the results of the second search instead.
I am using Splunk 4.3 and my custom search command is below.
import sys
import time

import splunk.Intersplunk
import splunk.auth
import splunk.search

start = time.time()

# open a logfile to trace each invocation of the command
f = open('/tmp/stationStartCheck.log', 'w+')
f.write(str(time.time() - start) + ' - Starting\n')
f.write(str(time.time() - start) + ' - argv length ' + str(len(sys.argv)) + '\n')

# answer Splunk's capability query (requires supports_getinfo = true in commands.conf)
(isgetinfo, sys.argv) = splunk.Intersplunk.isGetInfo(sys.argv)
if isgetinfo:
    splunk.Intersplunk.outputInfo(False, False, True, False, None, True)
    # outputInfo automatically calls sys.exit()

# check existing data to determine which station we are dealing with
try:
    f.write(str(time.time() - start) + ' - Getting results from Splunk\n')
    results = splunk.Intersplunk.readResults(None, None, True)
    f.write(str(time.time() - start) + ' - Success\n')
    f.write(str(time.time() - start) + ' - Size of resultset ' + str(len(results)) + '\n')
except Exception, e:
    splunk.Intersplunk.generateErrorResults("Unhandled exception: %s" % (e,))

# perform secondary search and output results if any
try:
    f.write(str(time.time() - start) + ' - Authenticating....\n')
    # the legacy bindings cache this session key for subsequent SDK calls
    key = splunk.auth.getSessionKey('admin', 'changeme')
    f.write(str(time.time() - start) + ' - Sending Search....\n')
    my_job = splunk.search.dispatch(
        'search sourcetype="holdingRegisters" SPLUNK=StationStatusCoil | dedup station',
        namespace='search', earliestTime='-1h', maxEvents=10)
    # poll until the dispatched job completes
    while not my_job.isDone:
        f.write(str(time.time() - start) + ' - Waiting for results....\n')
        time.sleep(1)
    f.write(str(time.time() - start) + ' - Results returned ' + str(my_job.resultCount) + '\n')
    for result in my_job.results:
        f.write(str(result['station']) + '\n')
    # placeholder: tag each original row until the real merge is written
    for i in range(len(results)):
        f.write(str(time.time() - start) + ' - Adding field to original result set\n')
        results[i]['newField'] = 'uno'
    splunk.Intersplunk.outputResults(results)
    my_job.cancel()
except Exception, e:
    splunk.Intersplunk.generateErrorResults("Unhandled exception: %s" % (e,))

# close logfile
f.close()
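For reference, the merge I eventually plan to put in place of that placeholder loop would be along these lines (untested sketch; 'status' stands in for whatever field the second search actually returns):

    # index the secondary results by station, then copy a field
    # onto the matching rows of the original result set
    secondary = {}
    for result in my_job.results:
        secondary[str(result['station'])] = result

    for row in results:
        station = row.get('station')
        if station in secondary:
            # 'status' is a hypothetical field name from the second search
            row['secondaryStatus'] = str(secondary[station]['status'])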
So when performing this search in the Splunk search bar:
sourcetype="holdingRegisters" SPLUNK=StationStatusCoil | dedup station | stationstartcheck __EXECUTE__
the logfile has the following output:
tail: /tmp/stationStartCheck.log: file truncated
Starting
argv length 2
0.00281405448914 - Getting results from Splunk
0.00357985496521 - Success
0.00358605384827 - Size of resultset 4
0.00359296798706 - Authenticating....
0.01722407341 - Sending Search....
0.302284002304 - Waiting for results....
1.42954897881 - Results returned 4
Station1
Station2
Station3
Station4
1.47054505348 - Adding field to original result set
1.47056603432 - Adding field to original result set
1.4705760479 - Adding field to original result set
1.47058486938 - Adding field to original result set
tail: /tmp/stationStartCheck.log: file truncated
Starting
argv length 2
0.000216960906982 - Getting results from Splunk
0.000913143157959 - Success
0.000919103622437 - Size of resultset 4
0.00092601776123 - Authenticating....
0.0146579742432 - Sending Search....
0.293842077255 - Waiting for results....
1.30643010139 - Results returned 4
Station1
Station2
Station3
Station4
1.32303500175 - Adding field to original result set
1.32305502892 - Adding field to original result set
1.32306408882 - Adding field to original result set
1.32308411598 - Adding field to original result set
Any ideas how to stop it running twice?
Very nice example. I can now understand and am able to write some Python scripts using the Splunk SDK. Thanks phoenixdigital.
I have the same issue as posted here. What I was able to notice is that if you do not use dedup then it works fine. Now I have a case where I need to use the dedup command. Please post here if you find the solution.
Thank you for the suggestion, but unfortunately that did not work. The script still runs twice.
However, on closer inspection of a similar script I noticed that on the first pass it received 10 data points, then on the second pass it received 17 data points.
So it appears Splunk is splitting up the results, which is a shame, as I would prefer my script receive all data points at once for a given search.
Digging deeper again, I can see that Splunk sends through one set of results, then as it collects more it sends the original set plus more results again. Below is a log of the _time and RRP values of the two batches of data sent to the custom search.
Size of resultset 10
Record for _time, RRP 2012-03-07 11:00:00, 19.43165
Record for _time, RRP 2012-03-07 11:05:00, 19.72373
Record for _time, RRP 2012-03-07 11:10:00, 20.4553
Record for _time, RRP 2012-03-07 11:15:00, 20.44109
Record for _time, RRP 2012-03-07 11:20:00, 20.44642
Record for _time, RRP 2012-03-07 11:25:00, 20.14813
Record for _time, RRP 2012-03-07 11:30:00, 19.8667
Record for _time, RRP 2012-03-07 11:35:00, 19.60739
Record for _time, RRP 2012-03-07 11:40:00, 19.40553
Record for _time, RRP 2012-03-07 11:45:00, 19.48035
Size of resultset 17
Record for _time, RRP 2012-03-07 10:25:00, 17.72382
Record for _time, RRP 2012-03-07 10:30:00, 18.189
Record for _time, RRP 2012-03-07 10:35:00, 17.80982
Record for _time, RRP 2012-03-07 10:40:00, 18.44075
Record for _time, RRP 2012-03-07 10:45:00, 18.7983
Record for _time, RRP 2012-03-07 10:50:00, 19.32571
Record for _time, RRP 2012-03-07 10:55:00, 19.36478
Record for _time, RRP 2012-03-07 11:00:00, 19.43165
Record for _time, RRP 2012-03-07 11:05:00, 19.72373
Record for _time, RRP 2012-03-07 11:10:00, 20.4553
Record for _time, RRP 2012-03-07 11:15:00, 20.44109
Record for _time, RRP 2012-03-07 11:20:00, 20.44642
Record for _time, RRP 2012-03-07 11:25:00, 20.14813
Record for _time, RRP 2012-03-07 11:30:00, 19.8667
Record for _time, RRP 2012-03-07 11:35:00, 19.60739
Record for _time, RRP 2012-03-07 11:40:00, 19.40553
Record for _time, RRP 2012-03-07 11:45:00, 19.48035
And looking at the original script, it receives the same set of data on both runs:
sourcetype="holdingRegisters" SPLUNK=StationStatusCoil | dedup station | stationstartcheck __EXECUTE__
Hi phoenixdigital,
you can set the following in commands.conf:
supports_getinfo = false
streaming = false
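For example, the full stanza (in $SPLUNK_HOME/etc/apps/<app>/local/commands.conf) might look something like this, assuming the command is named stationstartcheck and the script file is stationStartCheck.py:

    [stationstartcheck]
    filename = stationStartCheck.py
    supports_getinfo = false
    streaming = false

With streaming = false, Splunk invokes the script once over the complete result set instead of once per batch of preview results.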
If you need to have supports_getinfo enabled, you can add the following to your script:

    (isgetinfo, sys.argv) = splunk.Intersplunk.isGetInfo(sys.argv)
    if isgetinfo:
        splunk.Intersplunk.outputInfo(False, False, True, False, None)
    else:
        pass  # do your thing

cheers
Hi, is there an equivalent solution for splunklib?
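A possible splunklib equivalent (untested sketch): subclass EventingCommand rather than StreamingCommand, since an eventing command is not declared as streamable. The class name and placeholder field below are assumptions, not from the original script:

    import sys
    from splunklib.searchcommands import dispatch, Configuration, EventingCommand

    @Configuration()
    class StationStartCheckCommand(EventingCommand):
        # transform() receives the records piped in from the preceding search
        def transform(self, records):
            for record in records:
                record['newField'] = 'uno'  # placeholder, as in the original script
                yield record

    dispatch(StationStartCheckCommand, sys.argv, sys.stdin, sys.stdout, __name__)

The matching commands.conf stanza may also need chunked = true, depending on which protocol version your Splunk release uses.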