I am writing my own search command as a Python script. To start with, I read the results and write them to a log file. I see the same results being logged twice, and when I printed os.getpid() the PID was different in the two calls.
I set streaming = true in commands.conf; the script is still called twice, but this time the results were empty in one of the calls.
Please let me know what I should do to fix this problem.
import sys, os, gzip, csv, time, traceback
import splunk.Intersplunk

log = open(os.path.join(os.environ["SPLUNK_HOME"], 'var', 'log', 'splunk', 'my_results.log'), 'a')

def logger(string):
    log.write(string + "\n")
    return 0

# getOrganizedResults() returns a (results, dummyresults, settings) tuple
results, dummyresults, settings = splunk.Intersplunk.getOrganizedResults()
logger("called " + str(os.getpid()))
logger(str(results))
splunk.Intersplunk.outputResults(results)
Hm, if the results are taking a while you might get intermediate results. In the input your script receives on stdin, line 7 of the header contains "preview:1", indicating the search has not yet completed.
Something like

import codecs
sys.stdin = codecs.getreader('utf-8')(sys.stdin)
# check whether header line 7 is "preview:1"
# (row is the current header line read from stdin)
if str(row).strip("[']") == "preview:1":
    exit()
Does this help with your problem?
Thank you @dominiquevocat. This fixed my problem; I had never noticed the preview flag in the header. My code now has this (in case it helps anyone else) to prevent the script from running twice:

stdin_wrapper = Reader(sys.stdin)
buf, settings = read_input(stdin_wrapper, has_header=True)
if settings['preview'] == '1' or settings['preview'] == 1:
    sys.exit()
events = csv.DictReader(buf)
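For anyone without those wrapper helpers (Reader and read_input come from a helper module, not the standard library), here is a minimal self-contained sketch of the same check. It assumes the classic Intersplunk CSV protocol, where key:value header lines such as preview:1 arrive before a blank separator line; in the real command you would pass sys.stdin instead of the canned demo input:

```python
import io

def read_header(stream):
    """Collect 'key:value' header lines until the blank separator line."""
    settings = {}
    for line in stream:
        line = line.strip()
        if not line:  # blank line marks the end of the header section
            break
        if ':' in line:
            key, _, value = line.partition(':')
            settings[key] = value
    return settings

# demo with a canned header instead of real stdin
demo = io.StringIO("preview:1\nkeywords:foo\n\n_time,host\n")
settings = read_header(demo)
print(settings["preview"])  # prints: 1
```

In the actual script you would call read_header(sys.stdin) and sys.exit() when settings.get("preview") == "1", so only the final (complete) invocation does any work.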
Hi, is there a solution for this? I am facing the same problem. I want to post the results via HTTP, so I use a custom command, but the custom command is getting called multiple times with EXECUTE (isgetinfo false). How do I prevent this? I want the custom command to be invoked only once, when the entire result set is available.
streaming = false should do the trick, I think.
It doesn't fix this. I am facing a similar issue.
If you want just the log file, then the latest of the Splunk calls will have all the search events, so open the log file in write mode instead of append mode. That way the data won't be duplicated and you will still have all the events.
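A minimal illustration of the write-mode suggestion (the path here is a hypothetical temp-file stand-in for the real $SPLUNK_HOME/var/log/splunk location):

```python
import os, tempfile

# hypothetical demo path; the real script writes under $SPLUNK_HOME/var/log/splunk
path = os.path.join(tempfile.gettempdir(), "my_results_demo.log")

# append mode: each Splunk invocation adds another copy of the results
with open(path, "w") as log:   # start clean for the demo
    pass
with open(path, "a") as log:
    log.write("first call\n")

# write mode truncates, so only the last (complete) invocation survives
with open(path, "w") as log:
    log.write("final call\n")

with open(path) as log:
    print(log.read())  # prints: final call
```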
My issue is that I need to invoke a shell script on another host when my Python script is called, so this issue causes the remote script to be executed twice as well.
Hi, if you want, have a look at TA-xls ( https://splunkbase.splunk.com/app/1832/ ) ... while the results are not complete, Splunk says something like preview:1 in line 7 of the header on stdout. I think the wrapper I have bolted together will help you some.
Setting streaming to false is the way to prevent the command script from being called multiple times. Additionally, Splunk will call the script twice per run if the option supports_getinfo is enabled. You can either disable this option or check whether the call is a GETINFO or an EXECUTE call.

(isgetinfo, sys.argv) = splunk.Intersplunk.isGetInfo(sys.argv)
if isgetinfo:
    splunk.Intersplunk.outputInfo(False, False, True, False, None)
else:
    # ... output to file
    pass
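Under the v1 protocol, isGetInfo essentially inspects sys.argv: with supports_getinfo = true, Splunk runs the script once with __GETINFO__ and once with __EXECUTE__ as the first argument. A dependency-free sketch of that check (illustrative only; the real helper also rewrites sys.argv and handles errors):

```python
def is_getinfo_call(argv):
    """Return True for the GETINFO invocation, False for EXECUTE."""
    return len(argv) >= 2 and argv[1] == "__GETINFO__"

# argv as Splunk would pass it on each of the two invocations
print(is_getinfo_call(["update_state.py", "__GETINFO__"]))  # True
print(is_getinfo_call(["update_state.py", "__EXECUTE__"]))  # False
```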
I found that when I use the dedup command before this command, it is called twice. It looks like dedup is also a non-streaming command (that's a guess). I read about requires_preop but am still not sure how to use it. Do you have any idea how to use this property?
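For what it's worth (this is a sketch, not a tested answer to the dedup question): in commands.conf the setting pairs with streaming_preop, a streaming search fragment that Splunk runs ahead of your command. A stanza might look roughly like the following, where the preop string is purely illustrative:

```
[updatestate]
filename = update_state.py
streaming = false
requires_preop = true
streaming_preop = fields _time host
```

Check the commands.conf spec for the exact semantics before relying on this.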
It is still being called twice. I assumed the default values for streaming and supports_getinfo were false, but I set them to false explicitly in commands.conf anyway.
When I set supports_getinfo to true in commands.conf, the script is called four times.
This is how my commands.conf file looks like.
[updatestate]
filename = update_state.py
supports_getinfo = false
streaming = false
PS: I am not running a real-time search on this, if that matters.