My background: heavy Unix and shell, numerous programming languages, but new to Python and Splunk.
The intent of this script is to archive a CSV file into a separate directory with a date/time stamp for retention.
The problem is that Splunk seems to run my script twice. First it runs BEFORE "outputcsv" has even started creating the output CSV file, then again after the file has been created. I can live with it in this script, but for future Python scripts this is a problem. I need to understand why my script gets called twice in the following search string.
index=summary | outputcsv myfile | archcsv -c myfile -a temp # Should only run one time at end.
My Python search script looks for "myfile.csv" in /apps/splunk/var/run/splunk and moves it to the ../temp folder.
If there happens to be a myfile.csv in .../var/run/splunk when the search STARTS, the script moves it FIRST; the script is then called again once the new myfile.csv has been created.
I know that Splunk is NOT Unix, but I feel that the "pipe" should NOT call the archcsv.py script until AFTER outputcsv has finished creating its myfile.csv file.
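To see what differs between the two invocations, a small logging helper could be dropped into the script. This is only a sketch: log_invocation and the log path are names I made up for illustration, and it assumes nothing about why Splunk calls the script twice. It just records the time, process ID, arguments and number of result rows for each call.

```python
import json
import os
import time


def log_invocation(log_path, argv, row_count):
    """Append one JSON line describing this invocation of the script.

    log_path is a hypothetical location chosen by the caller; the helper
    makes no assumption about how or why the script was started.
    """
    record = {
        "time": time.strftime("%Y-%m-%d %H:%M:%S"),
        "pid": os.getpid(),
        "argv": list(argv),
        "rows": row_count,
    }
    # Append mode, so every invocation adds its own line to the log.
    with open(log_path, "a") as handle:
        handle.write(json.dumps(record) + "\n")
    return record
```

In the script it would be called right after getOrganizedResults(), for example log_invocation('/tmp/archcsv_calls.log', sys.argv, len(results)), where the path is again just an example. Comparing the two logged lines should show whether the first call arrives with zero rows or with different arguments.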
Local commands.conf entry:
[pydebug]
type = python
filename = pydebug.py
streaming = false
retainsevents = true
UNIX Directory info with Comments:
[splunk]$ pwd
/apps/links/temp
[splunk]$ ls -ltr
[splunk]$ ls -altr /apps/splunk/var/run/splunk/csvstuff*
-rw------- 1 splunk users 12734095 Aug 4 13:08 /apps/splunk/var/run/splunk/csvstuff.csv
[splunk]$ # Now I will run the search, outputcsv and archive utility.
[splunk]$ # For some reason, it will copy the Existing csvstuff.csv and then the new one.
[splunk]$ pwd
/apps/links/temp
[splunk]$ ls -altr
total 22596
drwxr-xr-x 3 splunk users 4096 Jul 31 15:58 ..
-rw-r--r-- 1 splunk users 12734095 Aug 4 13:08 csvstuff_20140804131017.csv
-rw-r--r-- 1 splunk users 10392108 Aug 4 13:10 csvstuff_20140804131021.csv
drwxr-xr-x 2 splunk users 4096 Aug 4 13:10 .
Python script:
#!/usr/bin/python
import sys, getopt, os
import splunk.Intersplunk
results,dummyresults,settings = splunk.Intersplunk.getOrganizedResults()
def main(argv):
    line = ''
    aarg=0
    carg=0
    archfold = 'subdir'
    csvfile = 'default.csv'
    options, remainder = getopt.getopt(sys.argv[1:], 'c:a:', ['csvfile=',
                                                              'archfold='])
    for opt, arg in options:
        if opt in ('-c', '--csvfile'):
            carg=1
            csvfile = arg
        elif opt in ('-a', '--archfold'):
            aarg=1
            archfold = arg
    sdir='/apps/splunk/var/run/splunk/'
    adir='/apps/links/' + archfold + '/'
    sfile=sdir + csvfile + '.csv'
    afile=adir + csvfile + '_`date +"%Y%m%d%H%M%S"`.csv'
    if carg == 0 or aarg == 0:
        sys.exit(1)
    move='mv ' + sfile + ' ' + afile
    line='if [ -e ' + sfile + ' ]; then ' + move + '; fi'
    os.system(line)
    line='chmod 644 ' + afile
    os.system(line)
    newresults = []
    oldresult = None
    for result in results:
        if result != oldresult:
            newresults.append(result)
            oldresult = result
    splunk.Intersplunk.outputResults(newresults)
if __name__ == "__main__":
    main(sys.argv[1:])
[splunk]$ # now, notice the first file above is from BEFORE I ran the search command
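The move step in the script above shells out to mv and chmod. The same step could be done in pure Python, which avoids a second call to date when chmod runs. A minimal sketch, assuming the same naming scheme as above: archive_csv and its parameters are hypothetical names, not part of the script, and it checks for a regular file where the shell version only tests that the path exists.

```python
import os
import shutil
import time


def archive_csv(src_dir, archive_dir, name, now=None):
    """Move src_dir/name.csv to archive_dir/name_YYYYmmddHHMMSS.csv.

    Returns the archive path, or None if the source file does not exist.
    'now' is an optional time.struct_time, mainly useful for testing.
    """
    src = os.path.join(src_dir, name + ".csv")
    if not os.path.isfile(src):
        return None  # nothing to archive on this invocation
    stamp = time.strftime("%Y%m%d%H%M%S", now or time.localtime())
    dst = os.path.join(archive_dir, "%s_%s.csv" % (name, stamp))
    shutil.move(src, dst)
    os.chmod(dst, 0o644)  # same permissions the chmod 644 call sets
    return dst
```

With the paths from my setup this would be archive_csv('/apps/splunk/var/run/splunk', '/apps/links/temp', 'csvstuff'). It does not change how often Splunk invokes the script; it only makes each invocation a no-op when there is no file to move.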