I have a script that runs in an app on my forwarders every 12 hours. Or at least that's what it was doing until it abruptly stopped working on all forwarders on September 30th.
The script is deployed using the deployment server, so all the forwarders have identical configurations.
The inputs.conf file has five scripts configured and all work except for this one. Here is its inputs.conf entry:
```
[script://splunk/etc/apps/all/bin/chknmon.sh]
# Run every 12 hours
interval = 43200
sourcetype = nmonchk
source = script://./bin/chknmon.sh
```
The path to the chknmon.sh script is correct:
```
[/splunk/etc/apps/all/default]# ls -las /splunk/etc/apps/all/bin/chknmon.sh
4 -rwxr-xr-- 1 xyz abc 96 Oct 04 14:13 /splunk/etc/apps/all/bin/chknmon.sh
```
Splunk runs as root, and the script is owned by root.
Running the script manually produces correct output.
Other scripts in the same app (and same paths/permissions) run just fine. This one was no exception until it abruptly stopped working a few days ago. Admittedly, I've been toying around with Splunk a bit, but I have no idea what I could have done that could have affected just this one script.
I've tried bouncing both the indexer and the forwarders, but that didn't help.
I'm running 4.1.4 on the forwarders and 4.1.5 on the indexer.
Did it stop on Sept 30, or after Sept 30th? I'm just thinking that you could have a timestamp parsing / configuration issue.
There are two things that could cause date parsing issues starting Oct 1st. First, this is the first time in 2010 that the month is two digits (depending on whether your data format uses "9" or "09" for September). Anything that assumed a single-digit month could have issues with this (for example, if you use Splunk's punct pattern anywhere in a search or eventtype). Second, Oct 1st is the first time the month is the same as the two-digit year; so if your timestamp format is in any way ambiguous, and Splunk is guessing at your time format, this could cause some confusion. You may want to do a search across "All Time" and see if your events were simply given the wrong timestamp.
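To see why a date like this is genuinely ambiguous, here is a small illustrative sketch (not Splunk's actual parsing code) showing that "10/01/10" parses successfully under three different, equally plausible format strings:

```python
from datetime import datetime

# "10/01/10" is ambiguous: each field is a plausible day, month, or 2-digit year.
raw = "10/01/10"

as_mdy = datetime.strptime(raw, "%m/%d/%y")  # month/day/year
as_dmy = datetime.strptime(raw, "%d/%m/%y")  # day/month/year
as_ymd = datetime.strptime(raw, "%y/%m/%d")  # year/month/day

print(as_mdy.date())  # 2010-10-01
print(as_dmy.date())  # 2010-01-10
print(as_ymd.date())  # 2010-01-10
```

All three calls succeed without error, so any parser left to guess the format can silently pick the wrong one; that is exactly the kind of misassigned timestamp an "All Time" search will surface.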
The bottom line: if your props.conf entry for [nmonchk] doesn't have an explicit TIME_FORMAT setting, I would suggest adding one.
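As a minimal sketch, such a props.conf stanza might look like the following; the TIME_FORMAT string here is a placeholder and must be adjusted to match whatever your script actually emits:

```ini
# props.conf (sketch): pin down timestamp extraction for the nmonchk sourcetype.
# The format string below is an example, not a known-correct value.
[nmonchk]
TIME_PREFIX = ^
TIME_FORMAT = %m/%d/%y %H:%M:%S
MAX_TIMESTAMP_LOOKAHEAD = 20
```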
Stopped at 09/30/2010 at 12:01 AM.
I took your suggestion and did an "all time" search... and I'm finding entries from 02/10/10... which is impossible because I didn't have this script running in February. Nice catch.
So I guess my question now is: what do I do with TIME_FORMAT? I just want the index time to be the time it invokes the script...
Actually I think I know where to go from here. The TIME_PREFIX regular expression is a bit tricky for this one, but I'll get it with trial and error.
Actually, if you don't want Splunk to interpret any timestamps at all, set DATETIME_CONFIG = CURRENT for your sourcetype in props.conf. If you need help with a TIME_FORMAT value, simply post a couple of example timestamps and someone will give you a hand; based on your 02/10/10 example, I would guess you want TIME_FORMAT = %d/%m/%y %H:%M:%S, but that's just a guess. You probably don't need to mess with TIME_PREFIX unless you have multiple timestamps in your event. And if you set DATETIME_CONFIG = CURRENT, then you don't need either of the other two settings.
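If you go the DATETIME_CONFIG route, the stanza is even simpler; a sketch:

```ini
# props.conf (sketch): stamp each event with the time it is indexed,
# skipping timestamp extraction entirely
[nmonchk]
DATETIME_CONFIG = CURRENT
```

This fits scripted inputs like yours well, since you said you just want the event time to be the time the script is invoked.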
Thanks! I managed to get the TIME_FORMAT right, but now that I've read your comment I think I'm better off with DATETIME_CONFIG = CURRENT.