Re: Data Input Script Suddenly Stopped Running

drautb · ‎07-11-2013

Hey all,

The Splunk instance that I work with has several data input scripts (~30). One of them is scheduled to run hourly; its cron string looks like this: "0 * * * *". It was working great, but it abruptly stopped running for some reason. The last time it ran (as determined by the timestamp on its output files) was June 30th at 11:00pm. I restarted Splunk, and it started running again, but I still haven't been able to determine what caused it to stop in the first place.

Because of the timing, I thought it might be an error in my cron string, but everything I have found online says that the string is correct. Has anyone else run into this before? Scripts that abruptly stop running?

bmas10 · ‎05-13-2016

I am seeing the same issue. The script just randomly gets removed from the schedule. A debug refresh causes it to be added to the schedule again, but it does not start running again until I disable and re-enable it. I have checked the _internal index and opened up splunkd.log locally. No errors. I have performed the basic troubleshooting listed by grijhwani, with no luck.

grijhwani · ‎07-11-2013

Do you understand the script, or are you simply relying on a legacy job instated by someone else? You give no indication as to what the script is, or your underlying platform, so it is hard to answer the question. A number of generic possibilities:

Have you tried running the script manually? Does it even run any more?
Has someone screwed your source permissions (or indeed the run permissions of the script) in such a fashion that the two are now incompatible?
Has the data the script works from become corrupted, or too large for whatever tool you are using to manage?
Have there been any updates to the system? Perhaps an update has broken some tool that the script depends on.
Have you checked for disk space problems wherever it is that the script places any intermediate or output files?
Have you checkpointed the script? If it is running on unix you could try "touch"ing semaphore files to track progress in the background.

In short - have you done any debugging before assuming it is the scheduling which has broken?
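
The checkpointing suggestion above can be sketched roughly as follows. This is only an illustration, assuming the script is (or can be wrapped in) Python; the function names, the ".marker" suffix, and the stage names are hypothetical and not taken from the original script.

```python
from pathlib import Path


def checkpoint(marker_dir: Path, stage: str) -> Path:
    """Create or update an empty marker file named after the stage."""
    marker_dir.mkdir(parents=True, exist_ok=True)
    marker = marker_dir / f"{stage}.marker"
    marker.touch()  # like `touch` on Unix: creates the file or updates its mtime
    return marker


def last_stage_reached(marker_dir: Path):
    """Return the stage whose marker was touched most recently, or None."""
    markers = list(marker_dir.glob("*.marker"))
    if not markers:
        return None
    newest = max(markers, key=lambda path: path.stat().st_mtime_ns)
    return newest.stem
```

Calling checkpoint() at the start, after each major step, and at the end of the script leaves a trail of timestamps. If the "started" marker stops updating, the script is not being launched at all; if a later marker stops updating, the script is being launched but failing partway through.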

lguinn2 · ‎07-11-2013

You could find out more by looking in the Splunk logs. Log into Splunk as an admin, and run this search:

index=_internal nameofyourscript

You should be able to see what happened each time that Splunk attempted to run your script.
You can find this same information in $SPLUNK_HOME/var/log/splunk/splunkd.log*
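
If searching from the Splunk UI is not convenient, the log files can also be scanned directly. The following is a minimal sketch, assuming Python is available on the Splunk host; the function name is hypothetical, and it simply does a substring match on each line rather than parsing the log format.

```python
from pathlib import Path


def find_script_mentions(log_dir: Path, script_name: str):
    """Yield (file name, line number, line) for each log line naming the script."""
    for log_file in sorted(log_dir.glob("splunkd.log*")):
        # errors="replace" keeps the scan going if a log contains odd bytes
        with log_file.open(errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if script_name in line:
                    yield log_file.name, number, line.rstrip("\n")
```

A typical call would pass the var/log/splunk directory under $SPLUNK_HOME and the script's file name, for example find_script_mentions(Path(os.environ["SPLUNK_HOME"]) / "var/log/splunk", "nameofyourscript") after importing os. The timestamp of the last matching line shows when Splunk last logged anything about the script.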

grijhwani · ‎07-11-2013

The cron notation is just fine - the zeroth minute of every hour. And besides, if you didn't change it why would it be wrong now when it wasn't before?
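
To double-check that reading of the string, here is a small sketch that evaluates a five-field cron expression against every minute of a day. It assumes standard cron field order (minute, hour, day of month, month, day of week), handles only "*", "*/n", and plain integers, and is not Splunk's own scheduler; all function names are made up for the illustration.

```python
from datetime import datetime, timedelta

# Lowest legal value per field: minute, hour, day of month, month, day of week.
FIELD_MINIMUMS = (0, 0, 1, 1, 0)


def field_matches(field: str, value: int, minimum: int) -> bool:
    """Match one cron field; supports `*`, `*/n`, and a single integer."""
    if field == "*":
        return True
    if field.startswith("*/"):
        return (value - minimum) % int(field[2:]) == 0
    return int(field) == value


def cron_matches(expr: str, when: datetime) -> bool:
    """Return True if `when` falls on a minute selected by `expr`.

    Simplification: all five fields must match, which differs from real cron
    only when both day-of-month and day-of-week are restricted.
    """
    fields = expr.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 cron fields, got {len(fields)}")
    # cron counts Sunday as 0; isoweekday() counts Sunday as 7
    values = (when.minute, when.hour, when.day, when.month, when.isoweekday() % 7)
    return all(
        field_matches(field, value, minimum)
        for field, value, minimum in zip(fields, values, FIELD_MINIMUMS)
    )


def run_times(expr: str, day: datetime) -> list:
    """List every minute of the given day on which `expr` would fire."""
    start = day.replace(hour=0, minute=0, second=0, microsecond=0)
    minutes = (start + timedelta(minutes=offset) for offset in range(24 * 60))
    return [moment for moment in minutes if cron_matches(expr, moment)]
```

For "0 * * * *" on any given day, run_times returns 24 entries, one at minute 0 of each hour, which is consistent with an hourly job.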

drautb · ‎07-11-2013

I just fixed the cron string in my post, I didn't realize the website removed some of the asterisks. Is yours correct? I'm not seeing how * * /1 * would equate to an hourly job.

linu1988 · ‎07-11-2013

The cron schedule should be like this:
* * * * /1 * * *
Please refer to this: docs.splunk.com/Documentation/Splunk/5.0.3/Alert/Scheduledsearch
