Data Input Script Suddenly Stopped Running

drautb
Explorer

Hey all,

The Splunk instance that I work with has about 30 data input scripts. One of them is scheduled to run hourly; its cron string looks like this: "0 * * * *". It was working great, but it abruptly stopped running for some reason. The last time it ran (as determined by the timestamp on its output files) was June 30th at 11:00pm. I restarted Splunk, and it started running again, but I still haven't been able to determine what caused it to stop in the first place.

Because of the timing, I thought it might be an error in my cron string, but everything I have found online says that the string is correct. Has anyone else run into this before? Scripts that abruptly stop running?

bmas10
Explorer

I am seeing the same issue. The script will just randomly get removed from the schedule. A debug refresh will cause it to get added to the schedule again, but it will not start running again until I disable and re-enable it. I have checked the _internal index and opened up splunkd.log locally. No errors. I have performed the basic troubleshooting listed by grijhwani, with no luck.

grijhwani
Motivator

Do you understand the script, or are you simply relying on a legacy job instated by someone else? You give no indication as to what the script is, or your underlying platform, so it is hard to answer the question. A number of generic possibilities:

  • Have you tried running the script manually? Does it even run any more?
  • Has someone screwed your source permissions (or indeed the run permissions of the script) in such a fashion that the two are now incompatible?
  • Has the data the script works from become corrupted, or too large for whatever tool you are using to manage?
  • Have there been any updates to the system? Perhaps an update of some tool critical to the success of the script has been broken by the update.
  • Have you checked for disk space problems wherever it is that the script places any intermediate or output files?
  • Have you checkpointed the script? If it is running on Unix you could try "touch"ing semaphore files at each stage to track its progress in the background.

In short - have you done any debugging before assuming it is the scheduling which has broken?
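
As a rough sketch of the checkpointing idea from the list above: the script can "touch" a marker file at each stage, so the files' timestamps show how far the last run got. The directory and the stage names here are made up for illustration; adapt them to whatever your script actually does.

```shell
#!/bin/sh
# Hypothetical checkpoint helper. CHECKPOINT_DIR and the stage names are
# illustrative assumptions, not anything Splunk defines.
CHECKPOINT_DIR="${CHECKPOINT_DIR:-${TMPDIR:-/tmp}/myscript_checkpoints}"
mkdir -p "$CHECKPOINT_DIR"

checkpoint() {
    # Create (or update) an empty marker file; its modification time
    # records when this stage was last reached.
    touch "$CHECKPOINT_DIR/$1"
}

checkpoint started
# ... read the source data here ...
checkpoint fetched
# ... write the output files here ...
checkpoint finished
```

Running `ls -lt "$CHECKPOINT_DIR"` afterwards lists the markers newest first. If "started" is recent but "finished" is stale, the script is being launched but dying partway; if none of them have moved, it is not being launched at all.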

lguinn2
Legend

You could find out more by looking in the Splunk logs. Log into Splunk as an admin, and run this search:

index=_internal nameofyourscript

You should be able to see what happened each time that Splunk attempted to run your script.
You can find this same information in $SPLUNK_HOME/var/log/splunk/splunkd.log*
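
If you would rather check the log files directly on the server, something like the following should work. It is only a sketch: the /opt/splunk fallback and the script name are assumptions, so substitute your own.

```shell
#!/bin/sh
# Print the splunkd.log lines (including rotated logs) that mention a script.
# $1 = script name to look for, $2 = directory holding splunkd.log*
script_log_lines() {
    # -h drops the file name prefix; errors about missing files are discarded.
    grep -h -- "$1" "$2"/splunkd.log* 2>/dev/null
}

# Assumed install location; SPLUNK_HOME is used if it is already set.
LOG_DIR="${SPLUNK_HOME:-/opt/splunk}/var/log/splunk"

# Show the last 20 matching lines across the log files.
script_log_lines "nameofyourscript" "$LOG_DIR" | tail -n 20
```

If the last matching line is from around the time the script stopped, the messages just before it are the place to start looking.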

grijhwani
Motivator

The cron notation is just fine - the zeroth minute of every hour. And besides, if you didn't change it why would it be wrong now when it wasn't before?

drautb
Explorer

I just fixed the cron string in my post; I didn't realize the website had removed some of the asterisks. Is yours correct? I'm not seeing how * * /1 * would equate to an hourly job.

linu1988
Champion

The cron schedule should be like this:
* * * * /1 * * *
Please refer to this: docs.splunk.com/Documentation/Splunk/5.0.3/Alert/Scheduledsearch
