I'm seeing a lot of the following error in my search head's splunkd.log, using a search head pooling configuration:
ERROR ProcessDispatchedSearch - PROCESS_SEARCH - Error opening "": No such file or directory
It seems that this might be causing results to be incomplete, or searches to not run at all, which is also suggested by the additional message:
Failed to create a bundles setup with server name '
Are your servers having any clock issues?
I have seen this in cases like:
- search-head pooling with a shared storage
- mounted bundle with shared storage
- single server but with a clock changing all the time
In the last case, the root cause was that the sysadmin had decided to sync the clock across his deployment with a cron job calling ntpdate every 5 minutes.
This is not a good idea, because Splunk (like many other applications) relies on the clock to compare file modification times and to measure durations, and ntpdate changes the clock immediately (so you can suddenly be several seconds in the future or in the past, without notice).
see http://superuser.com/questions/444733/linux-ntpd-and-ntpdate-service
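If you want a quick way to check whether clock skew is visible on the shared storage, one symptom is files whose modification time is later than the local clock. The following is only a rough sketch, not something Splunk ships: the function name `files_from_the_future` and the `tolerance` parameter are made up for illustration, and you would point it at whatever shared directory your pool uses.

```python
import os
import time


def files_from_the_future(root, now=None, tolerance=1.0):
    """Return (path, seconds_ahead) for files under root whose
    modification time is more than `tolerance` seconds later than `now`."""
    if now is None:
        now = time.time()  # local wall clock, the one that may be skewed
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mtime = os.stat(path).st_mtime
            except OSError:
                continue  # file disappeared while we were scanning
            ahead = mtime - now
            if ahead > tolerance:
                hits.append((path, ahead))
    return hits
```

Run it from each server against the same shared directory. If one host reports files "from the future" and the others do not, that host's clock is probably behind the one that wrote the files.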
In my personal experience, using ntpd doesn't completely eliminate the issue. Granted, the frequency of the errors goes down from one every 10 minutes (when I ran ntpdate from cron every hour) to only a few with ntpd. Maybe the timestamp checking is just too strict?
In general, running ntpdate frequently (every 5 minutes) should not cause significant jumps, so you would hope software would be resilient to it (we should file bugs if we are not). However, there is no good reason to run ntpdate frequently: the resources software needs to handle the jumps will be greater than those saved by not running ntpd constantly. Also, ntpd normally corrects the clock by slewing it gradually rather than stepping it, so the clock keeps increasing monotonically and software can behave more correctly.
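To see whether the wall clock on a host is actually being stepped, you can compare how much the wall clock advanced against how much a monotonic clock advanced over the same interval. This is just a sketch under my own assumptions: `clock_step`, `watch`, and the 0.5 second `tolerance` are illustrative names and values, not anything from Splunk or ntpd.

```python
import time


def clock_step(wall_before, mono_before, wall_after, mono_after, tolerance=0.5):
    """Return the apparent wall-clock jump in seconds between two samples,
    or 0.0 if wall and monotonic time advanced by about the same amount."""
    drift = (wall_after - wall_before) - (mono_after - mono_before)
    return drift if abs(drift) > tolerance else 0.0


def watch(interval=5.0, rounds=3):
    """Sample both clocks every `interval` seconds and report any step."""
    wall, mono = time.time(), time.monotonic()
    for _ in range(rounds):
        time.sleep(interval)
        new_wall, new_mono = time.time(), time.monotonic()
        step = clock_step(wall, mono, new_wall, new_mono)
        if step:
            print(f"wall clock stepped by {step:+.3f}s")
        wall, mono = new_wall, new_mono
```

Calling `watch()` with enough rounds to span an ntpdate cron run should print a line whenever the clock is stepped forward or backward. With ntpd slewing the clock you would normally expect no output.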