I have two search heads in a search head pooling setup. The pool lives on an NFS filer, and that had been working well for quite some time. However, over the weekend the NFS filer failed. As a temporary solution, I elected one of the search head Linux hosts to also act as the NFS server for the search head pool, knowing that it's probably not as fast (disk I/O-wise) as the actual filer.
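Since search head pooling is very sensitive to the latency of the shared storage, a quick way to gauge the makeshift mount is to time small fsync'd writes against it. A minimal sketch in Python, assuming the pool is mounted at /mnt/shpool (hypothetical path; substitute whatever the [pooling] storage setting in server.conf points to):

import os
import time

POOL_DIR = "/mnt/shpool"  # assumption: replace with your actual pool mount point
CYCLES = 100

start = time.time()
for i in range(CYCLES):
    probe = os.path.join(POOL_DIR, ".latency_probe_%d" % i)
    with open(probe, "w") as f:
        f.write("x")
        f.flush()
        os.fsync(f.fileno())  # force the write all the way through to the NFS server
    os.unlink(probe)

elapsed = time.time() - start
print("%d create/sync/delete cycles in %.2fs (%.1f ms each)"
      % (CYCLES, elapsed, 1000.0 * elapsed / CYCLES))

If each cycle takes tens of milliseconds or more, the makeshift NFS server is far slower than a dedicated filer, which would fit the timeouts below.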
However, since that change was made, my users have not been able to log in to Splunk via the splunkweb UI on either search head. They keep getting a "Splunkd daemon is not responding" message in the UI. After digging through the logs, I found the following messages in web_service.log:
2014-08-22 23:54:14,879 ERROR [53f83a78bb1c5284d0] __init__:468 - Socket error communicating with splunkd (error=The read operation timed out), path = /servicesNS/galens/search/saved/searches
2014-08-22 23:54:14,879 ERROR [53f83a78bb1c5284d0] decorators:379 - Splunkd daemon is not responding: ('Error connecting to /servicesNS/admin/search/saved/searches: The read operation timed out',)
Traceback (most recent call last):
  File "/opt/splunk/lib/python2.7/site-packages/splunk/appserver/mrsparkle/lib/decorators.py", line 365, in handle_exceptions
    return fn(self, *a, **kw)
  File " ", line 1, in
  File "/opt/splunk/lib/python2.7/site-packages/splunk/appserver/mrsparkle/lib/decorators.py", line 420, in apply_cache_headers
    response = fn(self, *a, **kw)
  File "/opt/splunk/lib/python2.7/site-packages/splunk/appserver/mrsparkle/controllers/view.py", line 1007, in render
    can_alert, searches = self.get_saved_searches(app)
  File "/opt/splunk/lib/python2.7/site-packages/splunk/appserver/mrsparkle/controllers/view.py", line 870, in get_saved_searches
    searches = en.getEntities('saved/searches', namespace=app, search='is_visible=1 AND disabled=0', count=500, _with_new='1')
  File "/opt/splunk/lib/python2.7/site-packages/splunk/entity.py", line 129, in getEntities
    atomFeed = _getEntitiesAtomFeed(entityPath, namespace, owner, search, count, offset, sort_key, sort_dir, sessionKey, uri, hostPath, **kwargs)
  File "/opt/splunk/lib/python2.7/site-packages/splunk/entity.py", line 222, in _getEntitiesAtomFeed
    serverResponse, serverContent = rest.simpleRequest(uri, getargs=kwargs, sessionKey=sessionKey, raiseAllErrors=True)
  File "/opt/splunk/lib/python2.7/site-packages/splunk/rest/__init__.py", line 469, in simpleRequest
    raise splunk.SplunkdConnectionException, 'Error connecting to %s: %s' % (path, str(e))
SplunkdConnectionException: Splunkd daemon is not responding: ('Error connecting to /servicesNS/admin/search/saved/searches: The read operation timed out',)
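To separate a slow splunkd from a broken splunkweb, one can probe the splunkd management port directly and see whether it answers within a comparable timeout. A minimal sketch, assuming the default management port 8089 and a self-signed certificate (both assumptions; adjust for your deployment). Even a quick 401 Unauthorized proves splunkd is responding, while a read timeout mirrors the error above:

import ssl
import time
import http.client

HOST = "localhost"  # assumption: run on the search head itself
PORT = 8089         # assumption: default splunkd management port

# splunkd typically ships a self-signed certificate, so skip verification
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE

conn = http.client.HTTPSConnection(HOST, PORT, timeout=30, context=ctx)
start = time.time()
try:
    conn.request("GET", "/services/server/info")
    resp = conn.getresponse()
    print("HTTP %d after %.1fs" % (resp.status, time.time() - start))
except Exception as e:
    print("no answer after %.1fs: %s" % (time.time() - start, e))
finally:
    conn.close()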
Could a slow search head pool be the root cause? How can I temporarily work around this issue?