Splunk Daemon Not Responding

Greetings! We are running Splunk 5.0.3 with search head pooling (2 search heads) and SSO set to permissive. I get this error:


2013-06-06 16:06:41,656 ERROR [51b0ebb39d7fb184803e90] search:221 - Splunkd daemon is not responding: ('The read operation timed out',)
Traceback (most recent call last):
  File "/opt/splunk/lib/python2.7/site-packages/splunk/appserver/mrsparkle/controllers/search.py", line 218, in dispatchJob
    job = splunk.search.dispatch(q, sessionKey=cherrypy.session['sessionKey'], **options)
  File "/opt/splunk/lib/python2.7/site-packages/splunk/search/__init__.py", line 268, in dispatch
    serverResponse, serverContent = rest.simpleRequest(uri, postargs=args, sessionKey=sessionKey, rawResult=True)
  File "/opt/splunk/lib/python2.7/site-packages/splunk/rest/__init__.py", line 446, in simpleRequest
    raise splunk.SplunkdConnectionException, str(e)
SplunkdConnectionException: Splunkd daemon is not responding: ('The read operation timed out',)

I added this line to the __init__.py file in /opt/splunk/lib/python2.7/site-packages/splunk/rest.

logger.error('problem=splunkd_socket_connection_exception msg="%s" aTry=%s tries=%s wait=%s uri="%s" method=%s headers="%s" body="%s" serverResponse="%s" sessionSource="%s" proxyMode="%s" http_vars="%s" http_dir="%s" webkeyfile="%s" webcertfile="%s" error_dir="%s" pprint_error="%s" '%(e, aTry, tries, wait, uri, method, headers, payload, serverResponse, sessionSource, proxyMode, pprint(vars(h)), dir(h), str(getWebKeyFile()), str(getWebCertFile), dir(e), pprint(vars(e)) ) )

It outputs this:


2013-06-06 16:06:41,655 ERROR [51b0ebb39d7fb184803e90] __init__:445 - problem=splunkd_socket_connection_exception msg="The read operation timed out" aTry=0 tries=4 wait=10 uri="https://127.0.0.1:8089/servicesNS/USER/search/search/jobs" method=POST headers="{'Authorization': 'Splunk AUTHKEY'}" body="latest_time=1370542605.17&ui_dispatch_app=search&ui_dispatch_view=flashtimeline&max_count=10000&search=search%20index%3D_internal%20host%3Dhsearchp01%20sourcetype%3Dsplunk_web_service%20earliest%3D-2m%40m&earliest_time=1370542604&auto_cancel=100&required_field_list=%2A&time_format=%25s.%25Q&status_buckets=300" serverResponse="bullpucky" sessionSource="direct" proxyMode="False" http_vars="None" http_dir="['__class__', '__delattr__', '__dict__', '__doc__', '__format__', '__getattribute__', '__hash__', '__init__', '__module__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_auth_from_challenge', '_conn_request', '_normalize_headers', '_request', 'add_certificate', 'add_credentials', 'authorizations', 'ca_certs', 'cache', 'certificates', 'clear_credentials', 'connections', 'credentials', 'disable_ssl_certificate_validation', 'follow_all_redirects', 'follow_redirects', 'force_exception_to_status_code', 'ignore_etag', 'optimistic_concurrency_methods', 'proxy_info', 'request', 'timeout']" webkeyfile="None" webcertfile="" error_dir="['__class__', '__delattr__', '__dict__', '__doc__', '__format__', '__getattribute__', '__getitem__', '__getslice__', '__hash__', '__init__', '__module__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setstate__', '__sizeof__', '__str__', '__subclasshook__', '__unicode__', '__weakref__', 'args', 'errno', 'filename', 'message', 'strerror']" pprint_error="None"
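
A side note on the debug line itself: http_vars and pprint_error both come out as "None" because pprint() prints to stdout and returns None, so there is nothing for the %s to format. pformat() returns the string instead. Also, str(getWebCertFile) is missing its call parentheses, so it formats the function object rather than the cert file path. A minimal sketch of both points, where FakeHttp and get_web_cert_file are hypothetical stand-ins (not Splunk or httplib2 code):

```python
from pprint import pprint, pformat


class FakeHttp(object):
    """Hypothetical stand-in for the HTTP client object `h` in the debug line."""

    def __init__(self):
        self.timeout = 30
        self.follow_redirects = True


def get_web_cert_file():
    """Hypothetical stand-in for getWebCertFile(); the path is made up."""
    return "/path/to/cert.pem"


h = FakeHttp()

# pprint() writes to stdout and returns None, so "%s" % pprint(...) logs "None".
printed = pprint(vars(h))

# pformat() returns the formatted text, which is what a log message needs.
formatted = pformat(vars(h))

# Without parentheses the function object is formatted, not its return value.
not_called = str(get_web_cert_file)
called = str(get_web_cert_file())
```

Swapping pprint for pformat and adding the missing () in the logger.error call should fill in those three fields on the next run.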

I don't know where else to look. I thought this was fixed in 5.0.3 (SPL-66828), unless this is something else. The aTry variable is supposed to count the number of tries. It never gets past 0, which means the socket error is raised on the first try and no retry ever happens!
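
To illustrate why aTry can stay at 0: the traceback shows the exception being re-raised from simpleRequest right after the log line, and a raise inside a retry loop exits the loop at once. Here is a minimal sketch, assuming a loop of roughly this shape. All names here (simple_request, SplunkdConnectionError, timing_out) are hypothetical, and this is not the actual Splunk source:

```python
import socket


class SplunkdConnectionError(Exception):
    """Hypothetical stand-in for splunk.SplunkdConnectionException."""


def simple_request(send, tries=4):
    """Hypothetical retry loop: it only retries when send() returns None."""
    for a_try in range(tries):
        try:
            response = send()
        except socket.error as e:
            # Re-raising leaves the loop immediately, so a timeout on the
            # first attempt is reported with a_try still at 0.
            raise SplunkdConnectionError("aTry=%d msg=%s" % (a_try, e))
        if response is not None:
            return response
    return None


def timing_out():
    """Simulates a request whose read times out every time."""
    raise socket.timeout("The read operation timed out")


try:
    simple_request(timing_out)
    message = None
except SplunkdConnectionError as err:
    message = str(err)
```

In a loop like this, tries=4 only helps when a request returns without a response. A read timeout is never retried, which would match what the log shows.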

1 Solution

We are now running 6.0.3, so this no longer applies to me. However, I think the root cause was disk I/O on the server.
