500 Internal Server Error

Wiggy
Splunk Employee

I'm running the latest 5.x release on my search head and have noticed lately that more and more users randomly get a "500 Internal Server Error" when they try to access or edit saved searches.

What would cause this and how do I fix the issue so that users do not get the 500 error?

1 Solution

Wiggy
Splunk Employee

Most likely the instance is busy, and responding to the REST endpoint query takes longer than Splunkweb waits for a response, which is 30 seconds by default.

This happens more often on search heads that serve many users, or that run a large number of scheduled searches in the background and have a large dispatch directory.
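To get a rough idea of how large the dispatch directory has grown, something like the following standalone sketch can help. It assumes the usual location under `$SPLUNK_HOME/var/run/splunk/dispatch`, falls back to `/opt/splunk` when `SPLUNK_HOME` is not set, and is meant to be run with a separate Python 3 interpreter rather than Splunk's bundled one. The function name `count_dispatch_jobs` is made up for this example.

```python
import os
from pathlib import Path


def count_dispatch_jobs(dispatch_dir):
    """Return the number of subdirectories (one per search artifact) in dispatch_dir."""
    path = Path(dispatch_dir)
    if not path.is_dir():
        # Missing or unreadable location: report zero rather than raising.
        return 0
    return sum(1 for entry in path.iterdir() if entry.is_dir())


if __name__ == "__main__":
    # Assumption: default install path when SPLUNK_HOME is not exported.
    splunk_home = os.environ.get("SPLUNK_HOME", "/opt/splunk")
    dispatch = Path(splunk_home) / "var" / "run" / "splunk" / "dispatch"
    print(f"{dispatch}: {count_dispatch_jobs(dispatch)} search artifacts")
```

What counts as "large" depends on the hardware and workload, so treat the number as something to compare over time rather than against a fixed threshold.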

Looking in the web_service.log will show results like these:

2013-09-25 18:52:11,076 ERROR   [5243921ef67fd8b01eae10] search:227 - Splunkd daemon is not responding: ('The read operation timed out',)
Traceback (most recent call last):
  File "/opt/splunk/lib/python2.7/site-packages/splunk/appserver/mrsparkle/controllers/search.py", line 224, in dispatchJob
    job = splunk.search.dispatch(q, sessionKey=cherrypy.session['sessionKey'], **options)
  File "/opt/splunk/lib/python2.7/site-packages/splunk/search/__init__.py", line 268, in dispatch
    serverResponse, serverContent = rest.simpleRequest(uri, postargs=args, sessionKey=sessionKey, rawResult=True)
  File "/opt/splunk/lib/python2.7/site-packages/splunk/rest/__init__.py", line 443, in simpleRequest
    raise splunk.SplunkdConnectionException, str(e)
SplunkdConnectionException: Splunkd daemon is not responding: ('The read operation timed out',)
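If you want to see how often these timeouts occur and at what time of day, a small script along these lines could tally them per hour. It is only a sketch: the regular expression is modeled on the single log line shown above, the log path assumes `$SPLUNK_HOME/var/log/splunk/web_service.log`, and `count_timeouts_by_hour` is a name invented for this example. Run it with a separate Python 3 interpreter.

```python
import os
import re
from collections import Counter

# Matches the timestamped ERROR line only, so the trailing
# "SplunkdConnectionException: ..." traceback line is not counted twice.
TIMEOUT_RE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<hour>\d{2}):\d{2}:\d{2},\d{3}\s+ERROR\s+.*"
    r"Splunkd daemon is not responding: .*timed out"
)


def count_timeouts_by_hour(lines):
    """Count splunkd read-timeout errors per 'YYYY-MM-DD HH:00' bucket."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.match(line)
        if match:
            counts[f"{match['date']} {match['hour']}:00"] += 1
    return counts


if __name__ == "__main__":
    # Assumption: default install path when SPLUNK_HOME is not exported.
    splunk_home = os.environ.get("SPLUNK_HOME", "/opt/splunk")
    log_path = os.path.join(splunk_home, "var", "log", "splunk", "web_service.log")
    with open(log_path, encoding="utf-8", errors="replace") as handle:
        for bucket, total in sorted(count_timeouts_by_hour(handle).items()):
            print(bucket, total)
```

Hours with many timeouts are worth comparing against the scheduled search load at those times.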

There is also a helpful view in the Splunk S.O.S. app called "HTTP Response Times For splunkd" that will show more detail on response times and what is being accessed:

[Screenshot: the "HTTP Response Times For splunkd" view in the S.O.S. app]

Every time a response takes longer than 30 seconds, you will get a 500 error when trying to access that object.

To allow Splunkweb to wait longer than the default 30 seconds, you can edit the

$SPLUNK_HOME/lib/python2.7/site-packages/splunk/rest/__init__.py

file and change the value of the following line:

SPLUNKD_CONNECTION_TIMEOUT = 30

to a value that suits your response times. In the case above, changing this value to 50 or 60 would probably work, since a few requests took 30 to 40 seconds. Once the file is edited and saved, restart the Splunk instance for the change to take effect; you should then no longer get the 500 errors.
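As a rough illustration of picking a value and applying the edit, the sketch below pads the slowest observed response time and rewrites the `SPLUNKD_CONNECTION_TIMEOUT` line in a copy of the file's text. The helper names `suggest_timeout` and `set_timeout`, the 1.5x headroom factor, and the rounding to the next multiple of 10 are all assumptions made for this example, not anything Splunk prescribes. It works on a string, so you can review the result before writing anything back, and it is meant for a separate Python 3 interpreter.

```python
import math
import re

# The assignment as it appears in splunk/rest/__init__.py, per the post above.
TIMEOUT_LINE = re.compile(r"^(SPLUNKD_CONNECTION_TIMEOUT\s*=\s*)\d+", re.MULTILINE)


def suggest_timeout(observed_seconds, headroom=1.5, default=30):
    """Pad the slowest observed response and round up to a multiple of 10."""
    if not observed_seconds:
        return default
    padded = max(observed_seconds) * headroom
    # Never suggest anything below the shipped default.
    return max(default, math.ceil(padded / 10) * 10)


def set_timeout(source_text, seconds):
    """Return source_text with the timeout constant set to `seconds`."""
    new_text, replaced = TIMEOUT_LINE.subn(
        lambda m: f"{m.group(1)}{seconds}", source_text
    )
    if replaced != 1:
        raise ValueError(f"expected exactly one timeout line, found {replaced}")
    return new_text


if __name__ == "__main__":
    # Hypothetical response times in seconds, e.g. read off the S.O.S. view.
    slow_requests = [31, 34, 40]
    original = "SPLUNKD_CONNECTION_TIMEOUT = 30\n"
    print(set_timeout(original, suggest_timeout(slow_requests)), end="")
```

With a slowest request of 40 seconds this suggests 60, which matches the range mentioned above. Keep a backup of the original `__init__.py` before changing it.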
