I'm running Splunk Enterprise v 6.6.1 on Windows 2008 R2 (not by choice). Without making any configuration changes (that I'm aware of) one user has started receiving "500 internal server errors" when trying to access the Search & Reporting app. Other apps are not presenting this issue. All other users are fine. The errors are only present when UserA opens the Search & Reporting app.
The error message links to a search for index=_internal source=web_service.log requestid=[\xx]. When looking at the log file web_service.log in Notepad++, there is no matching request ID.
splunkd_access.log is not showing any errors. All the entries for 127.0.0.1 with UserA have HTTP status 200.
There are entries in splunkd_ui_access.log and web_access.log with the HTTP 500 error and matching username and timestamp, but they are not useful for finding the problem. They only show the GET request, User-Agent, HTTP status, and request ID (web_access.log) or session ID (splunkd_ui_access.log).
127.0.0.1 - [username] [date&time] "GET /en-US/app/search/search HTTP/1.1" 500 3037 "" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36" - [ID] 707ms
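Rather than opening each log by hand, the request ID from the error page can be searched for across every file in the log directory at once. Below is a minimal Python sketch of that check. The function name `find_request_id` and the `*.log*` glob (meant to include rotated files) are assumptions for illustration, not anything Splunk provides.

```python
from pathlib import Path


def find_request_id(log_dir, request_id):
    """Return (file name, line number, line) for each log line containing request_id."""
    hits = []
    # "*.log*" is an assumed pattern that also picks up rotated files such as foo.log.1
    for path in sorted(Path(log_dir).glob("*.log*")):
        if not path.is_file():
            continue
        # errors="replace" keeps the scan going if a log contains odd bytes
        with path.open("r", encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if request_id in line:
                    hits.append((path.name, number, line.rstrip("\n")))
    return hits
```

Calling `find_request_id(log_dir, request_id)` with the var\log\splunk path and the ID shown on the error page lists every file that mentions it, which makes it quick to confirm whether any log besides web_access.log recorded the request.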
Restarting splunkd and the user's hosts has had no impact. The thread/socket limit is well above what we actually use, and if it were exhausted I would expect to see errors in splunkd.log and all users to be seeing HTTP 500 errors.
Has anyone else experienced an issue like this? Are there any log files other than those in [$SPLUNK_HOME$]\var\log\splunk that could help?
The solution was moving the user's folder from [$SPLUNK_HOME$]/etc/users to a temp folder. They lost all of their objects, but the problem was resolved. If you want to try to save their objects, you could reassign them before clearing out their directory.
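For anyone who would rather script the move than do it by hand, here is a minimal Python sketch. It assumes the user's folder is at etc/users/<username> under the Splunk home directory. The function name `quarantine_user_dir` and the timestamped backup name are hypothetical choices, not a Splunk tool. The thread does not say whether Splunk needs to be stopped or restarted around the move, so treat that as an open question.

```python
import shutil
import time
from pathlib import Path


def quarantine_user_dir(splunk_home, username, backup_root):
    """Move etc/users/<username> under backup_root and return the new location."""
    source = Path(splunk_home) / "etc" / "users" / username
    if not source.is_dir():
        raise FileNotFoundError(f"No user directory at {source}")
    # A timestamp suffix (an assumed convention) avoids clobbering an earlier backup
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = Path(backup_root) / f"{username}-{stamp}"
    target.parent.mkdir(parents=True, exist_ok=True)
    # Move rather than delete, so the .conf files remain available for recovery
    shutil.move(str(source), str(target))
    return target
```

Because the folder is moved rather than deleted, the user's savedsearches.conf and local.meta files stay on disk and can be inspected or partially restored later.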
To add to this.
One of the user's knowledge objects appears to be the cause of the errors. Reassigning them all may cause problems for the new account. Confirm there are no issues for the replacement account before removing the old account.
Have you tried deleting the user's directory in [$SPLUNK_HOME$]/etc/users?
I've not tried that. Do you know how that would affect the knowledge object the user has created?
I assume they'll be orphaned and can then be reassigned to the new account?
Yes. Also, instead of deleting it, just move it to a temp folder.
Removed the user folder and created a new user with the same name.
The user was able to log in and wasn't seeing the HTTP 500 errors. However, their objects were not orphaned, just gone.
I manually restored the objects from their savedsearches.conf and local.meta files and the 500 errors returned, so I assume there's something funky with one of their objects.
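If someone wants to find out which object is the bad one, restoring the saved searches one stanza at a time and reloading the app after each would narrow it down. Below is a minimal Python sketch that splits a savedsearches.conf into its stanzas for that purpose. The function name `split_stanzas` is hypothetical. The parsing is simplified: it assumes a stanza header is a line of the form `[name]` and that a trailing backslash continues a value onto the next line (so a continued subsearch line starting with `[` is not mistaken for a header). Anything before the first stanza is dropped.

```python
import re

# Assumed header shape: a whole line of the form [stanza name]
STANZA_HEADER = re.compile(r"^\[(?P<name>.+)\]\s*$")


def split_stanzas(conf_text):
    """Return a list of (stanza_name, stanza_text) pairs in file order."""
    stanzas = []
    name, lines = None, []
    continued = False  # True when the previous line ended with a backslash
    for line in conf_text.splitlines():
        header = None if continued else STANZA_HEADER.match(line)
        if header:
            if name is not None:
                stanzas.append((name, "\n".join(lines).rstrip() + "\n"))
            name, lines = header.group("name"), [line]
        elif name is not None:
            lines.append(line)
        continued = line.rstrip().endswith("\\")
    if name is not None:
        stanzas.append((name, "\n".join(lines).rstrip() + "\n"))
    return stanzas
```

Each returned stanza text can be appended to a fresh savedsearches.conf in turn. The stanza that brings the 500 error back is the likely culprit, though the matching local.meta entry may matter as well.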
They only have a handful of objects, so the user can recreate them on their own. The SPL is all available in the .conf files.
Thank you for all your help!
I'll put it in the answer below for other folks to find.
@LCM_BRogerson, what kind of role does the user have (admin/power or user)? Do other users also belong to the same role as this user?
The user is part of the admin role. There are a few other users with the same role and none of them have issues.
Since other admins seem to work fine, I doubt this is it, but figured I'd mention it just in case. I took these capabilities away from my users, and my admins started getting 500 errors:
rest_apps_view
rest_properties_get
rest_properties_set