Hi,
Since upgrading from Splunk 6.5.0 to Splunk 7.1.0, I've been having an issue where the Splunk Web UI becomes unreachable after some time.
I restart Splunk via /opt/splunk/bin/splunk restart and it comes back up for quite a while (usually about a day), but after that it goes down again.
Every time I restart the Splunk service, it reports that splunkd was not running:
splunkd 57690 was not running.
Stopping splunk helpers...
Done.
Stopped helpers.
Removing stale pid file... done.
splunkd is not running.
Splunk> The IT Search Engine.
Checking prerequisites...
Checking http port [8000]: open
Checking mgmt port [8089]: open
Checking appserver port [127.0.0.1:8065]: open
Checking kvstore port [8191]: open
Checking configuration... Done.
Checking critical directories... Done
Checking indexes...
Validated: _audit _internal _introspection _telemetry _thefishbucket bro cim_modactions cim_summary endpoint_summary firedalerts history ioc main msexchange notable notable_summary os perfmon risk summary threat_activity ubaroute ueba whois windows wineventlog xtreme_contexts
Done
Checking filesystem compatibility... Done
Checking conf files for problems...
Invalid key in stanza [syslog:ubaroute] in /opt/splunk/etc/apps/Splunk_TA_ueba/default/outputs.conf, line 7: dropEventsOnQueueFull (value: 10).
Your indexes and inputs configurations are not internally consistent. For more information, run 'splunk btool check --debug'
Done
Checking default conf files for edits...
Validating installed files against hashes from '/opt/splunk/splunk-7.1.0-2e75b3406c5b-linux-2.6-x86_64-manifest'
All installed files intact.
Done
All preliminary checks passed.
Starting splunk server daemon (splunkd)...
Done
Waiting for web server at https://127.0.0.1:8000 to be available............... Done
I didn't see anything that stood out in splunkd.log or web_access.log, but in syslog I saw the following:
Out of memory: Kill process 31605 (splunkd) score 484 or sacrifice child
May 23 08:05:42 splunk kernel: [244114.815142] Killed process 31605 (splunkd) total-vm:9945680kB, anon-rss:8845420kB, file-rss:0kB
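For anyone who wants to pull the numbers out of kernel lines like these, here is a minimal Python sketch. The function name and regular expression are my own (nothing shipped with Splunk), and the pattern only assumes the "Killed process" layout shown above; other kernel versions may format the line differently. For this line it works out to roughly 8.4 GiB of anonymous resident memory held by splunkd at the time it was killed.

```python
import re

# Matches the kernel's "Killed process" line in the layout seen in the syslog
# excerpt above. Other kernel versions may use a different layout (assumption).
KILLED_RE = re.compile(
    r"Killed process (?P<pid>\d+) \((?P<name>[^)]+)\) "
    r"total-vm:(?P<total_vm>\d+)kB, anon-rss:(?P<anon_rss>\d+)kB"
)


def parse_oom_kill(line):
    """Return a dict describing an OOM kill, or None if the line is not one."""
    match = KILLED_RE.search(line)
    if match is None:
        return None
    kib_per_gib = 1024 ** 2  # the kernel reports sizes in kB (KiB)
    return {
        "pid": int(match.group("pid")),
        "name": match.group("name"),
        "total_vm_gib": int(match.group("total_vm")) / kib_per_gib,
        "anon_rss_gib": int(match.group("anon_rss")) / kib_per_gib,
    }


# The line copied from the syslog excerpt above.
sample = (
    "May 23 08:05:42 splunk kernel: [244114.815142] Killed process 31605 (splunkd) "
    "total-vm:9945680kB, anon-rss:8845420kB, file-rss:0kB"
)

info = parse_oom_kill(sample)
print(info["name"], info["pid"], round(info["anon_rss_gib"], 1), "GiB anon-rss")
```

Running the same function over every line of syslog would show whether splunkd is being killed at about the same memory size each time.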
This is becoming quite an issue - any help would be appreciated.
It is quite difficult to know exactly what the problem is, but I saw this once when there was a conflict between bucket IDs. Have you scanned splunkd.log for all errors?
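If it helps, one rough way to scan for errors is to count ERROR lines per component. The sketch below is only an illustration: the function name and regular expression are my own, the pattern assumes the usual "timestamp LEVEL Component - message" layout of splunkd.log, and the sample lines are placeholders (the INFO line is hypothetical).

```python
import re
from collections import Counter

# Assumed splunkd.log layout: "MM-DD-YYYY HH:MM:SS.mmm +ZZZZ LEVEL Component - message".
LINE_RE = re.compile(
    r"(?P<ts>\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2}\.\d{3} [+-]\d{4}) "
    r"(?P<level>[A-Z]+)\s+(?P<component>\S+) - (?P<message>.*)"
)


def count_errors_by_component(lines):
    """Count ERROR-level lines per logging component."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match and match.group("level") == "ERROR":
            counts[match.group("component")] += 1
    return counts


# Placeholder input; the INFO line is hypothetical and only shows that
# non-error lines are skipped.
sample_lines = [
    "05-23-2018 08:30:11.128 -0600 ERROR KVStoreAdminHandler - An error occurred.",
    "05-23-2018 08:29:40.806 -0600 ERROR ScriptRunner - example message one",
    "05-23-2018 08:29:40.805 -0600 ERROR ScriptRunner - example message two",
    "05-23-2018 08:29:40.000 -0600 INFO  Metrics - hypothetical non-error line",
]

counts = count_errors_by_component(sample_lines)
for component, n in counts.most_common():
    print(component, n)
```

To run it against the real file, pass an open file handle instead of the list. With Splunk installed under /opt/splunk, the log would normally be /opt/splunk/var/log/splunk/splunkd.log.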
Hi,
These are the most recent errors in splunkd.log from around that time:
05-23-2018 09:28:12.968 -0600 ERROR HttpListener - Handler for /en-US/modules/@4E8ECBCF7F0F0D7AD1FA3361342436F3A19E6A37E099573C0F7432A76B5B12A4/modules-17be7a83a6e3c3c3e5360ea69841a46394a8d1aa.min.css sent a 0 byte response after earlier claiming a Content-Length of 307!
05-23-2018 09:28:12.968 -0600 ERROR HttpListener - Exception while processing request from 172.20.20.74 for /en-US/modules/@4E8ECBCF7F0F0D7AD1FA3361342436F3A19E6A37E099573C0F7432A76B5B12A4/modules-17be7a83a6e3c3c3e5360ea69841a46394a8d1aa.min.css: Connection closed by peer
05-23-2018 08:32:52.114 -0600 ERROR HttpListener - Handler for /en-US/static/@4E8ECBCF7F0F0D7AD1FA3361342436F3A19E6A37E099573C0F7432A76B5B12A4/fonts/inconsolata-regular.woff sent a 0 byte response after earlier claiming a Content-Length of 32744!
05-23-2018 08:32:52.114 -0600 ERROR HttpListener - Exception while processing request from 172.20.20.74 for /en-US/static/@4E8ECBCF7F0F0D7AD1FA3361342436F3A19E6A37E099573C0F7432A76B5B12A4/fonts/inconsolata-regular.woff: Connection closed by peer
05-23-2018 08:30:11.128 -0600 ERROR KVStoreAdminHandler - An error occurred.
05-23-2018 08:30:11.128 -0600 ERROR KVStorageProvider - An error occurred during the last operation ('replSetGetStatus', domain: '15', code: '13053'): No suitable servers found (`serverSelectionTryOnce` set): [connection closed calling ismaster on '127.0.0.1:8191']
05-23-2018 08:29:40.816 -0600 ERROR AdminManagerExternal - External handler failed with code '1' and output: 'REST ERROR[500]: Splunkd internal error - Fail to get capabilities of sessioned user'. See splunkd.log for stderr output.
05-23-2018 08:29:40.806 -0600 ERROR ScriptRunner - stderr from '/opt/splunk/bin/python /opt/splunk/bin/runScript.py execute': BaseException: REST ERROR[500]: Splunkd internal error - Fail to get capabilities of sessioned user
05-23-2018 08:29:40.806 -0600 ERROR ScriptRunner - stderr from '/opt/splunk/bin/python /opt/splunk/bin/runScript.py execute': msgx='Fail to get capabilities of sessioned user',
05-23-2018 08:29:40.805 -0600 ERROR ScriptRunner - stderr from '/opt/splunk/bin/python /opt/splunk/bin/runScript.py execute': File "/opt/splunk/lib/python2.7/site-packages/splunk/admin.py", line 128, in init
05-23-2018 08:29:40.805 -0600 ERROR ScriptRunner - stderr from '/opt/splunk/bin/python /opt/splunk/bin/runScript.py execute': admin.init(base.ResourceHandler(Servers), admin.CONTEXT_APP_AND_USER)
05-23-2018 08:29:40.805 -0600 ERROR ScriptRunner - stderr from '/opt/splunk/bin/python /opt/splunk/bin/runScript.py execute': File "/opt/splunk/etc/apps/Splunk_TA_nessus/bin/ta_tenable_rh_sc_servers.py", line 24, in <module>
05-23-2018 08:29:40.805 -0600 ERROR ScriptRunner - stderr from '/opt/splunk/bin/python /opt/splunk/bin/runScript.py execute': File "/opt/splunk/bin/runScript.py", line 78, in <module>
05-23-2018 08:29:40.805 -0600 ERROR ScriptRunner - stderr from '/opt/splunk/bin/python /opt/splunk/bin/runScript.py execute': Traceback (most recent call last):
05-23-2018 07:39:49.108 -0600 ERROR HttpListener - Handler for /en-US/app/SplunkEnterpriseSecuritySuite/ess_security_posture?hideEdit=true&hideTitle=true&hideSplunkBar=true&hideAppBar=true&targetTop=true sent a 0 byte response after earlier claiming a Content-Length of 4650!
05-23-2018 07:39:49.108 -0600 ERROR HttpListener - Exception while processing request from 172.20.20.74 for /en-US/app/SplunkEnterpriseSecuritySuite/ess_security_posture?hideEdit=true&hideTitle=true&hideSplunkBar=true&hideAppBar=true&targetTop=true: Connection closed by peer
Apologies for the wall of text!
Thanks.
2 things I would check: