Hi,
We have created 20 DB inputs, each running once a day (24-hour interval). Recently we noticed that a few of the DB inputs are being disabled automatically. At the times the inputs were disabled, our client was accessing Splunk dashboard pages, CPU utilization was above 85%, and there were more than 25 concurrent searches.
Our system configuration:
Splunk version : 6.2.3
DBConnect Version : Splunk DBConnect v2 2.0.4
RAM memory :16 GB
No. of cores : 2
Physical Memory: 301 GB
No of concurrent users : 3
We see the following error message in the dbx2 log:
02/08/2016 10:48:06 [CRITICAL] [mi_base.py] Caught exception Splunkd daemon is not responding: (u'Error connecting to https://127.0.0.1:8089/servicesNS/admin/splunk_app_db_connect/db_connect/connections/MSSQLDB: The read operation timed out',) in modular input mi_input://dbinput_mydbinputsample1. Disabling modular input.
To address this, we increased "splundtimeout" to 3600 and the maximum memory size to 4096 (i.e. 4 GB).
Memory usage does not go beyond 3 GB.
Please suggest what configuration is recommended for our system so that the DB inputs run without failing.
I had this same issue with:
Splunk version : 6.3.0
DBConnect Version : Splunk DBConnect v2 2.0.4
RAM memory :16 GB
No. of cores : 2
Physical Memory: 500 GB
No of concurrent users : 1
I only have 4 inputs right now. I tried quadrupling the timeout from 30s to 120s, but performance got even worse and it took longer to get data. The underlying problem was that requests were timing out, so the higher timeout just meant each request took longer to time out before the inputs were disabled.
I recently updated Splunk DBConnect v2 to 2.2.0. So far, I have not seen that error occur again. It may be something that was addressed in the new update. Admittedly, there was a 500 error at the end of the update but after restarting Splunk, skipping the initial setup step that DB Connect brought up, and verifying that all inputs and connections were still correct, I have seen markedly better performance overall. I have also not even seen a retry notice for my inputs since the update, which was the precursor to the inputs being disabled.
Updating the app would be my suggested course of action.
Oh, and before the update I also made myself an alert to let me know if it ever happens again so I can turn them back on and investigate:
index=_internal sourcetype=dbx2 "disabling modular input"
It runs every 5 minutes and looks back 5 minutes so I know pretty quickly when it happens. This has not gone off since the update.
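If you export the matching events or read the dbx2 log directly, a small script can tell you which inputs were disabled so you know what to turn back on. This is only a rough sketch in Python, not part of DB Connect: the function name `disabled_inputs` is made up for illustration, and the pattern is based solely on the one log line quoted in the question, so other DB Connect versions may word the message differently.

```python
import re

# Assumption: the message format matches the single dbx2 log line quoted in
# the question. Other DB Connect versions may phrase it differently.
DISABLED_RE = re.compile(
    r"in modular input (?P<input>\S+)\. Disabling modular input",
    re.IGNORECASE,
)


def disabled_inputs(lines):
    """Return the modular input names from 'Disabling modular input' log lines."""
    names = []
    for line in lines:
        match = DISABLED_RE.search(line)
        if match:
            names.append(match.group("input"))
    return names


# The log line from the question, used here as sample data.
sample = (
    "02/08/2016 10:48:06 [CRITICAL] [mi_base.py] Caught exception Splunkd daemon "
    "is not responding: (u'Error connecting to https://127.0.0.1:8089/servicesNS/"
    "admin/splunk_app_db_connect/db_connect/connections/MSSQLDB: The read operation "
    "timed out',) in modular input mi_input://dbinput_mydbinputsample1. "
    "Disabling modular input."
)

print(disabled_inputs([sample]))
```

Lines that do not contain the "Disabling modular input" message are skipped, so the same function can be run over a whole log file.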
Also note that many people report that this error comes and goes, so this may not be the end-all solution.