I realize this will be simple for someone with more experience than I have. We are running 2 search heads, 2 indexers, and a management server, all fed from our centralized rsyslog server. One search head and indexer pair is dedicated to Enterprise Security (ES). On the ES search head only, I am getting the following set of "error messages" hourly.
A simple search using source=*python_modular_input.log index=_internal configuration_check.py
shows:
2015-02-09 13:00:00,591 INFO pid=17738 tid=MainThread file=configuration_check.py::184 | status="starting"
2015-02-09 13:00:00,592 INFO pid=17738 tid=MainThread file=__init__.py:execute:883 | Execute called
2015-02-09 13:00:00,596 INFO pid=17738 tid=MainThread file=configuration_check.py:run:85 | status="executing"
2015-02-09 13:00:00,596 INFO pid=17738 tid=MainThread file=configuration_check.py:run:91 | status="retrieved task" task="confcheck_script_errors"
2015-02-09 13:00:00,690 INFO pid=17738 tid=MainThread file=configuration_check.py:run:101 | status="enabled UI message suppression" task="confcheck_script_errors" pattern="((streamfwd|splunk-(wmi\.path|MonitorNoHandle\.exe|winevtlog\.exe|netmon\.exe|perfmon\.exe|regmon\.exe|winprintmon\.exe|admon\.exe)).*exited with code 1)"
2015-02-09 13:00:00,719 ERROR pid=17738 tid=MainThread file=configuration_check.py:run:153 | status="completed" task="confcheck_script_errors" message="msg="A script exited abnormally" input="/opt/splunk/etc/apps/SA-Utils/bin/configuration_check.py" stanza="configuration_check://confcheck_related_searches_not_enabled" status="exited with code 2""
2015-02-09 13:00:00,743 INFO pid=17738 tid=MainThread file=configuration_check.py:run:180 | status="exiting" exit_status="0"
with the error message showing up at the console. I have chased this through every log file, found the inputs.conf, and tried btool, but nothing points to what exactly this check is looking for. Which "related" search is not there?
I would appreciate any help in resolving this; however, I would also like the "why & how" so I will be able to chase this down myself in the future. We do have SoS, but it is on the general search head, which makes the ES search head the only server it cannot see.
Looking at the log message on the initial post, that does not appear to be an error with a search not being enabled. The script is actually erroring out with a non-zero exit code: "exited with code 2". Looking at configuration_check.py, the exit code "2" corresponds to "ERR_REST_EXC", indicating that there was an exception making a REST call as part of the configuration check. This indicates some other issue with the system. In this case the exception should be logged to configuration_check.log and I would begin looking there for answers, specifically for Python tracebacks that correspond in time to the log messages in python_modular_input.log.
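For example, if the log is in the default location under $SPLUNK_HOME/var/log/splunk (an assumption; adjust the path for your install), something like this on the ES search head should pull out any tracebacks along with a little surrounding context:
grep -B 2 -A 20 "Traceback" /opt/splunk/var/log/splunk/configuration_check.log
The timestamps on any hits should line up with the hourly ERROR entries you are seeing in python_modular_input.log.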
If it helps, we were able to find our issue with the following search:
index=_internal configuration_check rest sourcetype=python_modular_input
In our case, we had a correlation search that was created, but something went wrong, perhaps during creation or replication (search head cluster), and it ended up in a hung state. Someone tried to delete the search in Splunk Web, which looked like it worked, but in fact the search was still there on the back end. Once we found it, we deleted it and restarted Splunk. That resolved it for us.
Thanks,
That's correct, and it is probably how this problem gets introduced most frequently, although it is probably not the only way. Deletion of Enterprise Security correlation searches via the normal Splunk Manager GUI is not supported, because a Correlation Search consists of several configuration file artifacts: stanzas in both savedsearches.conf and correlationsearches.conf. The normal Splunk Manager page for managing searches is not aware of the other configuration file, so deleting a Correlation Search via that unapproved mechanism leaves the system in an inconsistent state. All management of Correlation Searches is expected to occur through the configuration pages in the Enterprise Security app, which to my knowledge do not currently support deletion, only disabling.
We've adjusted this configuration check in the next ES release to reflect the condition more accurately: it now warns that you have an "orphaned" stanza in correlationsearches.conf, possibly as a result of an attempt to manage a Correlation Search via the unapproved mechanism.
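In the meantime, one rough way to hunt for an orphan yourself (just a sketch, assuming a default $SPLUNK_HOME of /opt/splunk and that a correlation search uses the same stanza name in both files) is to compare the stanza names btool sees in each configuration:
# stanza names present in correlationsearches.conf but not in savedsearches.conf
/opt/splunk/bin/splunk btool correlationsearches list | grep '^\[' | sort > /tmp/corr_stanzas.txt
/opt/splunk/bin/splunk btool savedsearches list | grep '^\[' | sort > /tmp/saved_stanzas.txt
comm -23 /tmp/corr_stanzas.txt /tmp/saved_stanzas.txt
Anything the last command prints is a correlationsearches.conf stanza with no matching saved search, which is exactly the orphaned condition described above.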
Thanks and kudos to the support people at Splunk; I believe we have a resolution I probably would never have found on my own.
In this case the app causing the friction was DA-ESS-IdentityManagement. In the /etc/apps/DA-ESS-IdentityManagement/local directory there were a couple of files, one of which was correlationsearches.conf, and it had an "extra" stanza for the non-existent (orphaned) search. Removing that file, followed by a reload command:
curl -k -u admin https://your.splunk.server.here:8089/services/alerts/correlationsearches/_reload
has hopefully solved this issue. I will know tomorrow.
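As a quick sanity check (a sketch; replace the grep pattern with the name of your orphaned search, and adjust the path if $SPLUNK_HOME is not /opt/splunk), btool can confirm that no remaining .conf file is still contributing the stanza:
/opt/splunk/bin/splunk btool correlationsearches list --debug | grep -i "orphaned search name here"
If nothing comes back, the orphaned stanza is gone from the layered configuration.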
Hi @dschmidt_cfi
Just to follow up and confirm, did this solve your issue? If yes, please be sure to accept the answer to resolve this question so other users with the same problem can find this post with a concrete solution. Thanks!
Patrick
Tried and failed. Then again... I am going to retrace my steps this afternoon. When the correct solution is found I will be sure to give credit to whoever found it, because I am pretty sure it will not be me (at this point in time).
No problem, glad you found a solution through @jervin_splunk 🙂
Actually, it was me that failed. When the messages returned the following morning at 3 a.m., I thought the process had failed, but it was not until I did another follow-up check that I realized it was the same message, this time caused by a different correlation search. Rinse and repeat the steps, and now everything has cleared up.
Once again, Thanks to everyone.
Thanks for pointing out a new avenue of searching. I still do not have an answer from Splunk, but with your answer I changed my search and found:
2015-02-15 03:08:03,287 ERROR pid=22083 tid=MainThread file=configuration_check.py:run:168 | status="RESTException when executing configuration check" exc="[HTTP 404] https://127.0.0.1:8089/servicesNS/dhorn/DA-ESS-IdentityManagement/saved/searches/Access%20-%20Intera...; [{'code': None, 'text': "\n In handler 'savedsearch': Could not find object id=Access - Interactive Logon by a Service Account - Rule", 'type': 'ERROR'}]"
Traceback (most recent call last):
and the traceback, which gives me a new trail to go hunting down.
(Splitting reply in half due to arbitrary character limits)
There are probably two errors here:
I would recommend running these two queries via curl on your system and appending the output to the support ticket if you have already created one. Also feel free to send me the Salesforce ticket number directly; my username minus the "_splunk", at splunk.com, will get to me.
curl -k -u admin https://127.0.0.1:8089/servicesNS/dhorn/DA-ESS-IdentityManagement/saved/searches/Access%20-%20Intera...
curl -k -u admin https://127.0.0.1:8089/servicesNS/nobody/DA-ESS-IdentityManagement/saved/searches/Access%20-%20Inter...
Also, if a diag hasn't been created for this issue, submit one as we would like to see the configuration files from the system, if possible.
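If you need to generate one, a diag can be created from the command line on the affected search head (assuming a default install path of /opt/splunk):
/opt/splunk/bin/splunk diag
and the resulting tar.gz attached to the case.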
Everyone helped, but Jervin had the correct solution.
Were you able to figure out what was causing that to continually fire? We just noticed it in our environment this morning as well: the same message showing up on the hour. We're still in the middle of our implementation and will reach out to our PS resource as well.