Splunk Enterprise Security

Why am I getting error "msg="A script exited abnormally" input="./bin/collector.path"" every hour?

Explorer

Every hour I receive the error:

msg="A script exited abnormally" input="./bin/collector.path" stanza="default" status="exited with code 24"

Looking in data inputs -> scripts, I see that collector.path comes from app "introspection_generator_addon"

It indeed is in /opt/splunk/etc/apps/introspection_generator_addon/bin and has the line:

"$SPLUNK_HOME/bin/splunkd" instrument-resource-usage

But I don't think the issue comes from there. Continuing to investigate the logs, in splunkd.log I see:

configuration_check.log:2015-12-02 18:00:00,237 ERROR pid=25487 tid=MainThread file=configuration_check.py:run:160 | status="completed" task="confcheck_script_errors" message="msg="A script exited abnormally" input="./bin/collector.path" stanza="default" status="exited with code 24""

Doing a find of configuration_check.py, I see it coming from:

/opt/splunk/etc/apps/SA-Utils/bin/configuration_check.py

which is an app installed with Splunk Enterprise Security.

When running the script, I get:

python /opt/splunk/etc/apps/SA-Utils/bin/configuration_check.py
Traceback (most recent call last):
  File "/opt/splunk/etc/apps/SA-Utils/bin/configuration_check.py", line 14, in <module>
    from splunk.appserver.mrsparkle.lib.util import make_splunkhome_path
  File "/opt/splunk/lib/python2.7/site-packages/splunk/appserver/__init__.py", line 13, in <module>
    from splunk.clilib import bundle_paths
  File "/opt/splunk/lib/python2.7/site-packages/splunk/clilib/bundle_paths.py", line 6, in <module>
    import splunk.clilib.cli_common as comm
  File "/opt/splunk/lib/python2.7/site-packages/splunk/clilib/cli_common.py", line 6, in <module>
    import lxml.etree as etree
ImportError: /usr/lib64/libxml2.so.2: version `LIBXML2_2.9.0' not found (required by /opt/splunk/lib/python2.7/site-packages/lxml/etree.so)

Checking the install version of libxml2 on the system:

rpm -qa | grep libxml
libxml2-2.7.6-20.el6.x86_64
libxml2-python-2.7.6-20.el6.x86_64

So is the script failing due to the version of libxml2?

The system is running RHEL 6.5, and libxml2 2.9.0 is not available for it.

Explorer

We had the exact same error showing up for a while and just sorted it out yesterday with some help from the #splunk IRC denizens (100% credit goes to ^Okie^). I'll put more info on our specific (very weird) issue below and add some info on why the libxml2 error you received is misleading. The underlying problem we encountered was that a check instrument-resource-usage runs at startup was failing, so it would start, silently fail, and loop ad infinitum. Then once an hour configuration_check.py would wander over and throw an entry in the logs saying there was an issue.

That said, do you get any useful console output from running these two commands?

source /opt/splunk/bin/setSplunkEnv
/opt/splunk/bin/splunkd instrument-resource-usage

In our case, we got back

splunk@es_host:~$ source /opt/splunk/bin/setSplunkEnv
Tab-completion of "splunk <verb> <object>" is available.
splunk@es_host:~$ /opt/splunk/bin/splunkd instrument-resource-usage
INFO  RU_main - I-data gathering (Resource Usage) starting; period=10s
Cond 'swapSize_free <= ru._swap.get()' false; line 443, collect_hostwide__Linux()
splunk@es_host:~$

Which prompted a closer-than-eyeballing look at the VM's swap usage. Lo and behold, somehow the VM had blown through its swap partition and claimed it was using -15360/2048 MB swap, which the host system confirmed as 17 GB swap in use with 2 GB allocated. Still trying to sort out how that happened, but clearing swap so the system didn't have clearly illegal numbers in play solved it for us and the hourly error from configuration_check.py hasn't come back.
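
If you want to test for the same impossible swap numbers without eyeballing them, here is a minimal sketch. It assumes the failed condition compares free swap against total swap (my reading of the assertion text, not something confirmed from Splunk source), and check_swap is a hypothetical helper name. It reads /proc/meminfo-style text on stdin and flags the case where SwapFree is larger than SwapTotal.

```shell
# check_swap (hypothetical helper): read /proc/meminfo-style text on stdin
# and verify that SwapFree does not exceed SwapTotal (both are in kB).
check_swap() {
    awk '
        $1 == "SwapTotal:" { total = $2 }
        $1 == "SwapFree:"  { free = $2 }
        END {
            used = total - free
            printf "total=%d kB free=%d kB used=%d kB\n", total, free, used
            # Free swap above total (negative "used") is the impossible state.
            if (free > total) { print "INCONSISTENT"; exit 1 }
            print "OK"
        }
    '
}

# Usage on a live Linux host:
#   check_swap < /proc/meminfo
```

A non-zero return together with the INCONSISTENT line would point at the same kind of accounting problem we hit; an OK result means swap is probably not your culprit.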

I'd guess (hope) your issue is something else, but probably still in the realm of a resource issue or something else causing instrument-resource-usage to fail. There's hopefully something useful in the script output, but if not then iterate through checking various resources - make sure you haven't run into the file handle ceiling from ulimit, etc.
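
For the file handle check, a rough sketch along these lines may help. fd_headroom is a hypothetical helper, and the 90% warning threshold is an arbitrary assumption on my part, not a Splunk recommendation. It takes an open-descriptor count and a limit and reports how much of the limit is used.

```shell
# fd_headroom OPEN LIMIT (hypothetical helper): report descriptor usage and
# return non-zero once usage reaches 90% of the limit (arbitrary threshold).
fd_headroom() {
    local open="$1"
    local limit="$2"
    if [ "$limit" = "unlimited" ]; then
        echo "open=$open limit=unlimited"
        return 0
    fi
    local pct="$(( open * 100 / limit ))"
    echo "open=$open limit=$limit used=${pct}%"
    [ "$pct" -lt 90 ]
}

# Usage on Linux for a running splunkd (PID is a placeholder you fill in):
#   fd_headroom "$(ls /proc/PID/fd | wc -l)" \
#               "$(awk '/^Max open files/ { print $4 }' /proc/PID/limits)"
```

Reading the limit from /proc/PID/limits rather than running ulimit -n in your own shell matters because the limit that applies is the one the running process inherited, which can differ from your login session's.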


LibXML2 / Python Library Notes

The version of libxml2 on the system doesn't actually matter here. Splunk ships with and runs from its own bundled Python and libraries, and scripts it invokes should be using the same.

So while RHEL has this:

splunk@es_host:~$ cat /etc/redhat-release
Red Hat Enterprise Linux Server release 6.8 (Santiago)
splunk@es_host:~$ ls -l /usr/lib64/libxml2.so.2
lrwxrwxrwx 1 root root 16 Dec 19 16:31 /usr/lib64/libxml2.so.2 -> libxml2.so.2.7.6

Splunk uses:

splunk@es_host:~$ ls -l /opt/splunk/lib/libxml2.so
lrwxrwxrwx 1 splunk splunk 16 Feb 12  2016 /opt/splunk/lib/libxml2.so -> libxml2.so.2.9.2
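
If you'd rather compare the two libraries without reading ls -l output, a small sketch like this works. lib_target is a hypothetical helper name, and it assumes a readlink that supports -f (GNU coreutils does). It prints the file name a library symlink finally resolves to.

```shell
# lib_target PATH (hypothetical helper): print the file name that a library
# symlink finally resolves to, or report that the path does not exist.
lib_target() {
    local path="$1"
    if [ ! -e "$path" ]; then
        echo "missing: $path"
        return 1
    fi
    basename "$(readlink -f "$path")"
}

# Usage with the two paths shown above:
#   lib_target /usr/lib64/libxml2.so.2
#   lib_target /opt/splunk/lib/libxml2.so
```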

If you want to run configuration_check.py yourself, you can source Splunk's environment-setting file and have Splunk run the script with its bundled Python, but your output will likely just contain the same error as before:

splunk@es_host:~$ source /opt/splunk/bin/setSplunkEnv
Tab-completion of "splunk <verb> <object>" is available.
splunk@es_host:~$ splunk cmd splunkd print-modinput-config configuration_check configuration_check://confcheck_script_errors | splunk cmd python $SPLUNK_HOME/etc/apps/SA-Utils/bin/configuration_check.py --username=admin
Splunk password:
splunk@es_host:~$ tail -7 /opt/splunk/var/log/splunk/configuration_check.log
2016-12-20 10:25:34,449 INFO pid=26725 tid=MainThread file=configuration_check.py:<module>:199 | status="starting"
2016-12-20 10:25:35,433 INFO pid=26737 tid=MainThread file=configuration_check.py:<module>:199 | status="starting"
2016-12-20 10:25:39,809 INFO pid=26725 tid=MainThread file=configuration_check.py:run:89 | status="executing"
2016-12-20 10:25:39,809 INFO pid=26725 tid=MainThread file=configuration_check.py:run:96 | status="retrieved task" task="confcheck_script_errors"
2016-12-20 10:25:39,884 INFO pid=26725 tid=MainThread file=configuration_check.py:run:106 | status="enabled UI message suppression" task="confcheck_script_errors" pattern="((streamfwd|splunk-(wmi\.path|MonitorNoHandle\.exe|winevtlog\.exe|netmon\.exe|perfmon\.exe|regmon\.exe|winprintmon\.exe|admon\.exe)).*exited with code 1)"
2016-12-20 10:25:39,889 INFO pid=26725 tid=MainThread file=configuration_check.py:run:141 | status="retrieved_checkpoint_data" task="confcheck_script_errors"
2016-12-20 10:25:39,915 INFO pid=26725 tid=MainThread file=configuration_check.py:run:195 | status="exiting" exit_status="0"