Hi all,
Today I've updated Splunk from version 9.2.2 to 9.3.0.
All seems to be good, except that Alert Manager Enterprise 3.0.8 is not working anymore.
I'm kinda new to Splunk, so I don't know where to start.
The error we got is:
Unable to initialize modular input "tag_keeping" defined in the app "alert_manager_enterprise":
Introspecting scheme=tag_keeping: script running failed (PID 4085525 exited with code 1)
Please help me 🙂
Kind regards,
Glenn
So I think I figured it out.
/opt/splunk/bin/splunk cmd python3 house_keeping.py --scheme
Traceback (most recent call last):
File "/opt/splunk/etc/apps/alert_manager_enterprise/bin/house_keeping.py", line 21, in <module>
from datapunctum.factory_logger import Logger
File "/opt/splunk/etc/apps/alert_manager_enterprise/bin/../lib/datapunctum/factory_logger.py", line 10, in <module>
from pydantic import BaseModel
File "/opt/splunk/etc/apps/alert_manager_enterprise/bin/../lib/pydantic/__init__.py", line 372, in __getattr__
module = import_module(module_name, package=package)
File "/opt/splunk/lib/python3.9/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "/opt/splunk/etc/apps/alert_manager_enterprise/bin/../lib/pydantic/main.py", line 11, in <module>
import pydantic_core
File "/opt/splunk/etc/apps/alert_manager_enterprise/bin/../lib/pydantic_core/__init__.py", line 6, in <module>
from ._pydantic_core import (
ModuleNotFoundError: No module named 'pydantic_core._pydantic_core'
However
/opt/splunk/bin/splunk cmd python3.7 house_keeping.py --scheme
<scheme><title>AME house keeping</title><description>This task does everything, that has to be done in the background, like mapping users and checking the events time to live.</description><use_external_validation>true</use_external_validation><use_single_instance>false</use_single_instance><streaming_mode>xml</streaming_mode><endpoint><args><arg name="default"><description>Unused Default Argument</description><data_type>string</data_type><required_on_edit>false</required_on_edit><required_on_create>true</required_on_create></arg></args></endpoint></scheme>
splunk@gdovm:~/etc/apps/alert_manager_enterprise/bin$
And then there is this: https://docs.splunk.com/Documentation/Splunk/9.3.0/ReleaseNotes/MeetSplunk
In this release, the default Python interpreter is set to Python version 3.9. The Python.Version settings has been updated so that the parameter is set to value of force_python3, this forces all Python extension points to use Python 3.9 including overriding any application specified settings.
"This is designed to be secure-by-default for new customers. If the value is set to python3.9, the default interpreter is set to Python 3.9 but applications can choose to use a different value. Python 3.7 continues to be available in the build for customers' private apps."
So the problem is a change in the Python version handling when upgrading to Splunk 9.3.0
Either you can try to figure out how to force the app to revert to Python 3.7, or you can get hold of the developers and let them know the app may no longer work with 9.3.0.
Sorry, not fun to have to be the bearer of bad news
Sorry, I jumped the gun a bit there...
This seems to be a problem with the manual execution of house_keeping.py. There is a shebang in the script which should select the correct version:
#!/usr/bin/env python3.7
So that is most likely just a problem with the manual execution of the script.
Then we're back to checking the "permissions to execute scripts". That may still be the issue 🙂
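To see which interpreter each script's shebang asks for, something like the following could be used. It is only a minimal sketch: `read_shebang` is a hypothetical helper name, and it simply reads the first line of a file. Keep in mind that a shebang only takes effect when the operating system executes the file directly; when an interpreter is named explicitly on the command line (as in `splunk cmd python3 house_keeping.py`), the shebang line is just a comment.

```python
def read_shebang(script_path):
    """Return the interpreter part of a script's shebang line, or None.

    Hypothetical helper: it only inspects the first line of the file and
    does not try to resolve which binary would actually be started.
    """
    with open(script_path, "r", encoding="utf-8", errors="replace") as handle:
        first_line = handle.readline().rstrip("\r\n")
    if first_line.startswith("#!"):
        return first_line[2:].strip()
    return None
```

For example, `read_shebang("/opt/splunk/etc/apps/alert_manager_enterprise/bin/house_keeping.py")` would return the text after `#!` for that file.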
We will continue the restore for now, as our security team is pushing me.
I will check your information when I try another upgrade.
I really appreciate your help! Thank you!
I just have to know, did a rollback solve the issue or does it persist?
We restored a backup.
Splunk is back at version 9.2.2 and everything is working like before.
I checked Alert Manager before upgrading and it should be compatible with 9.3.0.
We will give it another try in a few weeks.
Once again, thanks for your help. Much appreciated.
Good to hear you're back up and running!
It sure does feel like a breaking change/incompatibility between alert manager and Splunk 9.3. Maybe we'll hold off on updating until 9.3.1 🙂
All the best
Sorry I could not be of more assistance.
If file permissions are not the culprit, I still think there might be some issue with the Python version handling. I just can't figure out what that would be though, sorry.
Hm, so it seems that pydantic (or at least pydantic_core) is not available for import. Not sure that house_keeping.py suffers the same problems as tag_keeping.py, but it is a problem for sure.
Did the file permissions look OK?
We have 2 servers, and the permissions look the same as on the server that was not upgraded.
-rw-r--r-- 1 root root 5743 Jul 30 14:26 alertqueue_consumer.py
-rw-r--r-- 1 root root 10515 Jul 30 14:26 command_ameenrich.py
-rw-r--r-- 1 root root 4633 Jul 30 14:26 command_ameevents.py
-rw-r--r-- 1 root root 11458 Jul 30 14:26 create_alert.py
-rw-r--r-- 1 root root 1207 Jul 30 14:26 _env.py
-rw-r--r-- 1 root root 6027 Jul 30 14:26 handler_license.py
-rw-r--r-- 1 root root 4405 Jul 30 14:26 handler_logging.py
-rw-r--r-- 1 root root 3192 Jul 30 14:26 handler_minit.py
-rw-r--r-- 1 root root 3384 Jul 30 14:26 handler_proxy.py
-rw-r--r-- 1 root root 441 Jul 30 14:26 handler.py
-rw-r--r-- 1 root root 3578 Jul 30 14:26 handler_role_utils.py
-rw-r--r-- 1 root root 5497 Jul 30 14:26 house_keeping.py
-rw-r--r-- 1 root root 273 Jul 30 14:26 __init__.py
-rw-r--r-- 1 root root 3832 Jul 30 14:26 notificationqueue_consumer.py
drwxr-xr-x 2 root root 4096 Jul 30 14:26 persistconn
drwx--x--- 2 root root 4096 Jul 30 14:26 __pycache__
-rw-r--r-- 1 root root 3615 Jul 30 14:26 tag_keeping.py
Hold on a second, there is no permission to execute any of the python scripts, right?
Might be something I'm missing but I suspect that might be a problem at least:
-rwxrw-rw- 1 splunk splunk 5,7K jun 28 11:32 alertqueue_consumer.py
-rwxrw-rw- 1 splunk splunk 11K jun 28 11:32 command_ameenrich.py
-rwxrw-rw- 1 splunk splunk 4,6K jun 28 11:32 command_ameevents.py
-rwxrw-rw- 1 splunk splunk 12K jun 28 11:32 create_alert.py
-rwxrw-rw- 1 splunk splunk 1,2K jun 28 11:32 _env.py
-rwxrw-rw- 1 splunk splunk 5,9K jun 28 11:32 handler_license.py
-rwxrw-rw- 1 splunk splunk 4,4K jun 28 11:32 handler_logging.py
-rwxrw-rw- 1 splunk splunk 3,2K jun 28 11:32 handler_minit.py
-rwxrw-rw- 1 splunk splunk 3,4K jun 28 11:32 handler_proxy.py
-rwxrw-rw- 1 splunk splunk 441 jun 28 11:32 handler.py
-rwxrw-rw- 1 splunk splunk 3,5K jun 28 11:32 handler_role_utils.py
-rwxrw-rw- 1 splunk splunk 5,4K jun 28 11:32 house_keeping.py
-rwxrw-rw- 1 splunk splunk 273 jun 28 11:32 __init__.py
-rwxrw-rw- 1 splunk splunk 3,8K jun 28 11:32 notificationqueue_consumer.py
drwxrwxrwx 2 splunk splunk 4,0K jun 28 11:32 persistconn
-rwxrw-rw- 1 splunk splunk 3,6K jun 28 11:32 tag_keeping.py
So maybe a chmod u+x may solve your problems?
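A quick way to list which scripts in a directory lack the owner execute bit is sketched below. The function names are hypothetical, and the check only looks at the `u+x` bit that `chmod u+x` would set. It does not consider who owns the files or which user Splunk runs as.

```python
import os
import stat


def owner_can_execute(path):
    """True if the owner execute bit (u+x) is set on path."""
    return bool(os.stat(path).st_mode & stat.S_IXUSR)


def list_non_executable_scripts(directory):
    """Return the sorted names of .py files in directory without u+x."""
    names = []
    for entry in sorted(os.listdir(directory)):
        full_path = os.path.join(directory, entry)
        if (
            entry.endswith(".py")
            and os.path.isfile(full_path)
            and not owner_can_execute(full_path)
        ):
            names.append(entry)
    return names
```

Calling `list_non_executable_scripts("/opt/splunk/etc/apps/alert_manager_enterprise/bin")` on the first listing in this thread would be expected to name every script, since all of them show `-rw-r--r--`.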
Same problem. But on the other server it has the same permissions and it works without any problems.
-rwxr--r-- 1 root root 5743 Jul 30 14:26 alertqueue_consumer.py
-rwxr--r-- 1 root root 10515 Jul 30 14:26 command_ameenrich.py
-rwxr--r-- 1 root root 4633 Jul 30 14:26 command_ameevents.py
-rwxr--r-- 1 root root 11458 Jul 30 14:26 create_alert.py
-rwxr--r-- 1 root root 1207 Jul 30 14:26 _env.py
-rwxr--r-- 1 root root 6027 Jul 30 14:26 handler_license.py
-rwxr--r-- 1 root root 4405 Jul 30 14:26 handler_logging.py
-rwxr--r-- 1 root root 3192 Jul 30 14:26 handler_minit.py
-rwxr--r-- 1 root root 3384 Jul 30 14:26 handler_proxy.py
-rwxr--r-- 1 root root 441 Jul 30 14:26 handler.py
-rwxr--r-- 1 root root 3578 Jul 30 14:26 handler_role_utils.py
-rwxr--r-- 1 root root 5497 Jul 30 14:26 house_keeping.py
-rwxr--r-- 1 root root 273 Jul 30 14:26 __init__.py
-rwxr--r-- 1 root root 3832 Jul 30 14:26 notificationqueue_consumer.py
drwxr-xr-x 2 root root 4096 Jul 30 14:26 persistconn
drwx--x--- 2 root root 4096 Jul 30 14:26 __pycache__
-rwxr--r-- 1 root root 3615 Jul 30 14:26 tag_keeping.py
We will restore a backup as we want the system to be up again. I will try an upgrade in a few weeks. If anything changes, I will update the post.
Thank you for your help!
OK, assuming you are running Splunk as root that seems to check out.
Splunk's Python cannot itself import pydantic, though it seems to be bundled with the app, so maybe that is just how the command is executed (which is pretty bad, but still). I see a bunch of "from pathlib import Path", so I'm assuming this is to import from locally bundled versions.
Not sure what the issue is then, I'll see if I can recreate the error on a local box
There seems to be a Python script "alert_manager_enterprise/bin/tag_keeping.py" which fails to run, causing the error.
The script offers a way to test output through: /opt/splunk/bin/splunk cmd python3 house_keeping.py --scheme
I don't use this app, so this is more "in general" and "best guess". Initially I would check that the script has the correct permissions set to be run.
Then (assuming it is safe to run the "test command" above) see if you can manually execute the command.
It's possible that the update caused some problem with permissions for script execution, or (have not checked) there was an update to the python version which now is incompatible with the script bundled with the alert manager app (probably less likely though).
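If it helps to capture the exit code and the error output of that manual test in one go, a small wrapper along these lines could be used. This is a sketch only: `run_scheme_check` is a hypothetical name, and the command passed to it is whatever you would otherwise type in the shell.

```python
import subprocess


def run_scheme_check(command, cwd=None, timeout=60):
    """Run a command and return (exit_code, stdout, stderr) as text.

    `command` is a list of arguments, for example:
    ["/opt/splunk/bin/splunk", "cmd", "python3", "house_keeping.py", "--scheme"]
    """
    completed = subprocess.run(
        command,
        cwd=cwd,
        capture_output=True,  # collect stdout and stderr instead of printing
        text=True,            # decode the output to str
        timeout=timeout,
    )
    return completed.returncode, completed.stdout, completed.stderr
```

A non-zero exit code together with a traceback in the third element would match the "exited with code 1" message from the original error.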
root@dc-splunk01://opt/splunk/etc/apps/alert_manager_enterprise/bin# /opt/splunk/bin/splunk cmd python3 house_keeping.py --scheme
Traceback (most recent call last):
File "/opt/splunk/etc/apps/alert_manager_enterprise/bin/house_keeping.py", line 21, in <module>
from datapunctum.factory_logger import Logger
File "/opt/splunk/etc/apps/alert_manager_enterprise/bin/../lib/datapunctum/factory_logger.py", line 10, in <module>
from pydantic import BaseModel
File "/opt/splunk/etc/apps/alert_manager_enterprise/bin/../lib/pydantic/__init__.py", line 372, in __getattr__
module = import_module(module_name, package=package)
File "/opt/splunk/lib/python3.9/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "/opt/splunk/etc/apps/alert_manager_enterprise/bin/../lib/pydantic/main.py", line 11, in <module>
import pydantic_core
File "/opt/splunk/etc/apps/alert_manager_enterprise/bin/../lib/pydantic_core/__init__.py", line 6, in <module>
from ._pydantic_core import (
ModuleNotFoundError: No module named 'pydantic_core._pydantic_core'
New install of the app gave the same result.
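The traceback ends at `pydantic_core._pydantic_core`, which is a compiled extension module rather than plain Python. One possible explanation (an assumption, not confirmed anywhere in this thread) is that the copy bundled with the app was built for a different Python version than the 3.9 interpreter shown in the traceback. The sketch below lists the compiled files in a package directory and reports whether the interpreter running it would accept their file names. The function names are hypothetical.

```python
import importlib.machinery
import sys
from pathlib import Path


def find_compiled_modules(package_dir, stem="_pydantic_core"):
    """Return sorted file names of compiled extension files for `stem`."""
    package_dir = Path(package_dir)
    if not package_dir.is_dir():
        return []
    return sorted(
        p.name
        for p in package_dir.iterdir()
        if p.name.startswith(stem + ".") and p.suffix in (".so", ".pyd")
    )


def loadable_by_this_interpreter(file_name, stem="_pydantic_core"):
    """True if file_name uses a suffix this interpreter imports extensions from."""
    return any(
        file_name == stem + suffix
        for suffix in importlib.machinery.EXTENSION_SUFFIXES
    )


if __name__ == "__main__":
    # Path taken from the traceback (bin/../lib/pydantic_core); adjust as needed.
    default_dir = "/opt/splunk/etc/apps/alert_manager_enterprise/lib/pydantic_core"
    target = sys.argv[1] if len(sys.argv) > 1 else default_dir
    for name in find_compiled_modules(target):
        print(name, "loadable:", loadable_by_this_interpreter(name))
```

Running it once with `/opt/splunk/bin/splunk cmd python3` and once with `/opt/splunk/bin/splunk cmd python3.7` should show whether the bundled file matches only one of the two interpreters.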