Alert Manager Enterprise won't start after Splunk update to 9.3.0

Glennuss
Explorer

Hi all,
Today I updated Splunk from version 9.2.2 to 9.3.0.
Everything seems fine except that Alert Manager Enterprise 3.0.8 is not working anymore.

I'm fairly new to Splunk, so I don't know where to start.
The error we get is:
Unable to initialize modular input "tag_keeping" defined in the app "alert_manager_enterprise":
Introspecting scheme=tag_keeping: script running failed (PID 4085525 exited with code 1)

Please help me 🙂

Kind regards,

Glenn

fatsug
Contributor

So I think I figured it out.

/opt/splunk/bin/splunk cmd python3 house_keeping.py --scheme
Traceback (most recent call last):
  File "/opt/splunk/etc/apps/alert_manager_enterprise/bin/house_keeping.py", line 21, in <module>
    from datapunctum.factory_logger import Logger
  File "/opt/splunk/etc/apps/alert_manager_enterprise/bin/../lib/datapunctum/factory_logger.py", line 10, in <module>
    from pydantic import BaseModel
  File "/opt/splunk/etc/apps/alert_manager_enterprise/bin/../lib/pydantic/__init__.py", line 372, in __getattr__
    module = import_module(module_name, package=package)
  File "/opt/splunk/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/opt/splunk/etc/apps/alert_manager_enterprise/bin/../lib/pydantic/main.py", line 11, in <module>
    import pydantic_core
  File "/opt/splunk/etc/apps/alert_manager_enterprise/bin/../lib/pydantic_core/__init__.py", line 6, in <module>
    from ._pydantic_core import (
ModuleNotFoundError: No module named 'pydantic_core._pydantic_core'

However, the same command run with python3.7 succeeds:

/opt/splunk/bin/splunk cmd python3.7 house_keeping.py --scheme
<scheme><title>AME house keeping</title><description>This task does everything, that has to be done in the background, like mapping users and checking the events time to live.</description><use_external_validation>true</use_external_validation><use_single_instance>false</use_single_instance><streaming_mode>xml</streaming_mode><endpoint><args><arg name="default"><description>Unused Default Argument</description><data_type>string</data_type><required_on_edit>false</required_on_edit><required_on_create>true</required_on_create></arg></args></endpoint></scheme>
splunk@gdovm:~/etc/apps/alert_manager_enterprise/bin$

And then there is this: https://docs.splunk.com/Documentation/Splunk/9.3.0/ReleaseNotes/MeetSplunk

"In this release, the default Python interpreter is set to Python version 3.9. The Python.Version settings has been updated so that the parameter is set to value of force_python3, this forces all Python extension points to use Python 3.9 including overriding any application specified settings."

"This is designed to be secure-by-default for new customers. If the value is set to python3.9, the default interpreter is set to Python 3.9 but applications can choose to use a different value. Python 3.7 continues to be available in the build for customers' private apps."

So the problem is a change in Python version handling introduced by the upgrade to Splunk 9.3.0.
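
One way to check this theory, offered as a rough sketch rather than anything from the app's documentation: the module named in the traceback, pydantic_core._pydantic_core, is a compiled extension, and CPython only loads compiled files whose filename suffix matches the running interpreter. The script below lists the compiled files in the bundled pydantic_core directory and reports which ones the current interpreter would accept. The directory is taken from the traceback; the function name and the example suffix in the docstring are my own assumptions.

```python
import importlib.machinery
import sys
from pathlib import Path

# Directory taken from the traceback (bin/../lib/pydantic_core); adjust if yours differs.
DEFAULT_DIR = "/opt/splunk/etc/apps/alert_manager_enterprise/lib/pydantic_core"


def classify_extensions(package_dir, module_name="_pydantic_core"):
    """Return (loadable, other) compiled files for module_name in package_dir.

    A file is "loadable" when its suffix is one this interpreter accepts,
    e.g. ".cpython-39-x86_64-linux-gnu.so" on a 64-bit Linux Python 3.9.
    """
    accepted = importlib.machinery.EXTENSION_SUFFIXES
    loadable, other = [], []
    for path in sorted(Path(package_dir).glob(module_name + ".*")):
        # Everything after the module name is the suffix to compare.
        suffix = path.name[len(module_name):]
        if suffix in accepted:
            loadable.append(path.name)
        elif suffix.endswith((".so", ".pyd")):
            other.append(path.name)
    return loadable, other


if __name__ == "__main__":
    target = sys.argv[1] if len(sys.argv) > 1 else DEFAULT_DIR
    if Path(target).is_dir():
        ok, other = classify_extensions(target)
        print("interpreter:", sys.version.split()[0])
        print("loadable by this interpreter:", ok or "none")
        print("built for another interpreter:", other or "none")
    else:
        print("directory not found:", target)
```

Run it with the interpreter Splunk would use, for example "/opt/splunk/bin/splunk cmd python3 check_ext.py" (check_ext.py is just a placeholder filename). If nothing is listed as loadable under Python 3.9, that would suggest the bundled pydantic_core was built for a different Python version.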

You can either try to force the app back to Python 3.7, or contact the developers and let them know the app may no longer work with 9.3.0.
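
Before changing anything, it may help to see where python.version is set at all. This is only a sketch: it scans the text of a single .conf file for python.version lines and reports the stanza each one sits in. It does not reproduce Splunk's layering of default and local files, and the function name, stanza names and sample text are made up for illustration (only the values force_python3 and python3.9 come from the release notes quoted above).

```python
def find_python_version(conf_text):
    """Return (stanza, value) pairs for every python.version line in conf_text."""
    stanza = None  # None means the line appeared before any [stanza] header
    found = []
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            stanza = line[1:-1]
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == "python.version":
            found.append((stanza, value.strip()))
    return found


if __name__ == "__main__":
    # Hypothetical sample; replace with the contents of a real .conf file.
    sample = """
[general]
python.version = force_python3

[example_stanza]
python.version = python3.9
"""
    for stanza, value in find_python_version(sample):
        print(f"[{stanza}] python.version = {value}")
```

Feeding it the contents of the conf files you suspect shows quickly whether the forced setting or an app-level setting is in play.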

Sorry, it's no fun being the bearer of bad news.

fatsug
Contributor

Sorry, I jumped the gun a bit there...

This seems to be a problem with the manual execution of house_keeping.py. The script has a shebang that should select the correct version:

#!/usr/bin/env python3.7

So the import error is most likely just an artifact of running the script manually with python3.
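
For context: a shebang only takes effect when the file itself is executed. When an interpreter is named on the command line, as in "splunk cmd python3 house_keeping.py", that interpreter runs the script and the first line is treated as a comment. A small sketch for checking which interpreter each script asks for (the function name is mine, not part of the app):

```python
import sys


def read_shebang(path):
    """Return the shebang command of a script as a list of words, or None."""
    with open(path, "rb") as handle:
        first_line = handle.readline()
    if not first_line.startswith(b"#!"):
        return None
    return first_line[2:].decode("utf-8", errors="replace").split()


if __name__ == "__main__":
    # Pass one or more script paths, e.g. house_keeping.py tag_keeping.py
    for script in sys.argv[1:]:
        print(script, "->", read_shebang(script))
```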

Then we're back to checking the "permissions to execute scripts". That may still be the issue 🙂

Glennuss
Explorer

We will continue the restore for now, as our security team is pushing me.
I will check your information when I try another upgrade.

I really appreciate your help! Thank you!

fatsug
Contributor

I just have to know, did a rollback solve the issue or does it persist? 

Glennuss
Explorer

We restored a backup.

Splunk is back at version 9.2.2 and everything is working like before.
I checked Alert Manager before upgrading, and it should be compatible with 9.3.0.
We will give it another try in a few weeks.

Once again, thanks for your help. Much appreciated.

fatsug
Contributor

Good to hear you're back up and running!

It sure does feel like a breaking change/incompatibility between alert manager and Splunk 9.3. Maybe we'll hold off on updating until 9.3.1 🙂

All the best

fatsug
Contributor

Sorry I could not be of more assistance.

If file permissions are not the culprit, I still think there might be some issue with the Python version handling. I just can't figure out what that would be, sorry.

fatsug
Contributor

Hm, so it seems that pydantic (or at least pydantic_core) is not available for import. Not sure that house_keeping.py suffers the same problems as tag_keeping.py, but it is a problem for sure.

Did the file permissions look OK?

Glennuss
Explorer

We have two servers. The permissions look the same on the server that was not upgraded.

-rw-r--r-- 1 root root 5743 Jul 30 14:26 alertqueue_consumer.py
-rw-r--r-- 1 root root 10515 Jul 30 14:26 command_ameenrich.py
-rw-r--r-- 1 root root 4633 Jul 30 14:26 command_ameevents.py
-rw-r--r-- 1 root root 11458 Jul 30 14:26 create_alert.py
-rw-r--r-- 1 root root 1207 Jul 30 14:26 _env.py
-rw-r--r-- 1 root root 6027 Jul 30 14:26 handler_license.py
-rw-r--r-- 1 root root 4405 Jul 30 14:26 handler_logging.py
-rw-r--r-- 1 root root 3192 Jul 30 14:26 handler_minit.py
-rw-r--r-- 1 root root 3384 Jul 30 14:26 handler_proxy.py
-rw-r--r-- 1 root root 441 Jul 30 14:26 handler.py
-rw-r--r-- 1 root root 3578 Jul 30 14:26 handler_role_utils.py
-rw-r--r-- 1 root root 5497 Jul 30 14:26 house_keeping.py
-rw-r--r-- 1 root root 273 Jul 30 14:26 __init__.py
-rw-r--r-- 1 root root 3832 Jul 30 14:26 notificationqueue_consumer.py
drwxr-xr-x 2 root root 4096 Jul 30 14:26 persistconn
drwx--x--- 2 root root 4096 Jul 30 14:26 __pycache__
-rw-r--r-- 1 root root 3615 Jul 30 14:26 tag_keeping.py

fatsug
Contributor

Hold on a second, there is no permission to execute any of the python scripts, right?

Might be something I'm missing but I suspect that might be a problem at least:

-rwxrw-rw- 1 splunk splunk 5,7K jun 28 11:32 alertqueue_consumer.py
-rwxrw-rw- 1 splunk splunk 11K jun 28 11:32 command_ameenrich.py
-rwxrw-rw- 1 splunk splunk 4,6K jun 28 11:32 command_ameevents.py
-rwxrw-rw- 1 splunk splunk 12K jun 28 11:32 create_alert.py
-rwxrw-rw- 1 splunk splunk 1,2K jun 28 11:32 _env.py
-rwxrw-rw- 1 splunk splunk 5,9K jun 28 11:32 handler_license.py
-rwxrw-rw- 1 splunk splunk 4,4K jun 28 11:32 handler_logging.py
-rwxrw-rw- 1 splunk splunk 3,2K jun 28 11:32 handler_minit.py
-rwxrw-rw- 1 splunk splunk 3,4K jun 28 11:32 handler_proxy.py
-rwxrw-rw- 1 splunk splunk 441 jun 28 11:32 handler.py
-rwxrw-rw- 1 splunk splunk 3,5K jun 28 11:32 handler_role_utils.py
-rwxrw-rw- 1 splunk splunk 5,4K jun 28 11:32 house_keeping.py
-rwxrw-rw- 1 splunk splunk 273 jun 28 11:32 __init__.py
-rwxrw-rw- 1 splunk splunk 3,8K jun 28 11:32 notificationqueue_consumer.py
drwxrwxrwx 2 splunk splunk 4,0K jun 28 11:32 persistconn
-rwxrw-rw- 1 splunk splunk 3,6K jun 28 11:32 tag_keeping.py

So maybe a chmod u+x will solve your problem?
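
To compare the two servers without reading ls output by eye, something like the following sketch could list the scripts that lack the owner execute bit. The bin path is the one from the traceback; the function name is made up, and whether Splunk actually needs the execute bit on these scripts is exactly the open question here.

```python
import stat
import sys
from pathlib import Path

# Path taken from the traceback in this thread; adjust for your install.
DEFAULT_BIN = "/opt/splunk/etc/apps/alert_manager_enterprise/bin"


def scripts_without_owner_exec(bin_dir):
    """Return names of .py files in bin_dir that lack the owner execute bit."""
    missing = []
    for path in sorted(Path(bin_dir).glob("*.py")):
        mode = path.stat().st_mode
        if not mode & stat.S_IXUSR:
            missing.append(path.name)
    return missing


if __name__ == "__main__":
    target = sys.argv[1] if len(sys.argv) > 1 else DEFAULT_BIN
    if Path(target).is_dir():
        for name in scripts_without_owner_exec(target):
            print("no u+x:", name)
    else:
        print("directory not found:", target)
```

An empty result on both servers would point away from permissions as the cause.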

Glennuss
Explorer

Same problem. But on the other server it has the same permissions and it works without any problems.

-rwxr--r-- 1 root root  5743 Jul 30 14:26 alertqueue_consumer.py
-rwxr--r-- 1 root root 10515 Jul 30 14:26 command_ameenrich.py
-rwxr--r-- 1 root root  4633 Jul 30 14:26 command_ameevents.py
-rwxr--r-- 1 root root 11458 Jul 30 14:26 create_alert.py
-rwxr--r-- 1 root root  1207 Jul 30 14:26 _env.py
-rwxr--r-- 1 root root  6027 Jul 30 14:26 handler_license.py
-rwxr--r-- 1 root root  4405 Jul 30 14:26 handler_logging.py
-rwxr--r-- 1 root root  3192 Jul 30 14:26 handler_minit.py
-rwxr--r-- 1 root root  3384 Jul 30 14:26 handler_proxy.py
-rwxr--r-- 1 root root   441 Jul 30 14:26 handler.py
-rwxr--r-- 1 root root  3578 Jul 30 14:26 handler_role_utils.py
-rwxr--r-- 1 root root  5497 Jul 30 14:26 house_keeping.py
-rwxr--r-- 1 root root   273 Jul 30 14:26 __init__.py
-rwxr--r-- 1 root root  3832 Jul 30 14:26 notificationqueue_consumer.py
drwxr-xr-x 2 root root  4096 Jul 30 14:26 persistconn
drwx--x--- 2 root root  4096 Jul 30 14:26 __pycache__
-rwxr--r-- 1 root root  3615 Jul 30 14:26 tag_keeping.py

We will restore a backup as we want the system to be up again. I will try an upgrade in a few weeks. If anything changes, I will update the post.
Thank you for your help!

fatsug
Contributor

OK, assuming you are running Splunk as root that seems to check out.

Splunk's own Python cannot import pydantic, but the package seems to be bundled with the app, so maybe the failure is just down to how the command is executed (which is pretty bad, but still). I see a bunch of "from pathlib import Path" in the scripts, so I'm assuming that is there to import the locally bundled versions.

Not sure what the issue is then. I'll see if I can recreate the error on a local box.

fatsug
Contributor

There seems to be a Python script, "alert_manager_enterprise/bin/tag_keeping.py", which fails to run, causing the error.

The script offers a way to test output through: /opt/splunk/bin/splunk cmd python3 house_keeping.py --scheme

I don't use this app, so this is more "in general" and a "best guess". Initially I would check that the script has the correct permissions set to be run.

Then (assuming it is safe to run the "test command" above) see if you can manually execute the command.

It's possible that the update caused a problem with the permissions for script execution, or (I have not checked) that the Python version was updated and is now incompatible with the scripts bundled with the Alert Manager app (probably less likely, though).

Glennuss
Explorer

root@dc-splunk01://opt/splunk/etc/apps/alert_manager_enterprise/bin# /opt/splunk/bin/splunk cmd python3 house_keeping.py --scheme
Traceback (most recent call last):
  File "/opt/splunk/etc/apps/alert_manager_enterprise/bin/house_keeping.py", line 21, in <module>
    from datapunctum.factory_logger import Logger
  File "/opt/splunk/etc/apps/alert_manager_enterprise/bin/../lib/datapunctum/factory_logger.py", line 10, in <module>
    from pydantic import BaseModel
  File "/opt/splunk/etc/apps/alert_manager_enterprise/bin/../lib/pydantic/__init__.py", line 372, in __getattr__
    module = import_module(module_name, package=package)
  File "/opt/splunk/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/opt/splunk/etc/apps/alert_manager_enterprise/bin/../lib/pydantic/main.py", line 11, in <module>
    import pydantic_core
  File "/opt/splunk/etc/apps/alert_manager_enterprise/bin/../lib/pydantic_core/__init__.py", line 6, in <module>
    from ._pydantic_core import (
ModuleNotFoundError: No module named 'pydantic_core._pydantic_core'

New install of the app gave the same result.
