Hi all,
We are trying to deploy pre-trained Deep Learning models for ESCU. DSDL has been installed and the containers are loaded successfully. The connection to Docker is also in good shape.
But when running the ESCU search, I am getting the following error messages.
MLTKC error: /apply: ERROR: unable to initialize module. Ended with exception: No module named 'keras_preprocessing'
MLTKC parameters: {'params': {'mode': 'stage', 'algo': 'pretrained_dga_model_dsdl'}, 'args': ['is_dga', 'domain'], 'target_variable': ['is_dga'], 'feature_variables': ['domain'], 'model_name': 'pretrained_dga_model_dsdl', 'algo_name': 'MLTKContainer', 'mlspl_limits': {'handle_new_cat': 'default', 'max_distinct_cat_values': '100', 'max_distinct_cat_values_for_classifiers': '100', 'max_distinct_cat_values_for_scoring': '100', 'max_fit_time': '600', 'max_inputs': '100000', 'max_memory_usage_mb': '4000', 'max_model_size_mb': '30', 'max_score_time': '600', 'use_sampling': 'true'}, 'kfold_cv': None, 'dispatch_dir': '/opt/splunk/var/run/splunk/dispatch/1704812182.86156_AC9C076F-2C37-4E94-9DD0-0AE04AEB7952'}
From search.log
01-09-2024 09:56:44.725 INFO ChunkedExternProcessor [47063 ChunkedExternProcessorStderrLogger] - stderr: MLTKC endpoint: https://docker_host:32802
01-09-2024 09:56:44.850 INFO ChunkedExternProcessor [47063 ChunkedExternProcessorStderrLogger] - stderr: POST endpoint [https://docker_host:32802/apply] called with payload (2298991 bytes)
01-09-2024 09:56:45.166 INFO ChunkedExternProcessor [47063 ChunkedExternProcessorStderrLogger] - stderr: POST endpoint [https://docker_host:32802/apply] returned with payload (134 bytes) with status 200
01-09-2024 09:56:45.166 ERROR ChunkedExternProcessor [47063 ChunkedExternProcessorStderrLogger] - stderr: MLTKC error: /apply: ERROR: unable to initialize module. Ended with exception: No module named 'keras_preprocessing'
01-09-2024 09:56:45.167 ERROR ChunkedExternProcessor [47063 ChunkedExternProcessorStderrLogger] - stderr: MLTKC parameters: {'params': {'mode': 'stage', 'algo': 'pretrained_dga_model_dsdl'}, 'args': ['is_dga', 'domain'], 'target_variable': ['is_dga'], 'feature_variables': ['domain'], 'model_name': 'pretrained_dga_model_dsdl', 'algo_name': 'MLTKContainer', 'mlspl_limits': {'handle_new_cat': 'default', 'max_distinct_cat_values': '100', 'max_distinct_cat_values_for_classifiers': '100', 'max_distinct_cat_values_for_scoring': '100', 'max_fit_time': '600', 'max_inputs': '100000', 'max_memory_usage_mb': '4000', 'max_model_size_mb': '30', 'max_score_time': '600', 'use_sampling': 'true'}, 'kfold_cv': None, 'dispatch_dir': '/opt/splunk/var/run/splunk/dispatch/1704812182.86156_AC9C076F-2C37-4E94-9DD0-0AE04AEB7952'}
01-09-2024 09:56:45.167 ERROR ChunkedExternProcessor [47063 ChunkedExternProcessorStderrLogger] - stderr: apply ended with options {'params': {'mode': 'stage', 'algo': 'pretrained_dga_model_dsdl'}, 'args': ['is_dga', 'domain'], 'target_variable': ['is_dga'], 'feature_variables': ['domain'], 'model_name': 'pretrained_dga_model_dsdl', 'algo_name': 'MLTKContainer', 'mlspl_limits': {'handle_new_cat': 'default', 'max_distinct_cat_values': '100', 'max_distinct_cat_values_for_classifiers': '100', 'max_distinct_cat_values_for_scoring': '100', 'max_fit_time': '600', 'max_inputs': '100000', 'max_memory_usage_mb': '4000', 'max_model_size_mb': '30', 'max_score_time': '600', 'use_sampling': 'true'}, 'kfold_cv': None, 'dispatch_dir': '/opt/splunk/var/run/splunk/dispatch/1704812182.86156_AC9C076F-2C37-4E94-9DD0-0AE04AEB7952'}
Has anyone run into this before?
We have the Golden Image CPU container running.
The following shows up in the container logs.
Thanks
Hey there,
My guess is that you built a custom model in the JupyterLab environment, where you also installed the keras package and imported the preprocessing functionality (from keras import preprocessing). You can probably fit and apply your model just fine inside JupyterLab. The problem is that once you fit and apply the model from Splunk SPL, you get the error you described.
Before you do anything else, make sure that the keras package is imported in the correct cell of your Jupyter notebook. Only the cells that are exported to the generated .py file are available to Splunk, so an import placed in any other cell will be missing at runtime. If you started from the barebone_template.ipynb, this is the cell:
You can check whether you import packages in the correct cell by navigating to /app/model/your_notebook.py and checking whether the keras package is imported there. This is the file that is used once you issue the | fit or | apply command in Splunk. Here is an example of what the .py file generated from the barebone_template.ipynb looks like:
If this resolved your issue, great. If not, keep reading.
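For the check described above, a rough sketch like the following could list which top-level modules a piece of exported code imports, so you do not have to read the whole file by eye. The helper name imported_modules is made up for illustration, and the inline snippet stands in for the contents of /app/model/your_notebook.py, which you would read from disk inside the container.

```python
import ast


def imported_modules(source):
    """Return the set of top-level module names imported in the given Python source."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # e.g. "import keras.preprocessing" -> "keras"
            for alias in node.names:
                modules.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # e.g. "from keras import preprocessing" -> "keras"
            modules.add(node.module.split(".")[0])
    return modules


# Stand-in for the exported file; in practice, read the real file's text instead.
example_source = "import numpy as np\nfrom keras import preprocessing\n"
print(sorted(imported_modules(example_source)))
```

If keras (or whatever package your model needs) does not show up in the result for your exported file, the import is most likely sitting in a cell that is not exported.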
The docker container image you use must have the keras library installed; otherwise the library is not available to the | fit and | apply commands in Splunk SPL.
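To see whether the running container actually has the library, you could run something like this in a notebook cell or Python shell inside that container. It is only a sketch: the helper name missing_modules is made up, and the module list is an assumption based on the error message in this thread, so adjust it to what your model needs.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that this Python interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Module names taken from the error in this thread; extend the list as needed.
print(missing_modules(["keras", "keras_preprocessing"]))
```

Any name that is printed is not installed in the image, which would explain a "No module named ..." error when the model is initialized.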
Try resolving your issue by ...
... using the pre-built 'Transformers CPU (5.1.1)' container image
... using the pre-built 'Transformers GPU (5.1.1)' container image
... building your own docker image as described here
Make sure to update your DSDL app to the latest version in order to have these pre-built container images available.
Let me know if I can help you any further.