Can anyone list the hardware requirements and dependencies associated with using the Machine Learning Toolkit?
I think it's awesome and want to start using it.
For example, do you need a dedicated search head? What is the ideal configuration? Do the models create volumes and take up space? I have a complex Core and Splunk Enterprise Security deployment, and I want to know how installing and using the MLTK will affect storage, licensing (indexing/day), performance, etc.
Can anyone point me to the documentation related to these concerns?
Thank you for the link.
I have looked through this again and I don't see anything addressing my concerns. Maybe there is no issue with the MLTK, the /models subfolder, etc.
Maybe you can check our Machine Learning Performance App: https://splunkbase.splunk.com/app/3289/
Using the dashboard in this app, you can browse the results of performance testing of ML-SPL. For each algorithm implemented in ML-SPL, we measure running time, CPU utilization, memory utilization, and disk activity when fitting models on up to 1,000,000 search results, and applying models on up to 10,000,000 search results, each with up to 50 fields.
Thank you for the suggestion. Can you share any anonymized information regarding the space requirements you have seen when fitting an algorithm to your data and applying a model to new data? Did you have to significantly increase storage capacity? I am relatively new to the admin side of Splunk and am trying to predict whether we will need more resources, without impacting what is currently running.
Model size varies by algorithm, but you can control the size of a model by configuring it in mlspl.conf.
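As a rough sketch of what such a configuration could look like: the setting names below are the ones I recall from the MLTK's default mlspl.conf, while the file path, the values, and the per-algorithm override are illustrative assumptions only. Please check them against the mlspl.conf.spec shipped with your MLTK version before applying anything.

```
# Assumed location: $SPLUNK_HOME/etc/apps/Splunk_ML_Toolkit/local/mlspl.conf
# All values here are examples, not recommendations.

[default]
# Upper bound on the size of a saved model, in MB
max_model_size_mb = 15
# Upper bound on memory a fit/apply search may use, in MB
max_memory_usage_mb = 1000
# Maximum number of events used when fitting a model
max_inputs = 100000
# Maximum time a fit may run, in seconds
max_fit_time = 600

# Hypothetical per-algorithm override: allow larger models for one algorithm only
[RandomForestClassifier]
max_model_size_mb = 50
```

Keeping the [default] limits low and raising them only for the algorithms you actually use is one way to keep the disk and memory impact on an existing search head predictable.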