Machine Learning Toolkit - Density Function
Hello,
I'm trying to use the Machine Learning Toolkit to build a model from a historical time frame and then check whether the current time window stands out from the average behavior for this hour of the day on this day of the week.
When trying to build the model with a command like this:
index=onlineservices "USERACTION: Successful login" earliest=-33d@d latest=-1d@d
| bin _time span=15m
| eval date_minutebin=strftime(_time, "%M")
| eval date_hour=strftime(_time, "%H")
| eval date_wday=strftime(_time, "%A")
| stats count by _time date_minutebin date_hour date_wday
| fit DensityFunction count by "date_minutebin,date_hour,date_wday"
into mydensitymodel threshold=0.5 dist=norm sample=True
I get the following error:
Error in 'fit' command: Error while fitting "DensityFunction" model: 'module' object has no attribute 'wasserstein_distance'
I have also tried changing the metric to kolmogorov_smirnov, but then I get a different error:
Error in 'fit' command: Error while fitting "DensityFunction" model: 'functools.partial' object has no attribute '__module__'
Can someone help me with this? Why does the default metric not work in my case?
Below is an example of the data returned by the first part of the search, before piping into the fit command:
_time date_minutebin date_hour date_wday count
2021-04-29 00:00:00 00 00 Thursday 4
2021-04-29 00:15:00 15 00 Thursday 3
2021-04-29 00:30:00 30 00 Thursday 2
2021-04-29 00:45:00 45 00 Thursday 3
2021-04-29 01:00:00 00 01 Thursday 2
2021-04-29 01:15:00 15 01 Thursday 1
2021-04-29 01:45:00 45 01 Thursday 2
2021-04-29 02:00:00 00 02 Thursday 3
2021-04-29 02:15:00 15 02 Thursday 1
2021-04-29 02:30:00 30 02 Thursday 2
2021-04-29 02:45:00 45 02 Thursday 1
2021-04-29 03:00:00 00 03 Thursday 1
2021-04-29 03:15:00 15 03 Thursday 2
2021-04-29 03:30:00 30 03 Thursday 1
2021-04-29 03:45:00 45 03 Thursday 2
2021-04-29 04:00:00 00 04 Thursday 1
2021-04-29 04:15:00 15 04 Thursday 2
2021-04-29 04:30:00 30 04 Thursday 1
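To make the intent concrete, here is a rough sketch of the same idea in plain Python: fit one normal distribution per group and flag a new count that falls in the tails. It is not how MLTK implements DensityFunction. The names `fit_by_group` and `is_outlier` are hypothetical, the grouping is simplified to `date_hour` only (using the sample rows above), and `threshold` is interpreted as the total two-sided tail area, which is an assumption of this sketch.

```python
from collections import defaultdict
from statistics import NormalDist, mean, stdev

# (date_hour, count) pairs copied from the sample rows above.
ROWS = [
    ("00", 4), ("00", 3), ("00", 2), ("00", 3),
    ("01", 2), ("01", 1), ("01", 2),
    ("02", 3), ("02", 1), ("02", 2), ("02", 1),
    ("03", 1), ("03", 2), ("03", 1), ("03", 2),
    ("04", 1), ("04", 2), ("04", 1),
]


def fit_by_group(rows):
    """Fit one normal distribution per group key (hypothetical helper)."""
    groups = defaultdict(list)
    for key, count in rows:
        groups[key].append(count)
    # stdev() needs at least two points, so smaller groups are skipped.
    # A group whose counts are all identical would get sigma == 0 and
    # could not be scored with cdf(); none of the sample groups is like that.
    return {
        key: NormalDist(mean(values), stdev(values))
        for key, values in groups.items()
        if len(values) >= 2
    }


def is_outlier(model, key, count, threshold=0.05):
    """Flag count if its two-sided tail probability is below threshold.

    Assumption of this sketch: threshold is the total area of both tails.
    """
    dist = model[key]
    tail = 2 * min(dist.cdf(count), 1 - dist.cdf(count))
    return tail < threshold


model = fit_by_group(ROWS)
print(is_outlier(model, "00", 10))  # far above the counts seen at hour 00
print(is_outlier(model, "00", 3))   # equal to the mean for hour 00
```

With only a handful of points per group the fitted distributions are very rough, which is why a month of history per hour and weekday is used in the real search.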
Using your sample data, I receive no errors (just training size warnings) with the following baseline:
Splunk Enterprise 8.1.1
Python for Scientific Computing 2.0.1
Splunk Machine Learning Toolkit 5.2.0
Is your training data clean?
Do you have the correct combination of Splunk Enterprise, MLTK, and Python for Scientific Computing installed? See https://docs.splunk.com/Documentation/MLApp/5.2.1/User/MLTKversiondepends. Note that Splunk Enterprise Security has separate requirements.
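The first error reads as if the SciPy bundled with your installed Python for Scientific Computing does not provide `scipy.stats.wasserstein_distance`, which was added in SciPy 1.0. A minimal way to check, assuming you can run a script with the interpreter that ships inside the PSC app (its location depends on your platform and install); `has_attribute` is a hypothetical helper, not part of MLTK:

```python
import importlib


def has_attribute(module_name, attribute):
    """Return True if the module can be imported and exposes the attribute."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module itself is missing from this interpreter.
        return False
    return hasattr(module, attribute)


if __name__ == "__main__":
    # False here would be consistent with the 'wasserstein_distance' error.
    print(has_attribute("scipy.stats", "wasserstein_distance"))
```

If this prints False, the likely fix is a newer PSC and MLTK combination rather than a change to the search.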
Thank you for the suggestion.
We have Splunk version 7.2.8
Machine Learning Toolkit 4.5, the newest we could install on 7.2.8.
We will consider upgrading, or at least check our combination of Splunk Enterprise, MLTK, and Python for Scientific Computing.
@tscroggins: Thank you. It works now. We have updated to PSC 1.4 - density fit is supported from this version on.