I want to run a daily alert to check for outliers in host crashes via the MLTK time series forecast algorithm; however, the syntax is not optimal for forecasting multiple hosts, so I have an initial filter which shows the list of hosts whose number of crashes is higher than average. I want to take this output and then, for each host in the list, run the outlier detection as follows:
| timechart span=1d sum(VOLUME)
| predict "sum(VOLUME)" as prediction algorithm="LLP5" future_timespan="30" holdback="14" period=7 lower"95"=lower"95" upper"95"=upper"95"
| eval isOutlier = if(prediction!="" AND 'sum(VOLUME)' !="" AND ('sum(VOLUME)' < 'lower95(prediction)' OR 'sum(VOLUME)' > 'upper95(prediction)'), 1, 0)
| where isOutlier=1
| fields - isOutlier
But I'm not sure of the best way to go about this. I know I can output the results from my initial filtering search to a lookup and then have separate queries that say "for the host from row 1, run outlier detection," and then "for the host from row 2, run outlier detection," etc., but this would require a separate alert query for every row I want to include. What I would really like is a query that iterates through the results of my initial filter and, for each row, grabs the host and runs the outlier detection. Is there a way to run a loop like this?
Hi @TylerJVitale,
So far this works:
| makeresults
| eval hosts_predict=split("host1,host2,host3,host4,host5", ",")
| mvexpand hosts_predict
| map maxsearches=5 search="search index=\"index_to_search_in\" latest=\"-0d@d\" host=\"$hosts_predict$\" | table _time host VOLUME | bin _time span=1d | stats sum(VOLUME) as sum_VOLUME by _time host | predict sum_VOLUME as prediction algorithm=\"LLP5\" future_timespan=\"30\" holdback=\"14\" period=7 lower\"95\"=lower\"95\" upper\"95\"=upper\"95\" | filldown host"
| eval isOutlier=if(sum_VOLUME < 'lower95(prediction)' OR sum_VOLUME > 'upper95(prediction)', 1, 0)
| where isOutlier=1
| fields - isOutlier
A little explanation:
I agree with @grana_splunk that it is highly recommended to evaluate another way to accomplish the outlier detection logic. Here are several alternatives for building the list of hosts:
To replace the makeresults with a search, you could do the following:
index="index_to_search_in"
| table host
| dedup host
| rename host as hosts_predict
| map ...
With a lookup:
| inputlookup list_of_hosts.csv
| fields host
| rename host as hosts_predict
| map ...
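As a rough sketch of how an initial filter could feed map directly, without a lookup in between: the filter below is only an assumption, since the original filter is not shown in the question. It keeps hosts whose total VOLUME over the last 30 days is above the average across hosts; the names total_volume and avg_volume, the 30-day window, and maxsearches=20 are all made up for illustration. The map string is the same one used in the full query above.

```
index="index_to_search_in" earliest=-30d@d latest=@d
| stats sum(VOLUME) as total_volume by host
| eventstats avg(total_volume) as avg_volume
| where total_volume > avg_volume
| fields host
| rename host as hosts_predict
| map maxsearches=20 search="search index=\"index_to_search_in\" latest=\"-0d@d\" host=\"$hosts_predict$\" | table _time host VOLUME | bin _time span=1d | stats sum(VOLUME) as sum_VOLUME by _time host | predict sum_VOLUME as prediction algorithm=\"LLP5\" future_timespan=\"30\" holdback=\"14\" period=7 lower\"95\"=lower\"95\" upper\"95\"=upper\"95\" | filldown host"
| eval isOutlier=if(sum_VOLUME < 'lower95(prediction)' OR sum_VOLUME > 'upper95(prediction)', 1, 0)
| where isOutlier=1
| fields - isOutlier
```

Note that maxsearches caps how many rows map will process, so it should be at least as large as the number of hosts the filter is expected to return.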
Hope it helps
This might work. I would have loved to use the DensityFunction but we don't have the MLTK 4.2, and IQR or StandardDeviation won't work because they can't filter seasonality and trend.
Few questions:
Thanks,
Tyler
I will edit the answer so it reflects how you could replace the makeresults with a lookup or a search.
Also, the predict command requires a preceding timechart, at least in my version of the MLTK. And then with timechart it gets all messy if you try to predict by host.
You could replace stats with chart or with timechart before the predict command, specifying span=1d, as follows:
With chart:
...
| map maxsearches=5 search="search index=\"index_to_search_in\" latest=\"-0d@d\" host=\"$hosts_predict$\" | table _time host VOLUME | bin _time span=1d | chart sum(VOLUME) as sum_VOLUME last(host) as host by _time | predict sum_VOLUME as prediction algorithm=\"LLP5\" future_timespan=\"30\" holdback=\"14\" period=7 lower\"95\"=lower\"95\" upper\"95\"=upper\"95\" | filldown host"
...
Or with timechart:
...
| map maxsearches=5 search="search index=\"index_to_search_in\" latest=\"-0d@d\" host=\"$hosts_predict$\" | table _time host VOLUME | timechart sum(VOLUME) as sum_VOLUME last(host) as host span=1d | predict sum_VOLUME as prediction algorithm=\"LLP5\" future_timespan=\"30\" holdback=\"14\" period=7 lower\"95\"=lower\"95\" upper\"95\"=upper\"95\" | filldown host"
...
A confidence interval has nothing to do with whether a point is an outlier. Please do not use forecasting to find your outliers. I would suggest you go through this blog post and look into the new algorithm we have in MLTK: https://www.splunk.com/blog/2019/03/20/what-s-new-in-the-splunk-machine-learning-toolkit-4-2.html
The forecasting algorithm in MLTK has an outlier panel, so why shouldn't I use it? It does exactly what I want it to, creating a model that accounts for seasonality and trend and then constructing a CI around that. If the number of crashes falls outside that CI, I would like to be alerted. Why is this not okay?
As for the new MLTK, we're not up to date on it and I'm not sure if/when we will upgrade, so this will have to do for now.
Hi @TylerJVitale. Have you tried the map command? I'm not sure, though, whether it's optimal to use it in conjunction with the predict command.
https://docs.splunk.com/Documentation/Splunk/7.3.0/SearchReference/Map
This seems like it could work. I'm just having difficulty figuring out how to configure it. At the end of my initial query, I have a table with host avg VOLUME. I want to run the timechart and prediction for each host, but even just tacking on something like
|map search="search index=index sourcetype="sourcetype" host="$host$"
| timechart span=1h sum(VOLUME)"
gives me no results, so I'm not sure where the issue is or how to fix it. My best guess is it's something with the search ID field.
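One possible cause, offered as an assumption rather than a confirmed diagnosis, is the unescaped double quotes inside the map search string: the first inner quote would end the string early. A minimal sketch of the same attempt with the inner quotes escaped, keeping the placeholder index and sourcetype names from the snippet above and assuming the preceding results contain a field named host:

```
...
| map maxsearches=10 search="search index=index sourcetype=\"sourcetype\" host=\"$host$\" | timechart span=1h sum(VOLUME)"
```

Here $host$ is replaced with the value of the host field from each input row, and maxsearches caps how many rows are processed.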
Ok, I will try to make a test and try to have an answer; meanwhile, for outlier detection you could read the following:
https://docs.splunk.com/Documentation/Splunk/7.3.0/Search/Findingandremovingoutliers