
Deploy model from mltk toolkit experiment to predict next 5 values


Hello Colleagues,

I created an experiment to predict numerical values and have a model generated and published.
Sorry for the naive question, but how would I predict, say, the next 5 values of my CPU consumption with this model?
I can use the apply command, fine, but it returns only the current value and the prediction for that current value, not anything in the future.
How would I predict the future values?

At the moment I have:

| mstats avg(_value) as value where `nmon_metrics_index` (metric_name=os.unix.nmon.cpu.cpu_all.Sys_PCT OR metric_name=os.unix.nmon.cpu.cpu_all.User_PCT OR metric_name=os.unix.nmon.cpu.cpu_all.Wait_PCT) host=spwdfvml0957* groupby metric_name, OStype, host span=1m
| `def_cpu_load_percent` | apply "pgt_test_cpu_prediction_model"

But as mentioned, it produces a table with cpuloadpercent and predicted(cpuloadpercent). How would I get predictions for the future?

Kind Regards,
Kamil


Re: Deploy model from mltk toolkit experiment to predict next 5 values


Hi @damucka, I understand what you mean. Which algorithm are you using? The thing to understand here is that when you say "predict next 5 values", that is possible only with time series models. You are probably using a regression algorithm (random forest?), and with that you cannot forecast the next set of values (nor is it intended for that): a regression model only predicts the target for the feature values you give it. To give an example from a working model that I currently use in my project: I have a good CPU prediction model whose input fields are the time (on a 24-hour clock) and the day of the week (1-7). Now, how do I perform a prediction with it? There are 2 ways to do so:
1. Have the user input the time of day and the day of the week, then apply your model. The result is the predicted CPU for that specific day at that hour.
2. Give the user a default view with all the days and hours (like the table with all predicted values you are currently getting), and then ask the user to choose the day/time in a dashboard dropdown.
What I have done is provide this default table (in an area visualization) and enable only the day-of-the-week dropdown as a user input. When the user chooses a day (say, Monday), I output the predicted CPU values for all 24 hours of Monday. This lets users choose when to perform activities that can lead to system load. We have been using this model to plan business campaigns, which generally stress the CPU. Campaign managers can now select a specific day, see the predicted CPU values across the 24 hours of that day, and schedule their campaigns at a time when the predicted CPU is low, which also gives the business sufficient time to adjust to the functional campaign demands.
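As a rough sketch of how this can be turned into "future" predictions with a regression model: generate rows for the upcoming hours, derive the same time features the model was trained on, and apply the model to those rows. Everything named here is an assumption for illustration: the model name hypothetical_cpu_time_model, the feature fields hour and day_of_week, and the 1-7 numbering with Sunday as 1. The search only works if the model was fitted on fields with exactly these names and encodings.

```
| makeresults count=5
| streamstats count as step
| eval _time = relative_time(now(), "@h") + step * 3600
| eval hour = tonumber(strftime(_time, "%H"))
| eval day_of_week = tonumber(strftime(_time, "%w")) + 1
| apply "hypothetical_cpu_time_model" as predicted_cpu
| table _time hour day_of_week predicted_cpu
```

This returns one predicted value for each of the next 5 full hours. For the dashboard variant, the same idea applies with count=24, hour set to step - 1, and day_of_week taken from the dropdown token instead of being derived from _time.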
If you are using time series forecasting (the Splunk 'predict' command), you can use the future_timespan option to predict the next 5 values.
So, the model algorithm you are using defines your prediction outcomes...
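For the time series route, a minimal sketch based on the search from the question could look like the following. It is restricted to a single metric for simplicity, and the output field names cpu_user_pct and predicted_cpu, as well as the choice of algorithm=LLP5, are assumptions rather than anything from the original model. Note that with the host wildcard this averages across all matching hosts.

```
| mstats avg(_value) as value where `nmon_metrics_index` metric_name=os.unix.nmon.cpu.cpu_all.User_PCT host=spwdfvml0957* span=1m
| timechart span=1m avg(value) as cpu_user_pct
| predict cpu_user_pct as predicted_cpu algorithm=LLP5 future_timespan=5
```

With span=1m, future_timespan=5 extends the series by 5 one-minute points beyond the last observed value; the predict command also adds upper and lower confidence bound fields alongside the prediction.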
