I am storing my customers' device logs in my index.
Each customer has many devices, and each device has a file path.
I have the last 30 days of data.
In the Splunk Machine Learning Toolkit I used Predict Numeric Fields, and RandomForestRegressor gave me the best R2 value, 0.71.
Now I want to predict future values.
How do I do it?
Please help me.
You will need to apply your model after fitting it. Once your model is applied, you can
predict into the future. Your R2 is a little low; I would suggest adding more explanatory fields to increase it. You could use
the correlate command or the Patterns tab to help identify other fields that may improve the accuracy of your target predictions.
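As a rough sketch of what "apply" looks like in a search, assuming the model was saved under the name device_prediction_randomforest (the name used later in this thread) and that the base search and field names match the ones posted there. The output field name predicted_d_used is made up for illustration.

```spl
index=cus_data splunk_server=CustomerData originalsourcetype=rawData
| bin _time span=1d
| table _time, cust_name, device_name, idx_label, d_used
| apply device_prediction_randomforest as predicted_d_used
| table _time, cust_name, device_name, idx_label, d_used, predicted_d_used
```

Note that apply only adds a prediction to the rows it is given, so it scores existing events. It does not create rows for dates that are not in the search results.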
Thank you very much!!
But I am new to Splunk, and new to machine learning as well.
I think I follow you, but not completely.
I wish you could show me how to do this, but I don't know whether I can ask for that.
Anyway, thank you!
The Machine Learning app is more advanced, and I think it would be best to get an understanding of Splunk itself before diving into it.
Forget about the ML app and play with the
predict command for now.
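A minimal example of the predict command to start with, assuming the index and field names that the poster gives later in this thread. The names avg_used and prediction are arbitrary, and LLT (local level trend) is just one of the available algorithms, picked here because it does not need a seasonal period in the data.

```spl
index=cus_data splunk_server=CustomerData originalsourcetype=rawData
| timechart span=1d avg(d_used) as avg_used
| predict avg_used as prediction algorithm=LLT future_timespan=3
```

This forecasts one series (the daily average across everything the search returns) three days ahead, with upper and lower confidence bounds alongside the prediction.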
Yes sir, I played with it, but I can only predict for a whole customer, not per customer per device per filesystem. I am unable to break the prediction down that way.
Let me give you an example of the data:
cus_name  device_name  idx_label       disk_used
Alex      pixel        /var            356216
Alex      pixel        /var/log        2576
Alex      pixel        /home           4567
Tom       apple        /var            7656
Tom       apple        /               71928
Mary      Note8        /var/log/audit  69897
Mary      Note8        /var            98709
Like this, each customer has a large number of devices, and each device has different filesystems that data is written to. This is the log data that comes into our indexers, and we are storing it.
So my team wants a prediction for each customer, per device, per filesystem.
All I am getting is a prediction of avg(d_used): I fit an algorithm and then forecast its averaged output into the future.
index=cus_data splunk_server=CustomerData originalsourcetype=rawData
| bin _time span=1d
| table _time, cust_name, device_name, idx_label, d_used, d_used_percent
| fit RandomForestRegressor "d_used" from "_time" "cust_name" "device_name" "idx_label" into "device_prediction_randomforest"
| table _time, "d_used", "predicted(d_used)"
| rename "predicted(d_used)" as Dused
| timechart span=1d avg(Dused)
| predict "avg(Dused)" as prediction algorithm="LLP5" future_timespan=3
I want the prediction per customer, per device_name, per idx_label.
Thanks in advance!!
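One possible way to get a number per customer, per device, per filesystem is to fit a straight-line trend for each group with stats and extend it a few days. This is only a sketch under assumptions: it swaps in a plain least-squares line in place of RandomForestRegressor or predict's LLP5, it assumes the fields are named cust_name, device_name, idx_label and d_used as in the search above, and the names first_time, x, slope, intercept and predicted_d_used_in_3_days are made up for illustration. The 3-day horizon mirrors future_timespan=3.

```spl
index=cus_data splunk_server=CustomerData originalsourcetype=rawData
| bin _time span=1d
| stats avg(d_used) as d_used by _time, cust_name, device_name, idx_label
| eventstats min(_time) as first_time by cust_name, device_name, idx_label
| eval x=(_time - first_time)/86400
| stats count as n, sum(x) as sx, sum(d_used) as sy, sum(eval(x*d_used)) as sxy, sum(eval(x*x)) as sxx, max(x) as last_x by cust_name, device_name, idx_label
| where n*sxx > sx*sx
| eval slope=(n*sxy - sx*sy)/(n*sxx - sx*sx)
| eval intercept=(sy - slope*sx)/n
| eval predicted_d_used_in_3_days=round(intercept + slope*(last_x + 3), 0)
| table cust_name, device_name, idx_label, slope, predicted_d_used_in_3_days
```

Here x is the number of days since the first data point of each group, slope is the daily growth in d_used, and the where clause drops groups with fewer than two distinct days, for which no line can be fitted. A straight line will miss weekly patterns or sudden cleanups, so treat the result as a first estimate and compare it against the actual values before relying on it.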