Dashboards & Visualizations

Is there a way to specify a different time range for predict?

kiamco
Path Finder

I know that the predict command becomes more accurate when you feed it more data, but I don't want to query 2 months' worth of data in a dashboard that would take about 2 minutes to load. Is there a way to get a more accurate prediction without actively querying the past 2 months, or a way to do this with a different function? FYI, I do not have the authority to download the MLTK.

I know this is a tough question but would like to hear some ideas.

index=summary source="summary_events_2" 
orig_source=/var/log/pnr*
ms_region=us-west-1
ms_level=E*
| timechart span=15m  sum(count) as count 
| predict count as count_prediction period=7 algorithm=LLP5 future_timespan=10 holdback=0 upper50=high_prediction lower5=low_prediction
| rename high_prediction(count_prediction) as high_prediction
| eval deviation=count-round(count_prediction,0)
| streamstats window=300 current=true median(deviation) as median_of_residual
| eval abs_dev=(abs(deviation - median_of_residual))
| streamstats window=300 current=true median(abs_dev) as median_abs_dev
| eval upper_bound=if(median_of_residual + median_abs_dev * 5 < 0,abs(median_of_residual + median_abs_dev), median_of_residual + median_abs_dev * 5) 
| eval anomaly=if(deviation > upper_bound,1,0)
| predict deviation as deviation_prediction period=7 algorithm=LLP5 future_timespan=0 holdback=0 upper20=high_prediction lower20=low_prediction
| fields -  median_of_residual, median_abs_dev, abs_dev, high_prediction, bounds, count, count_prediction

woodcock
Esteemed Legend

I agree with @DalJeanis. In particular, if this is the only search like this, report acceleration is the easiest and best option for you. If you could use MLTK, you could do a one-time learning over a huge time span and true this up periodically, but that's out. Also, check out this INCREDIBLE answer by @mmodestino here:

https://answers.splunk.com/answers/511894/how-to-use-the-timewrap-command-and-set-an-alert-f.html


kiamco
Path Finder

@mmodestino explained it so well. Thanks!!!

DalJeanis
SplunkTrust

This is a good use case for an accelerated report, accelerated data model or a summary index. If your report is going to be based on summarized 15m increments, then it makes more sense for the system to be calculating each 15m increment once, rather than going back two months to do so.

Start with accelerating the report, which should work for your use case.

ACCELERATED REPORT

https://docs.splunk.com/Documentation/Splunk/7.1.2/Report/Acceleratereports

ACCELERATED DATA MODEL

https://docs.splunk.com/Documentation/Splunk/latest/Knowledge/Acceleratedatamodels

SUMMARY INDEXING

https://docs.splunk.com/Documentation/Splunk/7.1.2/Knowledge/Usesummaryindexing
https://www.splunk.com/view/SP-CAAACZW

kiamco
Path Finder

I thought of using a summary index too, but if I run a summary index search every 15m, wouldn't that affect the accuracy of predict? For example, a query with predict that runs over 2 months would get a more accurate prediction than one that runs over 4 hours, or am I misunderstanding the predict command? I am also not sure how the accelerated report works. I have read the documentation, but I don't really see how it would solve my issue.

DalJeanis
SplunkTrust

@kiamco - The summary index would contain the pre-summarized data. The predict could then run quickly across any length of time, and would not have to analyze the data at the event level ever again, which is what takes the majority of the CPU time.
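
A rough sketch of that idea, under some assumptions: the source name pnr_errors_15m is made up for illustration, and the first search is assumed to be saved as a scheduled search that fires every 15 minutes on the quarter hour, so that its time range lines up with the 15m buckets. It rolls the filtered events up once and writes one row per bucket with collect:

```
index=summary source="summary_events_2" orig_source=/var/log/pnr* ms_region=us-west-1 ms_level=E* earliest=-15m@m latest=@m
| bin _time span=15m
| stats sum(count) as count by _time
| collect index=summary source="pnr_errors_15m"
```

The dashboard search could then read the rolled-up rows over the full two months and run predict on them, reusing the predict options from the original question:

```
index=summary source="pnr_errors_15m" earliest=-60d@d latest=now
| timechart span=15m sum(count) as count
| predict count as count_prediction algorithm=LLP5 period=7 future_timespan=10 holdback=0
```

Because the second search only reads one small row per 15m bucket, the length of the lookback should matter much less for load time, while predict still sees the whole history. Treat the schedule and the 60-day window as placeholders to adjust for your environment.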
