Prediction Function Algorithms Questions

lfetky
New Member

I'm currently trying to translate Splunk functions into SAS, and was hoping for some clarification on the prediction function.

  1. Are the algorithms Splunk uses for the prediction models proprietary? If not, is there any further documentation/explanation concerning the predict function algorithms Splunk uses? We are hoping to replicate the predict function analysis from Splunk in SAS, and want to be sure we fully understand the step-by-step calculations Splunk uses as we do so.
    1. How do we interpret the following prediction algorithms: LL, LLP, LLT, LLB?
    2. How do we interpret the lower 95 and upper 95 (prediction count)?
    3. Can you please give us a real-world example where “malware” fell above or below the predicted value’s confidence interval based on the dataset used and the time series model utilized?

Thanks!

nnguyen_splunk
Splunk Employee
  1. All the algorithms are based on the Kalman filter, which is not proprietary. However, some of the variations we came up with are proprietary. Explanations of the Kalman filter can be found in the outside literature. One book I found useful while implementing the predict command is "An Introduction to State Space Time Series Analysis" by Commandeur and Koopman.
  2. Local Level (LL): a univariate model with no trend and no seasonality. Seasonal Local Level (LLP): a univariate model with seasonality; the periodicity of the time series is computed automatically. Local Level Trend (LLT): a univariate model with trend but no seasonality. Bivariate Local Level (LLB): a bivariate model with no trend and no seasonality.
    1. The lower 95 and upper 95 values specify a confidence interval in which we expect 95% of the predictions to fall.
    2. Most of the time, some of the predictions will fall outside the confidence interval. That is normal because (1) the confidence interval does not cover 100% of the predictions, and (2) the confidence interval is a probabilistic expectation, and observed data will not match that expectation exactly.
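
For anyone porting this to SAS, it may help to see the textbook local level (LL) model run through a Kalman filter with a 95% band. The following is only a minimal sketch of the standard formulation found in state space texts, not Splunk's actual implementation, which may differ in its proprietary variations. The function name `local_level_forecast`, the parameters `var_eps` and `var_xi` (observation and level noise variances), and the diffuse starting variance `p0` are all assumptions made for illustration. The variances are treated as known here; in practice they would typically be estimated from the data, for example by maximum likelihood.

```python
import math

def local_level_forecast(series, var_eps, var_xi, horizon, p0=1e7):
    """Textbook local level model (illustrative, not Splunk's code):
         y_t      = mu_t + eps_t,   eps_t ~ N(0, var_eps)
         mu_{t+1} = mu_t + xi_t,    xi_t  ~ N(0, var_xi)
    Returns a list of (prediction, lower95, upper95), one per forecast step.
    """
    a = float(series[0])  # predicted level, started at the first observation
    p = p0                # large variance = vague prior on the level (assumption)

    # Filtering pass: update the level estimate with each observation.
    for y in series:
        f = p + var_eps              # variance of the one-step prediction error
        k = p / f                    # Kalman gain
        a = a + k * (y - a)          # move the level toward the observation
        p = p * (1.0 - k) + var_xi   # variance of the next predicted level

    # Forecasting pass: the level stays flat, the uncertainty grows.
    z = 1.959963984540054            # two-sided 95% normal quantile
    forecasts = []
    for _ in range(horizon):
        half_width = z * math.sqrt(p + var_eps)
        forecasts.append((a, a - half_width, a + half_width))
        p += var_xi                  # each extra step adds level noise
    return forecasts

# Hypothetical event counts, made up for this example.
counts = [12, 15, 11, 14, 13, 16, 12, 15]
for pred, lower95, upper95 in local_level_forecast(counts, var_eps=4.0, var_xi=1.0, horizon=3):
    print(f"{pred:.2f}  [{lower95:.2f}, {upper95:.2f}]")
```

Under this model, the point prediction is the same for every future step while the band widens with the horizon. An observation that later lands outside the band is not necessarily an anomaly: about 5% of points are expected to fall outside a 95% interval even when the model fits well.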