Other Usage

How to customize external function for normal distribution?

POR160893
Builder

Hi,

I am doing statistical analysis on a number of indexes for time series forecasting.

On reading the following article, its gives a sample SPL query as follows:
| gentimes start=”01/01/2018" increment=1h
| eval _time=starttime, loc=0, scale=20
| normal loc=loc scale=scale
| streamstats count as cnt
| eval gen_normal = gen_normal + cnt
| table _time, gen_normal
| rename gen_normal as “Non-stationary time series (trend)”

[Article is this: ]https://towardsdatascience.com/time-series-forecasting-with-splunk-part-i-intro-kalman-filter-46e4bf...

The "normal" command is a cutom external command and I wanted to ask how and where I can get such statistical functions into Splunk?


Many thanks as always,

0 Karma
1 Solution

tscroggins
Influencer

As an aside, I don't know of any generally available statistical package for Splunk that contains generating commands for commonly used distributions. I write macros as needed. For example (with no guarantee of correctness!):

 

# macros.conf

[expinv(2)]
args = p,b
definition = "exact(-(1/$b$)*ln(1-$p$))"
iseval = 1

[lognorminv(3)]
args = p,u,s
definition = "exact(exp($u$ + $s$ * if($p$ < 0.5, -1 * (sqrt(-2.0 * ln($p$)) - ((0.010328 * sqrt(-2.0 * ln($p$)) + 0.802853) * sqrt(-2.0 * ln($p$)) + 2.515517) / (((0.001308 * sqrt(-2.0 * ln($p$)) + 0.189269) * sqrt(-2.0 * ln($p$)) + 1.432788) * sqrt(-2.0 * ln($p$)) + 1.0)), (sqrt(-2.0 * ln(1 - $p$)) - ((0.010328 * sqrt(-2.0 * ln(1 - $p$)) + 0.802853) * sqrt(-2.0 * ln(1 - $p$)) + 2.515517) / (((0.001308 * sqrt(-2.0 * ln(1 - $p$)) + 0.189269) * sqrt(-2.0 * ln(1 - $p$)) + 1.432788) * sqrt(-2.0 * ln(1 - $p$)) + 1.0)))))"
iseval = 1

[weibullinv(3)]
args = p,a,b
definition = "exact($a$*pow(-ln(1-$p$),1/$b$))"
iseval = 1

 

 

View solution in original post

tscroggins
Influencer

Hi,

I read the same article several years ago and created a macros similar to Excel's NORMINV and RAND just for this purpose:

 

# macros.conf

[norminv(3)]
args = p,u,s
definition = "exact($u$ + $s$ * if($p$ < 0.5, -1 * (sqrt(-2.0 * ln($p$)) - ((0.010328 * sqrt(-2.0 * ln($p$)) + 0.802853) * sqrt(-2.0 * ln($p$)) + 2.515517) / (((0.001308 * sqrt(-2.0 * ln($p$)) + 0.189269) * sqrt(-2.0 * ln($p$)) + 1.432788) * sqrt(-2.0 * ln($p$)) + 1.0)), (sqrt(-2.0 * ln(1 - $p$)) - ((0.010328 * sqrt(-2.0 * ln(1 - $p$)) + 0.802853) * sqrt(-2.0 * ln(1 - $p$)) + 2.515517) / (((0.001308 * sqrt(-2.0 * ln(1 - $p$)) + 0.189269) * sqrt(-2.0 * ln(1 - $p$)) + 1.432788) * sqrt(-2.0 * ln(1 - $p$)) + 1.0))))"
iseval = 1

[rand]
definition = "random()/2147483647"
iseval = 1

 

There's further discussion of the macro implementation in a previous post: https://community.splunk.com/t5/Splunk-Search/Outlier-Dip-Trough-Detection/m-p/550122/highlight/true... 

To recreate the toy example from the original article:

| gentimes start="01/01/2018" end="01/22/2018" increment=1h 
| eval _time=starttime, loc=0, scale=20 
| eval gen_normal=`norminv("`rand()`", loc, scale)` 
| streamstats count as cnt 
| eval gen_normal=gen_normal+cnt
| table _time gen_normal
| rename gen_normal as "Non-stationary time series (trend)"
| predict algorithm=LLT future_timespan=200 "Non-stationary time series (trend)"
0 Karma

tscroggins
Influencer

As an aside, I don't know of any generally available statistical package for Splunk that contains generating commands for commonly used distributions. I write macros as needed. For example (with no guarantee of correctness!):

 

# macros.conf

[expinv(2)]
args = p,b
definition = "exact(-(1/$b$)*ln(1-$p$))"
iseval = 1

[lognorminv(3)]
args = p,u,s
definition = "exact(exp($u$ + $s$ * if($p$ < 0.5, -1 * (sqrt(-2.0 * ln($p$)) - ((0.010328 * sqrt(-2.0 * ln($p$)) + 0.802853) * sqrt(-2.0 * ln($p$)) + 2.515517) / (((0.001308 * sqrt(-2.0 * ln($p$)) + 0.189269) * sqrt(-2.0 * ln($p$)) + 1.432788) * sqrt(-2.0 * ln($p$)) + 1.0)), (sqrt(-2.0 * ln(1 - $p$)) - ((0.010328 * sqrt(-2.0 * ln(1 - $p$)) + 0.802853) * sqrt(-2.0 * ln(1 - $p$)) + 2.515517) / (((0.001308 * sqrt(-2.0 * ln(1 - $p$)) + 0.189269) * sqrt(-2.0 * ln(1 - $p$)) + 1.432788) * sqrt(-2.0 * ln(1 - $p$)) + 1.0)))))"
iseval = 1

[weibullinv(3)]
args = p,a,b
definition = "exact($a$*pow(-ln(1-$p$),1/$b$))"
iseval = 1

 

 

POR160893
Builder

Perfect! Thank you so much for this information.

Career Survey
First 500 qualified respondents will receive a $20 gift card! Tell us about your professional Splunk journey.
Get Updates on the Splunk Community!

Observe and Secure All Apps with Splunk

 Join Us for Our Next Tech Talk: Observe and Secure All Apps with SplunkAs organizations continue to innovate ...

What's New in Splunk Observability - August 2025

What's New We are excited to announce the latest enhancements to Splunk Observability Cloud as well as what is ...

Introduction to Splunk AI

How are you using AI in Splunk? Whether you see AI as a threat or opportunity, AI is here to stay. Lucky for ...