What does `| where prediction!="" AND logins!=""` do when detecting outliers?

Communicator

Hi

I am using the MLTK.
I have a question about the use case "Detect Numeric Outliers", specifically line #4.
Why is it important when detecting outliers? I have plotted 2 graphs. Graph 1 uses line #4 and Graph 2 does not.

To me, Graph 2 seems the more accurate one because it shows the forecast (future_timespan=172) from 30 Nov to 4 Dec. Graph 1 simply drops those days (it only goes up to 30 Nov).

1.  | inputlookup logins.csv 
2.  | predict logins as prediction algorithm=LLP future_timespan=172 holdback=36
3.  | eval residual = prediction - logins
4.  | where prediction!="" AND logins!="" 
5.  | table _time, logins prediction residual

[Graph 1: USING where prediction!="" AND logins!=""]

[Graph 2: WITHOUT where prediction!="" AND logins!=""]


Contributor

You can't really tell which graph is more accurate from the forecast timespan alone.

The only reason the first graph does not show the forecast data is the `| where prediction!="" AND logins!=""` filter: logins is always null in the future, so every forecast row is dropped.

What `| where prediction!="" AND logins!=""` really does is eliminate every row where logins or prediction is null. I'm not sure that's what you wanted.
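To make the effect concrete, here is a minimal sketch in plain Python rather than SPL. The rows, dates, and values are made up for illustration, and the helper names `add_residual` and `drop_nulls` are hypothetical. It assumes forecast rows carry a prediction but no logins value, which is how the graphs above behave.

```python
# Hypothetical rows shaped like the output of `predict`: the last two rows are
# forecast points, so they have a prediction but no observed logins (None).
rows = [
    {"_time": "Nov 29", "logins": 120, "prediction": 118.0},
    {"_time": "Nov 30", "logins": 95, "prediction": 101.5},
    {"_time": "Dec 01", "logins": None, "prediction": 99.0},
    {"_time": "Dec 02", "logins": None, "prediction": 97.2},
]


def add_residual(row):
    """Copy the row and add residual = prediction - logins (None if either is missing)."""
    out = dict(row)
    if row["logins"] is None or row["prediction"] is None:
        out["residual"] = None
    else:
        out["residual"] = row["prediction"] - row["logins"]
    return out


def drop_nulls(rows):
    """Keep only rows where both logins and prediction are present."""
    return [
        r for r in rows
        if r["logins"] is not None and r["prediction"] is not None
    ]


with_residual = [add_residual(r) for r in rows]  # like Graph 2: forecast rows kept
filtered = drop_nulls(with_residual)             # like Graph 1: forecast rows dropped

print("without filter:", [r["_time"] for r in with_residual])
print("with filter:   ", [r["_time"] for r in filtered])
```

With the filter, only the rows that have an observed logins value survive, so the chart ends at the last observed day. Without it, the forecast rows stay in the table with an empty residual.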

Hope this helps. Thanks!

Communicator

Hi

Yes, that line is there to avoid nulls. And it should be in position 3, not 4, because "eval" cannot compute a residual when either value is null. I have used it that way in another use case.
In this use case, I prefer to remove it because I want to see the forecast.

 1.  | inputlookup logins.csv 
 2.  | predict logins as prediction algorithm=LLP future_timespan=172 holdback=36
 3.  | where prediction!="" AND logins!="" 
 4.  | eval residual = prediction - logins
 5.  | table _time, logins prediction residual

Contributor

Cool! Thanks
