What does `| where prediction!="" AND logins!=""` do when detecting outliers?

rosho
Communicator

Hi

I am using the MLTK.
I have a question about the use case "Detect Numeric Outliers", specifically line #4.
Why is it important when detecting outliers? I have plotted 2 graphs. Graph 1 uses line #4 and Graph 2 does not.

To me, Graph 2 seems more accurate because it shows the forecast (future_timespan=172) from 30 Nov to 4 Dec, whereas Graph 1 drops those days (it only goes up to 30 Nov).

1.  | inputlookup logins.csv 
2.  | predict logins as prediction algorithm=LLP future_timespan=172 holdback=36
3.  | eval residual = prediction - logins
4.  | where prediction!="" AND logins!="" 
5.  | table _time, logins prediction residual

Graph 1, USING `where prediction!="" AND logins!=""`: [graph image]

Graph 2, WITHOUT `where prediction!="" AND logins!=""`: [graph image]

sandeepmakkena
Contributor

You can't really tell which graph is more accurate based on the forecast timespan.

The only reason the first graph does not show the forecast data is `| where prediction!="" AND logins!=""`: logins will always be null in the future.

What `| where prediction!="" AND logins!=""` really does is drop every row where logins or prediction is null. I'm not sure that's what you wanted.
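
If the goal is to keep the forecast rows while still computing a residual only where both values exist, a sketch along these lines might work. It reuses the search from the question; the explicit `if(...)` guard is my own addition, and I have not run it against `logins.csv`:

```
| inputlookup logins.csv
| predict logins as prediction algorithm=LLP future_timespan=172 holdback=36
| eval residual = if(isnotnull(logins) AND isnotnull(prediction), prediction - logins, null())
| table _time, logins, prediction, residual
```

Rows in the forecast window keep their `prediction` and simply have an empty `residual`, so the chart should still extend past the last observed day.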

Hope this helps. Thanks!

rosho
Communicator

Hi

Yes, that line is there to avoid nulls. And it should be in position 3, NOT 4, because "eval" does not work if there are null values. I have used it that way in another use case.
In this use case, I prefer to remove it because I want to see the forecast.

 1.  | inputlookup logins.csv 
 2.  | predict logins as prediction algorithm=LLP future_timespan=172 holdback=36
 3.  | where prediction!="" AND logins!="" 
 4.  | eval residual = prediction - logins
 5.  | table _time, logins prediction residual
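
A quick way to check how `eval` treats a null operand is a throwaway search like the one below. It is only a sketch: the values are made up, and `residual_is_null` is a hypothetical helper field. As far as I know, `eval` does not fail on a null operand and just leaves `residual` null, in which case placing the `where` before or after the `eval` should return the same rows.

```
| makeresults
| eval prediction = 10, logins = null()
| eval residual = prediction - logins
| eval residual_is_null = if(isnull(residual), "yes", "no")
| table prediction, logins, residual, residual_is_null
```
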

sandeepmakkena
Contributor

Cool! Thanks
