What does `| where prediction!="" AND logins!=""` do when detecting outliers?

rosho
Communicator

Hi

I am using the MLTK.
I have a question about the use case "Detect Numeric Outliers", specifically line #4.
Why is it important when detecting outliers? I have plotted 2 graphs. Graph 1 uses line #4 and Graph 2 does not.

To me, Graph 2 seems the more accurate one because it shows the forecast (future_timespan=172) from 30 Nov to 4 Dec. The other one just eliminates those days (it only shows data up to 30 Nov).

1.  | inputlookup logins.csv 
2.  | predict logins as prediction algorithm=LLP future_timespan=172 holdback=36
3.  | eval residual = prediction - logins
4.  | where prediction!="" AND logins!="" 
5.  | table _time, logins prediction residual

[Graph 1: USING `where prediction!="" AND logins!=""`]

[Graph 2: WITHOUT `where prediction!="" AND logins!=""`]


sandeepmakkena
Contributor

You can't really tell which graph is more accurate based on the forecast timespan alone.

The only reason the first graph does not show the forecast data is the line `| where prediction!="" AND logins!=""`: logins will always be null in the future.

What `| where prediction!="" AND logins!=""` really does is eliminate every row where logins or prediction is null. I'm not sure that's what you wanted.
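For illustration, here is a minimal sketch of the same search with the null check written out explicitly, assuming the field names from the question. `isnotnull()` is a standard eval/where function. This version should drop the same forecast rows as line #4, because logins has no value for future timestamps.

```spl
| inputlookup logins.csv
| predict logins as prediction algorithm=LLP future_timespan=172 holdback=36
| where isnotnull(prediction) AND isnotnull(logins)
| eval residual = prediction - logins
| table _time, logins, prediction, residual
```

Writing the condition this way makes it clear that the line is a null filter and has nothing to do with the outlier calculation itself.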

Hope this helps. Thanks!

rosho
Communicator

Hi

Yes, that line is there to avoid nulls. And it should be in position 3, not 4, because "eval" does not function if there are null values. I have used it in other use cases.
In this use case, I prefer to remove it because I want to see the forecast.

 1.  | inputlookup logins.csv 
 2.  | predict logins as prediction algorithm=LLP future_timespan=172 holdback=36
 3.  | where prediction!="" AND logins!="" 
 4.  | eval residual = prediction - logins
 5.  | table _time, logins prediction residual
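
If the goal is to keep the forecast rows and still get residuals for the historical rows, one possible sketch is below. It assumes the same lookup and field names as above. `is_forecast` is a hypothetical helper field introduced only for this example, and the `if()` guard simply makes the null handling explicit.

```spl
| inputlookup logins.csv
| predict logins as prediction algorithm=LLP future_timespan=172 holdback=36
| eval residual = if(isnotnull(logins) AND isnotnull(prediction), prediction - logins, null())
| eval is_forecast = if(isnull(logins), 1, 0)
| table _time, logins, prediction, residual, is_forecast
```

With this variant no rows are dropped, so the chart still extends into the forecast period, and rows with `is_forecast=1` can be excluded later when computing outlier thresholds.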

sandeepmakkena
Contributor

Cool! Thanks
