
When forecasting, is it better to remove the outliers or just to transform them?

rosho
Communicator

Hi

I am forecasting the number of logins. I have a dataset with the number of logins for each hour.
First, I use LOF (local outlier factor) to find the outliers, and then I remove them.
Second, I use a Kalman filter to forecast.

But as you can see in the plot, the prediction (blue) is displaced in time from the logins (red) and also from the future confidence interval (green).

So, would it be better to transform the outliers (maybe make them "normal") rather than removing them completely? Right now I have some gaps in time. For example, on the first Monday I have logins for each hour from 00h to 15h and then from 18h until the next day, so there is a gap of 2 hours.

Thank you

[Plot: logins (red), prediction (blue), future confidence interval (green)]


niketn
Legend

@rosho you would need to test several outputs with your data to decide whether to include the outliers or not. Refer to the Splunk blog post Ensuring Success with Splunk ITSI Adaptive Thresholding Part 3, which mentions that Quantile is fairly resistant to very large outliers.

With the predict command, try it on older days using holdback to check whether the predicted and actual values line up.
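The holdback idea can be sketched outside Splunk as well: hide the last N points, forecast them from the earlier data, and measure how far the forecast is from what actually happened. The snippet below is only an illustration in plain Python. The `seasonal_naive` forecaster, the `holdback_mae` helper, and the login counts are all made up for the example; they are not what the predict command or a Kalman filter does internally.

```python
from statistics import mean


def seasonal_naive(history, steps, period=24):
    """Stand-in forecaster (an assumption, not Splunk's algorithm):
    repeat the value observed one full period earlier."""
    extended = list(history)
    forecast = []
    for _ in range(steps):
        nxt = extended[-period]
        forecast.append(nxt)
        extended.append(nxt)
    return forecast


def holdback_mae(series, holdback, forecast_fn):
    """Hide the last `holdback` points, forecast them from the rest,
    and return the mean absolute error against the hidden values."""
    if not 0 < holdback < len(series):
        raise ValueError("holdback must be between 1 and len(series) - 1")
    train, actual = series[:-holdback], series[-holdback:]
    predicted = forecast_fn(train, holdback)
    return mean(abs(p - a) for p, a in zip(predicted, actual))


# Hypothetical hourly login counts: one daily pattern repeated for three days.
day = [5, 3, 2, 2, 3, 8, 20, 45, 80, 95, 90, 85,
       70, 88, 92, 86, 75, 60, 40, 30, 22, 15, 10, 7]
logins = day * 3

# Hold back the last 24 hours and compare forecast against actuals.
error = holdback_mae(logins, holdback=24, forecast_fn=seasonal_naive)
print(error)
```

Running the same check once with the outliers kept and once with them handled differently gives a like-for-like number to compare, instead of judging alignment from the plot alone.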

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"

DavidHourani
Super Champion

Hi @rosho,

That's a good question...as with anything in ML you gotta test and pick what suits your use-case best.

Including outliers for logins might ruin all the "normal" data that you should use for predictions, but again some outliers are recurring and could be used for predictions to avoid false positives.
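One way to "transform" rather than remove is to keep every hourly slot and overwrite the flagged values with something typical for that hour, so the series has no time gaps. The sketch below is just one possible approach, assuming you already have a true/false outlier flag per hour from whatever detector you use (LOF in the question). The `replace_outliers` helper, the choice of the median for the same hour of day, and the sample numbers are assumptions made for illustration.

```python
from statistics import median


def replace_outliers(values, is_outlier, period=24):
    """Return a copy of `values` in which each flagged point is replaced by
    the median of the non-flagged points at the same position in the cycle
    (for hourly data with period=24: the same hour on other days)."""
    if len(values) != len(is_outlier):
        raise ValueError("values and is_outlier must have the same length")
    cleaned = list(values)
    for i, flagged in enumerate(is_outlier):
        if not flagged:
            continue
        peers = [values[j]
                 for j in range(i % period, len(values), period)
                 if not is_outlier[j]]
        if peers:
            cleaned[i] = median(peers)
        # If no clean peer exists, the original value is left in place.
    return cleaned


# Hypothetical hourly login counts: one daily pattern repeated for three days.
day = [5, 3, 2, 2, 3, 8, 20, 45, 80, 95, 90, 85,
       70, 88, 92, 86, 75, 60, 40, 30, 22, 15, 10, 7]
logins = day * 3

# Inject two spikes at 16h and 17h on the first day and flag them.
logins[16], logins[17] = 900, 850
flags = [False] * len(logins)
flags[16] = flags[17] = True

cleaned = replace_outliers(logins, flags)
print(len(cleaned), cleaned[16], cleaned[17])
```

The series keeps its full length, so the forecaster sees a regular hourly grid. Whether this helps or hurts depends on the data, especially if some of the flagged points are recurring events you actually want the forecast to learn.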

What happens if you include the outliers? Are the results heavily affected?

Cheers,
David
