
Splunk Machine Learning Toolkit: Density Function Algorithm - What is considered an "anomaly"?

sc2019
New Member

In the output of the DensityFunction algorithm, is an anomaly any data point that departs from “normalcy”?

For example, if the historical response time (for some web server, say) is 500 milliseconds, would a response of 100 milliseconds be considered an anomaly? 100 milliseconds is “better” than 500 milliseconds, is it not? Sure, it’s different from 500 milliseconds, but it’s better because it’s faster. In other words, is there a “mechanism” in Splunk that prevents that ‘pesky’ 100 milliseconds from being tagged as an ‘anomalous’ event? Of course, 5,000 milliseconds would be a bona fide anomaly.


amalekpour_splu
Splunk Employee

The "normalcy" of a value is determined by how likely that value is to occur according to your training data (past observations). For example, if 98% of the requests in your training data have a response time between 1000ms and 5000ms, then a response time between 0ms and 1000ms is at most 2% likely, so it might be marked as anomalous, depending on the threshold.
The threshold parameter is key here. When you set threshold to, say, 0.05, you are telling the algorithm that you consider the 5% least likely data points anomalous. Notice that this is about probabilities, not actual values: the algorithm does not know that a faster response is "better", only that it is rare. So, in the example above, with threshold=0.05 a latency of 500ms is anomalous, because any value between 0ms and 1000ms is at most 2% likely to occur according to your training data and the statistical model that DensityFunction built from it.
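
To make the idea concrete, here is a minimal Python sketch of a probability-based threshold. It is not the DensityFunction implementation: it assumes the training data is roughly normally distributed, and the sample values and the helper names `fit_boundaries` and `is_outlier` are made up for illustration. The `slow_only` check at the end is likewise just one possible way to ignore "good" outliers after the fact, not a built-in option.

```python
from statistics import NormalDist


def fit_boundaries(training_values, threshold):
    """Fit a normal distribution and return (low, high) cut-offs.

    The two tails outside the cut-offs together hold `threshold` of the
    probability mass. For a symmetric, single-peaked distribution these
    tails are exactly the least likely values.
    """
    dist = NormalDist.from_samples(training_values)
    low = dist.inv_cdf(threshold / 2)
    high = dist.inv_cdf(1 - threshold / 2)
    return low, high


def is_outlier(value, low, high):
    """A value is an outlier if it falls in either low-probability tail."""
    return value < low or value > high


# Hypothetical training data: response times in milliseconds.
training = [1500, 2000, 2500, 2800, 3000, 3000, 3200, 3500, 4000, 4500]
low, high = fit_boundaries(training, threshold=0.05)

for latency in (500, 3000, 9000):
    flagged = is_outlier(latency, low, high)
    # Assumed post-filter: only treat outliers on the slow side as a problem.
    slow_only = flagged and latency > high
    print(latency, flagged, slow_only)
```

With this sample data, both 500ms and 9000ms fall outside the boundaries and are flagged, while 3000ms is not. The model treats the fast and the slow value the same way because both are unlikely; deciding that only the slow one matters is a separate step applied to the result.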
