Splunk Machine Learning Toolkit: Density Function Algorithm - What is considered as "anomaly"?

sc2019
New Member

In the output of the DensityFunction algorithm, is an anomaly simply data that departs from “normalcy”?

For example, if the historical response time (for some web server, say) is 500 milliseconds, would a response of 100 milliseconds be considered an anomaly? Well, 100 milliseconds is “better” than 500 milliseconds, is it not? Sure, it’s different from 500 milliseconds, but it’s better because it’s faster. In other words, is there a “mechanism” in Splunk that precludes tagging that ‘pesky’ 100 milliseconds as an ‘anomalous’ event? Sure-sure, 5,000 milliseconds is a bona fide anomaly.

amalekpour_splu
Splunk Employee

The "normalcy" of a value is determined by the likelihood of that value occurring according to your training data (past observations). For example, if 98% of your requests see a response time between 1000ms and 5000ms (according to your training data) and nothing slower than 5000ms was observed, then a response time between 0ms and 1000ms is only 2% likely, so it might (see below) be marked as anomalous.
Now, the threshold parameter is key here. When you set threshold to, say, 0.05, you're telling the algorithm that you think the 5% least likely data points are anomalous. Notice that we're talking about probabilities and not actual values, so it makes no difference whether an unlikely value is "better" or "worse" than the usual one. In the above example, if you set threshold=0.05, then a latency of 500ms is anomalous (because, remember, any value between 0ms and 1000ms is only 2% likely to occur, according to your training data and the statistical model that DensityFunction created for you).
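
To make the idea concrete, here is a minimal Python sketch of likelihood-based thresholding. It is not how DensityFunction is implemented. It assumes the training data is roughly normally distributed, and the names fit_normal and is_anomalous are made up for illustration. A value is flagged when it falls in the least likely tails of the fitted distribution, with the threshold split evenly between the low and high tails.

```python
from statistics import NormalDist


def fit_normal(training_values):
    """Fit a normal distribution to past observations (an assumed model)."""
    return NormalDist.from_samples(training_values)


def is_anomalous(model, value, threshold=0.05):
    """Flag values that land in the least likely `threshold` share of the model.

    The threshold is split evenly between both tails, so unusually low
    values are flagged just like unusually high ones.
    """
    tail = threshold / 2
    p = model.cdf(value)  # probability of seeing a value <= `value`
    return p < tail or p > 1 - tail


# Hypothetical training data: response times clustered around 500 ms.
training_ms = [480, 495, 500, 505, 510, 520, 490, 500, 515, 485]
model = fit_normal(training_ms)

for latency in (100, 500, 5000):
    print(latency, is_anomalous(model, latency))
```

With this made-up training data, both 100ms and 5000ms are flagged while 500ms is not: the check only asks how likely a value is, not whether it is faster or slower than usual.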
