All Apps and Add-ons

Splunk Machine Learning Toolkit: Density Function Algorithm - What is considered as "anomaly"?

sc2019
New Member

In the output of density function algorithm, Is an anomaly is data which depart from “normalcy”?

For example, if historical response time (for some web sever, say) is 500 milliseconds, then, is it true that a response of 100 milliseconds be considered an anomaly? Well, 100 milliseconds is “better” than 500 milliseconds, is it not? Sure. It’s different than 500 milliseconds, but it’s better because it’s faster. In other words, is there a “mechanism” in Splunk which precludes tagging that ‘pesky’ 100 milliseconds as ‘anomalous’ event? Sure-sure, 5,000 milliseconds is a bona-fide anomalous.

0 Karma

amalekpour_splu
Splunk Employee
Splunk Employee

The "normalcy" of a value is determined by the likelihood of that value occurring according to your training data (past observations). For example, if 98% of your requests see a response time between 1000ms and 5000ms (according to your training data) then a response time between 0ms and 1000ms is only 2% likely, so it might (see below) be marked as anomalous.
Now, the parameter threshold is key here. When you set threshold to say, 0.05 you're telling the algorithm that you think the 5% least likely data points are anomalous. Notice how we're talking about probabilities and not actual values. So, in the above example if you set threshold=0.05 then a latency of 500ms is anomalous (because, remember, any value between 0ms and 1000ms is only 2% likely to occur, according to your training data and the statistical model that DensityFunction created for you).

0 Karma
Career Survey
First 500 qualified respondents will receive a $20 gift card! Tell us about your professional Splunk journey.
Get Updates on the Splunk Community!

Introduction to Splunk AI

How are you using AI in Splunk? Whether you see AI as a threat or opportunity, AI is here to stay. Lucky for ...

Splunk + ThousandEyes: Correlate frontend, app, and network data to troubleshoot ...

Are you tired of troubleshooting delays caused by siloed frontend, application, and network data? We've got a ...

Maximizing the Value of Splunk ES 8.x

Splunk Enterprise Security (ES) continues to be a leader in the Gartner Magic Quadrant, reflecting its pivotal ...