Splunk Machine Learning App / Toolkit - Using DBSCAN Clustering Algorithm

Path Finder


I want to use the DBSCAN clustering algorithm from the Machine Learning Toolkit
(listed under "clustering algorithms").

When I implemented it, I noticed that this algorithm takes only one parameter: EPS
(the maximum distance between two samples for them to be considered in the same cluster).

If you look up any definition of the DBSCAN algorithm, you will notice that it needs two parameters to be functional:

  • EPS (epsilon): maximum distance between two samples --> provided
  • minPTS: minimum number of samples required to form a cluster --> missing
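
To make the role of the two parameters concrete, here is a minimal pure-Python sketch of DBSCAN. It is an illustration under stated assumptions, not the Machine Learning Toolkit's or scikit-learn's implementation: the function name `dbscan`, the sample points, and the choice to count a point as its own neighbour (as scikit-learn does) are all mine.

```python
from math import dist  # Python 3.8+

NOISE = -1


def dbscan(points, eps, min_samples):
    """Return one cluster label per point; -1 marks noise."""
    n = len(points)
    # Neighbourhood of each point: every point within eps, itself included.
    neighbours = [
        [j for j in range(n) if dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    # A core point has at least min_samples points in its neighbourhood.
    is_core = [len(nb) >= min_samples for nb in neighbours]

    labels = [NOISE] * n
    cluster = 0
    for i in range(n):
        if labels[i] != NOISE or not is_core[i]:
            continue
        # Grow a new cluster outwards from this unlabelled core point.
        labels[i] = cluster
        stack = [i]
        while stack:
            current = stack.pop()
            if not is_core[current]:
                continue  # border point: joins the cluster, does not expand it
            for j in neighbours[current]:
                if labels[j] == NOISE:
                    labels[j] = cluster
                    stack.append(j)
        cluster += 1
    return labels


# Hypothetical sample data: a group of three, a pair, and a lone point.
points = [(0.0, 0.0), (0.0, 0.1), (0.1, 0.0), (5.0, 5.0), (5.0, 5.1), (9.0, 9.0)]
labels_2 = dbscan(points, eps=0.2, min_samples=2)
labels_3 = dbscan(points, eps=0.2, min_samples=3)
print(labels_2)
print(labels_3)
```

With the same EPS, raising the minimum from 2 to 3 turns the pair into noise, which is why the second parameter matters for the result.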

Does anybody know why the second parameter is missing?
I don't get how this algorithm can be functional without it.

Path Finder

You need to modify the $SPLUNK_HOME/etc/apps/Splunk_ML_Toolkit/bin/algos/ file. In the __init__ function, replace the line

out_params = convert_params(options.get('params', {}), floats=['eps'])

with this one:

out_params = convert_params(options.get('params', {}), floats=['eps', 'min_samples'])

After this, you can write something like fit DBSCAN eps=0.1 min_samples=2 * in your SPL queries.


@hbrandt84, I concur: scikit-learn also documents two parameters, min_samples and eps.

However, the algorithm description and class details mention that these parameters are optional.

Based on the following code for the DBSCAN algorithm, I would expect the default value at initialization to be min_samples=5:

# Function-level signature in scikit-learn (body omitted):
def dbscan(X, eps=0.5, min_samples=5, metric='minkowski',
           algorithm='auto', leaf_size=30, p=2, sample_weight=None, n_jobs=1):
    ...

# Constructor of the DBSCAN estimator class:
def __init__(self, eps=0.5, min_samples=5, metric='euclidean',
             algorithm='auto', leaf_size=30, p=None, n_jobs=1):
    self.eps = eps
    self.min_samples = min_samples
    self.metric = metric
    self.algorithm = algorithm
    self.leaf_size = leaf_size
    self.p = p
    self.n_jobs = n_jobs

However, this needs to be confirmed, and the Machine Learning Toolkit could possibly be enhanced to expose min_samples as an input parameter for DBSCAN.
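
One way to check the default yourself is to ask the estimator for its parameters. This is only a sketch: it assumes scikit-learn can be imported in the Python environment you run it in, which is not necessarily the one the Machine Learning Toolkit uses, and the helper name `default_min_samples` is hypothetical.

```python
def default_min_samples():
    """Return scikit-learn's default min_samples, or None if it is not installed."""
    try:
        from sklearn.cluster import DBSCAN
    except ImportError:
        return None
    # get_params() reports the constructor arguments, including defaults.
    return DBSCAN().get_params()["min_samples"]


print(default_min_samples())
```

If the printed value matches the signature quoted above, DBSCAN falls back to that default whenever min_samples is not passed explicitly.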
