All Apps and Add-ons

Splunk Machine Learning App / Toolkit - Using DBSCAN Clustering Algorithm

Path Finder


I want to use the Clustering Algorithm "DBSCAN" from the Machine Learning Toolkit.
( --> listed under "clustering algorithms"

Now, upon implementation, I noticed, that this algorithm only needs one parameter: EPS
(maximum distance between two samples for them to be considered in the same cluster)

Now if you look up any definition of the DBSCAN Algorithm, for example...
( will notice that a DBSCAN algorithm will need 2 Parameters to be functional:

  • EPS (Epsilon): maximum distance between two samples --> provided
  • minPTS: minimum occurences of samples within a cluster --> missing

Does anybody know, why the second Parameter ist missing?
I Don't get how this algorithm can be functional....

Path Finder

You need to modify $SPLUNK_HOME/etc/apps/Splunk_ML_Toolkit/bin/algos/ file. In __init__ function replace string

out_params = convert_params(options.get('params', {}), floats=['eps'])

with this one:

out_params = convert_params(options.get('params', {}), floats=['eps', 'min_samples'])

After this you can write something like fit DBSCAN eps=0.1 min_samples=2 * in your SPL queries.

0 Karma


@hbrandt84, I concur, scikit learn also mentions two parameters i.e. min_samples and eps (

However, algorithm description and class detail mention that these parameters are optional:

Based on the following code for DBSCAN algorithm, I would expect that initialization default value is min_samples=5 (

def dbscan(X, eps=0.5, min_samples=5, metric='minkowski',
           algorithm='auto', leaf_size=30, p=2, sample_weight=None, n_jobs=1):


def __init__(self, eps=0.5, min_samples=5, metric='euclidean',
             algorithm='auto', leaf_size=30, p=None, n_jobs=1):
    self.eps = eps
    self.min_samples = min_samples
    self.metric = metric
    self.algorithm = algorithm
    self.leaf_size = leaf_size
    self.p = p
    self.n_jobs = n_jobs

However, this needs to be confirmed and possibly enhanced in Machine Learning Toolkit to create a min_samples input parameter for DBSCAN.

| makeresults | eval message= "Happy Splunking!!!"
0 Karma
Get Updates on the Splunk Community!

Index This | A sphere has three, a circle has two, and a point has zero. What is it?

September 2023 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...

Build Scalable Security While Moving to Cloud - Guide From Clayton Homes

 Clayton Homes faced the increased challenge of strengthening their security posture as they went through ...

Mission Control | Explore the latest release of Splunk Mission Control (2.3)

We’re happy to announce the release of Mission Control 2.3 which includes several new and exciting features ...