About Nils

Nils · 04-15-2021

Hi!

I have a data set consisting of a CSV file with three columns of numerical data. I have written my own implementation that clusters the data set with K-means and then identifies outliers based on the Euclidean distance between each data point and its cluster centroid. I want to perform the same kind of operation in Splunk but have not been successful so far.

I have tried Local Outlier Factor with the following search query:

source="dataset.csv" | fit LocalOutlierFactor 0,1,2 | search isOutlier="1.0"

However, the results are very poor: very few of the outliers are detected. The data set is labeled, which makes it easy to check which outliers are classified correctly.

I have also tried "Detect Numeric Outliers" from the Machine Learning Toolkit, but there I can only choose one field to analyze, and I have three fields.

Is there an optimal solution to the problem of finding outliers in this type of data set? Thanks in advance!
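
For readers who want to reproduce the baseline outside Splunk, the K-means approach described in the post could look roughly like the Python sketch below. This is not the poster's actual implementation, which was not shared. The function names (`kmeans_centroids`, `distance_outliers`), the evenly spaced centroid initialisation, and the cutoff rule (mean distance plus `n_sigma` standard deviations) are all assumptions made for illustration.

```python
import math
import statistics


def kmeans_centroids(points, k, n_iter=100):
    """Plain Lloyd's K-means; returns the final centroids as tuples."""
    n = len(points)
    # Assumption: deterministic start from k evenly spaced rows of the data.
    centroids = [tuple(points[i * n // k]) for i in range(k)]
    for _ in range(n_iter):
        # Assign every point to its nearest centroid (Euclidean distance).
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Recompute each centroid as the mean of its assigned points.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                # Keep the old centroid if a cluster ends up empty.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids


def distance_outliers(points, k, n_sigma=3.0):
    """Flag points whose distance to the nearest centroid is unusually large.

    Assumption: "unusually large" means more than n_sigma population
    standard deviations above the mean distance over the whole data set.
    """
    centroids = kmeans_centroids(points, k)
    dists = [min(math.dist(p, c) for c in centroids) for p in points]
    cutoff = statistics.mean(dists) + n_sigma * statistics.pstdev(dists)
    return [d > cutoff for d in dists], dists
```

Calling `distance_outliers(rows, k)` on a list of three-value rows returns one boolean flag per row together with the distances, so the flags can be compared directly against the labels in the data set. The cutoff is the main tuning knob: a percentile of the distances or a per-cluster threshold would be equally reasonable choices.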

Member Since	04-15-2021


Outlier detection
