Alerting

Machine learning outliers

rrovers
Communicator

I want to use the machine learning toolkit to detect outliers. 

I've made a query with earliest=-2mon@mon latest=@mon so that Splunk determines the outlier thresholds over that period. I want to run the search every day and have the alert send an email when a new outlier has been detected since the last run.

I can't figure out how to do this. Every time the search runs, it detects all outliers of the last 2 months.


rrovers
Communicator

Thank you for the information. Your answer is quite extensive and probably useful for learning more about machine learning.

What I want to know is whether it is possible to let machine learning determine the lower bound and upper bound over a long period (for example 2 months, or maybe even 1 year) and run the search every day as an alert that only gives me the new outliers (those since the previous day).


to4kawa
Ultra Champion

https://docs.splunk.com/Documentation/MLApp/5.2.0/User/DNOlegacyassist


I don't think we need to get too hung up on machine learning.


rrovers
Communicator

The functionality seems the same as what I used for my alert. The result is not what I'm looking for.

Let me try to clarify it with an example.

In the machine learning app I created an experiment with this simple search to use as an example:

index=_internal sourcetype="splunkd_remote_searches" earliest=-1w@w latest=now
| eval day=strftime(_time,"%Y-%m-%d")
| stats count by day 

1 outlier is detected.

After saving this experiment, I created an alert from the overview screen (manage - create alert).

My goal is to use the period since last week to determine the lower and upper bounds, but only receive an alert when there are new outliers since the last run. Right now, every run over the last week detects the same outlier again.

I wonder whether what I want is possible.

 


to4kawa
Ultra Champion

Use outputcsv, then build your query against the CSV.

rrovers
Communicator

Hi, thanks for your answer. I gave it a try, but I'm missing some information.

Can you please explain a bit more?


rrovers
Communicator

I think I solved it by joining 2 searches.

The first one determines the lower bound and the upper bound over a long period (the last 2 months).

The second one checks whether the event count of the last day is less than the lower bound or more than the upper bound determined by the first one.
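
For anyone reading along, here is a minimal sketch of what such a combined search could look like. It is not the exact query I use. It assumes the example index and sourcetype from earlier in this thread, a baseline that ends before the day being tested, and an interquartile-range rule (median plus or minus 1.5 times the IQR) for the bounds; the field names today_count, daily_count, med, p25, p75 and iqr are only illustrative. Swap in whatever threshold method your experiment uses.

```
index=_internal sourcetype="splunkd_remote_searches" earliest=-1d@d latest=@d
| stats count AS today_count
| appendcols
    [ search index=_internal sourcetype="splunkd_remote_searches" earliest=-2mon@d latest=-1d@d
      | bin _time span=1d
      | stats count AS daily_count BY _time
      | stats median(daily_count) AS med, perc25(daily_count) AS p25, perc75(daily_count) AS p75
      | eval iqr = p75 - p25
      | eval lowerBound = med - 1.5 * iqr, upperBound = med + 1.5 * iqr
      | fields lowerBound upperBound ]
| where today_count < lowerBound OR today_count > upperBound
```

Scheduled once a day with the alert condition "number of results is greater than 0", this should only fire for yesterday's count. Keep in mind that subsearches have runtime and result limits, so a very long baseline may need a different approach.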


2savage
Engager

Aloha @rrovers ,

I think what @to4kawa is saying you should do is to create a lookup file and output your results to that lookup. For example, you would output _time, machinename, and whatever field you believe is valuable using | outputcsv. In turn, you can query the lookup file each day to remove the previous days' outliers. 

I am also going to be experimenting with machine learning and looking to build profiles for users and computers, probably with login patterns at first. Using outputcsv would be a good way to keep track of results, though I'm sure there are other ways to do it. 
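
A rough sketch of the lookup idea, under a few assumptions: it uses outputlookup and the lookup command rather than outputcsv, the file name reported_outliers.csv is hypothetical, and the file is assumed to exist already with at least day and count columns. The bounds use a simple interquartile-range rule as a stand-in for whatever the experiment computes, and latest=@d (instead of latest=now in the example above) keeps a partial day from being flagged.

```
index=_internal sourcetype="splunkd_remote_searches" earliest=-1w@w latest=@d
| eval day=strftime(_time,"%Y-%m-%d")
| stats count by day
| eventstats median(count) AS med, perc25(count) AS p25, perc75(count) AS p75
| eval lowerBound = med - 1.5 * (p75 - p25), upperBound = med + 1.5 * (p75 - p25)
| where count < lowerBound OR count > upperBound
| lookup reported_outliers.csv day OUTPUT count AS reported_count
| where isnull(reported_count)
| fields day count lowerBound upperBound
| outputlookup append=true reported_outliers.csv
```

Days already in the lookup are dropped before the alert condition is evaluated, and whatever remains is appended so it is not reported again on the next run.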

 

rrovers
Communicator

Hi @2savage,

I solved it by collecting the daily results to a summary index (I prefer summary indexes over lookups for this kind of functionality).

I don't think this is part of machine learning but it works fine for me.
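
In case it helps, this is roughly the shape of it, as a sketch rather than my exact searches. The summary index name daily_counts_summary is hypothetical (the index has to be created first), and the interquartile-range bounds are an assumption. A first scheduled search runs once a day and stores yesterday's count:

```
index=_internal sourcetype="splunkd_remote_searches" earliest=-1d@d latest=@d
| bin _time span=1d
| stats count AS daily_count BY _time
| collect index=daily_counts_summary
```

The alert search then reads the stored daily counts, computes the bounds over the long period, and keeps only yesterday's row if it falls outside them:

```
index=daily_counts_summary earliest=-2mon@d latest=@d
| eventstats median(daily_count) AS med, perc25(daily_count) AS p25, perc75(daily_count) AS p75
| eval lowerBound = med - 1.5 * (p75 - p25), upperBound = med + 1.5 * (p75 - p25)
| where _time >= relative_time(now(), "-1d@d") AND (daily_count < lowerBound OR daily_count > upperBound)
```

It is worth checking that the collected events carry the timestamp of the day they describe, because the final where clause depends on it.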
