Splunk Search

ML query to detect categorical outliers per field per day, compared with other days' statistics

Janani_Krish
Path Finder

Hello All,

I am trying to find categorical outliers among all the emails sent from our environment, based on each sender's email count per day. My query is as follows:

sourcetype="source" earliest=-2d latest=now() SenderAddress="*@mydomain.com" RecipientAddress!="*@mydomain.com"|timechart span=3d count by SenderAddress limit=0 | anomalydetection  "example@mydomain.com"  action=annotate | eval isOutlier = if(probable_cause != "", "1", "0") | table "example@mydomain.com", probable_cause, isOutlier | sort 100000 probable_cause

Since I have an unbounded number of email addresses, I have set limit=0 in timechart. However, I am unable to detect outliers for all the email addresses unless I list them explicitly, like anomalydetection "example@mydomain.com" example2@mydomain.com. I have tried something like the following:

sourcetype="source" earliest=-2d latest=now() SenderAddress="*@mydomain.com" RecipientAddress!="*@mydomain.com"|timechart span=3d count by SenderAddress limit=0 | anomalydetection  "[search sourcetype="source" earliest=-30d latest=now() SenderAddress="*@mydomain.com" RecipientAddress!="*@mydomain.com"|rename SenderAddress as search|table search|format] action=annotate | eval isOutlier = if(probable_cause != "", "1", "0") | table "example@mydomain.com", probable_cause, isOutlier | sort 100000 probable_cause

But the above does not work. I have also tried the stats command, as below, but it detects overall outliers across the last 3 days and does not compare each email address's count day by day:

sourcetype="source" earliest=-2d latest=now() SenderAddress="*@mydomain.com" RecipientAddress!="*@mydomain.com"|bin _time span=1d|stats count by SenderAddress,_time | anomalydetection "SenderAddress" "count" action=annotate | eval isOutlier = if(probable_cause != "", "1", "0") | table "SenderAddress" "count", probable_cause, isOutlier | sort 100000 probable_cause

Please suggest a way to detect categorical outliers in the number of emails sent per email address per day, compared with previous days. For example, in the data below:

1@mydomain.com    9th Sep 20     34 emails
                  10th Sep 20    100 emails
3@mydomain.com    9th Sep 20     45 emails
                  15th Sep 20    37 emails

 

It should flag 1@mydomain.com on 10th Sep as an outlier, because it sent far more emails that day than on its previous day.
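
To make the comparison I am after concrete, here is a rough sketch of the per-sender, day-over-day logic. It uses a plain threshold instead of anomalydetection, so it is not the ML approach itself. The 30-day window, the field name prev_count and the "more than 2x the previous day" rule are assumptions I picked only for illustration.

```spl
sourcetype="source" earliest=-30d latest=now SenderAddress="*@mydomain.com" RecipientAddress!="*@mydomain.com"
| bin _time span=1d
| stats count by SenderAddress, _time
| sort 0 SenderAddress _time
| streamstats current=f window=1 last(count) as prev_count by SenderAddress
| eval isOutlier = if(isnotnull(prev_count) AND count > 2 * prev_count, 1, 0)
| table SenderAddress, _time, count, prev_count, isOutlier
```

Here prev_count is the count from the same sender's previous day that has data, not necessarily the previous calendar day. With the sample data above, this would mark 1@mydomain.com on 10th Sep (100 versus 34) and leave 3@mydomain.com unmarked (37 versus 45). What I am looking for is something equivalent that works per SenderAddress without listing every address by hand.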

 

 

 
