Splunk Search

ML query to detect categorial outlier per field per day comparing with other day statistics

Janani_Krish
Path Finder

Hello All,

I am trying to find categorial outlier for all the emails sent from our environment with respect to its count per day. My query is as follows,

sourcetype="source" earliest=-2d latest=now() SenderAddress="*@mydomain.com" RecipientAddress!="*@mydomain.com"|timechart span=3d count by SenderAddress limit=0 | anomalydetection  "example@mydomain.com"  action=annotate | eval isOutlier = if(probable_cause != "", "1", "0") | table "example@mydomain.com", probable_cause, isOutlier | sort 100000 probable_cause

But since I am having unlimited email addresses I have given limit=0 in timechart but unable to detect outliers for all the email address unless I specify them like anomalydetection  "example@mydomain.com" example2@mydomain.com  . I have tried something like below,

sourcetype="source" earliest=-2d latest=now() SenderAddress="*@mydomain.com" RecipientAddress!="*@mydomain.com"|timechart span=3d count by SenderAddress limit=0 | anomalydetection  "[search sourcetype="source" earliest=-30d latest=now() SenderAddress="*@mydomain.com" RecipientAddress!="*@mydomain.com"|rename SenderAddress as search|table search|format] action=annotate | eval isOutlier = if(probable_cause != "", "1", "0") | table "example@mydomain.com", probable_cause, isOutlier | sort 100000 probable_cause

But the above is not working. I have also tried with stats command as below but it is detecting overall outlier for last 3 days and is not comparing with specific email address per day,

sourcetype="source" earliest=-2d latest=now() SenderAddress="*@mydomain.com" RecipientAddress!="*@mydomain.com"|bin _time span=1d|stats count by SenderAddress,_time | anomalydetection "SenderAddress" "count" action=annotate | eval isOutlier = if(probable_cause != "", "1", "0") | table "SenderAddress" "count", probable_cause, isOutlier | sort 100000 probable_cause

Please do suggest a way where I can detect categorial outlier for emails sent per email address per day comparing with previous days. For example in the below data,

1@mydomain.com 9th sep 20 34emails

                                        10th sep 20 100emails

3@mydomain.com 9th sep 20 45 emails

                                        15th sep 20 37emails

 

It has to detect 1@mydomain.com on 10th sep as outlier, because comparing with previous day it has sent many emails. 

 

 

 

Labels (2)
0 Karma
Get Updates on the Splunk Community!

Splunk Observability Cloud | Unified Identity - Now Available for Existing Splunk ...

Raise your hand if you’ve already forgotten your username or password when logging into an account. (We can’t ...

Index This | How many sides does a circle have?

February 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...

Registration for Splunk University is Now Open!

Are you ready for an adventure in learning?   Brace yourselves because Splunk University is back, and it's ...