How to use results of one data set to identify outliers of another data set?

parwindertaank
Explorer

Hi,

I have the average and standard deviation of one data set. I want to build a confidence interval from these values and test whether values from another data set fall outside the bounds I created.

index="prototype" sourcetype="access_combined" clientip=*
| iplocation clientip
| convert timeformat="%Y-%m-%d" ctime(_time) AS date
| stats count by date, Country
| eventstats avg(count) as avg_count stdev(count) as stdev_count BY Country

And another search that returns just the count values:

index="test3" sourcetype="access_combined" clientip=*
| iplocation clientip
| convert timeformat="%Y-%m-%d" ctime(_time) AS date
| stats count by date, Country

I want to use

| where count>(avg_count+(2*stdev_count))

where count comes from the "test3" index, and avg_count and stdev_count come from the "prototype" index.

How can I put it all together in one search?

Thanks in advance.


somesoni2
Revered Legend

Give this a try

(index="prototype" OR index="test3") sourcetype="access_combined" clientip=*
| iplocation clientip
| convert timeformat="%Y-%m-%d" ctime(_time) AS date
| eval baseCount=if(index="prototype",1,0)
| eval Count=if(index="test3",1,0)
| stats sum(baseCount) as Base sum(Count) as count by date, Country
| eventstats avg(Base) as avg_count stdev(Base) as stdev_count BY Country
| fields - Base
| where count>(avg_count+(2*stdev_count))
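
To see what the final where clause is checking, here is a minimal Python sketch of the same logic outside Splunk. The function name find_outliers, the dictionary layout, and the sample counts are all made up for illustration, and the sketch assumes the baseline ("prototype") and test ("test3") daily counts are held separately per country. It uses the sample standard deviation, which is what Splunk's stdev computes.

```python
from statistics import mean, stdev


def find_outliers(baseline, test, k=2.0):
    """Flag test counts above mean + k * stdev of the baseline, per country.

    baseline and test map country -> {date: count} (hypothetical layout).
    """
    outliers = []
    for country, daily in test.items():
        base_counts = list(baseline.get(country, {}).values())
        if len(base_counts) < 2:
            # Sample stdev needs at least two baseline days; skipping the
            # country here is a choice made by this sketch.
            continue
        threshold = mean(base_counts) + k * stdev(base_counts)
        for date, count in sorted(daily.items()):
            if count > threshold:
                outliers.append((date, country, count))
    return outliers


# Made-up daily counts, for illustration only.
baseline = {"Canada": {"2017-01-01": 10, "2017-01-02": 12, "2017-01-03": 14}}
test = {"Canada": {"2017-02-01": 13, "2017-02-02": 30}}

print(find_outliers(baseline, test))
```

With these made-up counts the Canada baseline has a mean of 12 and a sample standard deviation of 2, so the threshold is 16 and only the count of 30 is flagged. One difference from the combined search: there, a date that appears only in test3 gets Base=0, and eventstats includes that zero when computing avg_count and stdev_count.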

parwindertaank
Explorer

This is exactly the result I was looking for!
The if function is something I just learned as well, thanks!
