How to use results of one data set to identify outliers of another data set?

parwindertaank
Explorer

Hi,

I have the average and standard deviation of one data set. I want to build a confidence interval from these values and then test the values of a second data set to see whether they fall outside those bounds.

index="prototype" sourcetype ="access_combined" clientip=* 
 | iplocation clientip 
 | convert timeformat="%Y-%m-%d" ctime(_time) AS date 
 | stats count by date, Country 
 | eventstats avg(count) as avg_count stdev(count) as stdev_count BY Country

The second search returns only the count values:

index="test3" sourcetype ="access_combined" clientip=* 
 | iplocation clientip 
 | convert timeformat="%Y-%m-%d" ctime(_time) AS date 
 | stats count by date, Country

I want to use

| where count>(avg_count+(2*stdev_count))
where count comes from the "test3" index, and avg_count and stdev_count come from the "prototype" index.

How can I put all of this together in one search?

Thanks in advance.

somesoni2
Revered Legend

Give this a try

(index="prototype" OR index="test3") sourcetype="access_combined" clientip=*
| iplocation clientip
| convert timeformat="%Y-%m-%d" ctime(_time) AS date
| eval baseCount=if(index="prototype",1,0)
| eval Count=if(index="test3",1,0)
| stats sum(baseCount) as Base sum(Count) as count by date, Country
| eventstats avg(Base) as avg_count stdev(Base) as stdev_count BY Country
| fields - Base
| where count>(avg_count+(2*stdev_count))
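
A possible variation, for anyone reading later. In the search above, a date that has test3 events but no prototype events ends up with Base=0, and those zeros are included in avg(Base) and stdev(Base). If the two indexes cover different date ranges, that pulls the baseline down. The sketch below assumes such dates should be left out of the baseline instead. That is an assumption about the intent, not something stated in the question. It keeps the same field names and the same upper-bound test, counts events per index with count(eval(...)), and builds the date with strftime. It has not been run against this data.

```spl
(index="prototype" OR index="test3") sourcetype="access_combined" clientip=*
| iplocation clientip
| eval date=strftime(_time, "%Y-%m-%d")
| stats count(eval(index="prototype")) AS Base count(eval(index="test3")) AS count BY date, Country
| eval Base=if(Base=0, null(), Base)
| eventstats avg(Base) AS avg_count stdev(Base) AS stdev_count BY Country
| fields - Base
| where count > avg_count + 2*stdev_count
```

eventstats ignores null values, so only dates with at least one prototype event contribute to avg_count and stdev_count. If days with no prototype traffic really should count as zero, drop the eval Base=if(...) line and the baseline is computed the same way as in the search above.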

parwindertaank
Explorer

This is exactly the result I was looking for!
The if function is something I just learned as well, thanks!
