Archive
Highlighted

Finding Outliers on event counts

Explorer

I am trying to build a query to find outliers using avg and stdev on a perfmon counter but the counter is not a value you can calculate an average and I can't figure out how to create a count of the counter then calculate the avg and stdev. Here is the query I have so far, mostly based on the Splunk Docs Outlier information

index=perfmon collection=HTTP counter=CurrentConnections host=ServerFarm
|eval time=strftime(_time,"%m/%d/%y %H:%M:%S")
|stats avg(counter) as AvgByHost stdev(counter) as StDevbyHost by host, time
|eval LowerBound=(AvgByHost-StDevByHost2)
|eval UpperBound=(AvgByHost+StDevByHost
2)
|eval isOutlier=if(counter < LowerBound OR counter > UpperBound, 1,0)
|fields time, host, "LowerBound", "UpperBound", "isOutlier"
| sort - isOutlier
|timechart span=1h count(isOutlier) by host

0 Karma
Highlighted

Re: Finding Outliers on event counts

Builder

Looks like you are trying to access "counter" after the stats command. Unfortunately, stats is a transforming command, so you lose the original fields. If counter refers to the event count, then you can add it to your stats line:

|stats count(counter) AS counter avg(counter) as AvgByHost stdev(counter) as StDevbyHost by host, time
0 Karma
Highlighted

Re: Finding Outliers on event counts

Explorer

Thank you - that gets me close it now counts the counter value the avg and stdev are not working

0 Karma
Highlighted

Re: Finding Outliers on event counts

Builder

Please clarify: Are you trying to get a count of events per host per hour? And then looking for those hosts that are outside of some sort of criteria?

0 Karma
Highlighted

Re: Finding Outliers on event counts

Explorer

Sorry I think I should have worded the question better. I am trying to get a count of events by hour and then get the average and standard deviation of that count. Since the field "counter" is not a numeric value just a straight average and standard deviation do not work. I am not sure how to aggregate the event into a number.

0 Karma
Highlighted

Re: Finding Outliers on event counts

Builder

Also check your third line: StDevbyHost should match exactly how it is used below (StDevByHost)

0 Karma
Highlighted

Re: Finding Outliers on event counts

Explorer

thank you. I was able to get the count to work by looking at your example and a few others

index=perfmon collection=HTTP counter=CurrentConnections host=ServerFarm
|bucket _time span=1h
|stats count(counter) as CountByHour by host, _time
|eventstats avg(CountByHour) as AvgByHost stdev(CountByHour) as StDevbyHost by host, _time
|eval LowerBound=(AvgByHost-StDevbyHost2)
|eval UpperBound=(AvgByHost+StDevbyHost
2)
|eval isOutlier=if(AvgByHost < LowerBound OR AvgByHost > UpperBound, 1,0)
|fields time, host, "LowerBound", "UpperBound", "isOutlier", "CountByHour"
| sort - isOutlier

my isOutlier is not working but I think that is because it is comparing against each server node so I have to work on that

0 Karma
Highlighted

Re: Finding Outliers on event counts

SplunkTrust
SplunkTrust

Please provide some sample field values. Why can an average not be calculated?

---
If this reply helps you, an upvote would be appreciated.
0 Karma
Highlighted

Re: Finding Outliers on event counts

Explorer

The query does list the field value counter=CurrentConnections (the CurrentConnections is the value of the field) which is why I am not able to get an average.

0 Karma
Highlighted

Re: Finding Outliers on event counts

Explorer

I was able to get it worked out; here is the file query thank you jpolvino your input helped me work through the issue.

index=perfmon collection=HTTP counter=CurrentConnections host=ServerFarm
|bucket _time span=1h
|stats count(counter) as CountByHour by _time, host
|eventstats avg(CountByHour) as AvgByHost stdev(CountByHour) as StDevbyHost by host
|eval LowerBound=(AvgByHost-StDevbyHost2)
|eval UpperBound=(AvgByHost+StDevbyHost
2)
|eval isOutlier=if(CountByHour < LowerBound OR CountByHour > UpperBound, 1,0)
|fields time, host, "LowerBound", "UpperBound", "isOutlier", "CountByHour", AvgByHost, StDevbyHost
| sort - isOutlier