Splunk Search

How to find outliers relative to a group average?

splunkyouverymu
Explorer

I have been working on this the last few days, but I am having trouble figuring it out. I'm looking for some pointers on a good way to approach this.

I am trying to set up an alert that will identify when a particular unit within a group is an outlier (say<20%) of the entire group as an average. I hate to use a metaphor instead of showing event data, but I can't do that. A good metaphor is light bulbs in a house.

Each light bulb is reporting event data every 5 minutes, and these events contain the following information:

  • House ID#
  • Lightbulb ID#
  • total power consumed by light (this value accumulates indefinitely as the light bulb is turned on)
  • Timestamp

A typical application will be that a house has 10 light bulbs in it, all reporting the above data to Splunk. I can watch the power consumed by each light over time and see how long that light has been turned on. I can look at all of the lights at once and see which lights are on the most/use the most energy, etc.

All lights should have similar usage throughout the day, so I would expect the average power of all lights used in the house to be similar to any specific light (IE tight grouping of data). If by the end of the day, one of the lights has only used <=20% of the average power of all lights - I want to be alerted so that I can look into that light and make sure it is working.

I have many houses reporting data, but I only care about averaging within the individual houses - not across them.

So some "sample" data for a house:

EVENT_0:
HouseID: 001
LightID: A
TotalPower: 1W

EVENT_1:
HouseID: 001
LightID: B
TotalPower: 5W

EVENT_2:
HouseID: 001
LightID: C
TotalPower: 5W

EVENT_3:
HouseID: 001
LightID: A
TotalPower: 2W

EVENT_4:
HouseID: 001
LightID: B
TotalPower: 10W

EVENT_5:
HouseID: 001
LightID: C
TotalPower: 10W

EVENT_6:
HouseID: 001
LightID: A
TotalPower: 3W

EVENT_7:
HouseID: 001
LightID: B
TotalPower: 20W

EVENT_8:
HouseID: 001
LightID: C
TotalPower: 20W

Now extrapolate this out for a day, and you would see that "LightID: A" is significantly underpacing Light B and C.

How would I go about finding the average "TotalPower" for the entire house => Latest total power value of(A + B +C/3)= 3W +20W+20W/3=~14W and compare each individual lights power for the day to this average?

Note that I only need to look at the latest "TotalPower" at the end of the day as it is an accumulating field.

I have tried messing with some of the stddev stats and calculating upper/lower bounds, but can only seem to make each lightbulb compare to itself - yielding a stdev of 0 for everything.

I think the easiest way to do it is to find the average for the house - then compare it to the average of each lightbulb. But using the BY modifier seems to basically find the average of each lightbulb and compare it to itself. Once again giving me a 0 or 1.

Is "Eventstats" the right way to generate my initial house-wise average before pipelining and then comparing each lightbulb individually?

I appreciate any insight on the right way to approach this. I have taught myself a lot of Splunk in the past month or two - but I am truly confused on this one.

Thanks and let me know if I can clarify anything!

1 Solution

renjith_nair
SplunkTrust
SplunkTrust

Based on your explanation and my assumptions, the below should work for you

index="test" sourcetype="lights"|table HouseID LightID TotalPower
|bucket span=1d _time
|stats sum(TotalPower) as Total by LightID,HouseID
|eventstats sum(Total) as HouseTotal,avg(Total) as AvgTotal by HouseID
|sort HouseID

My sample events are below. Please adjust the search to match your exact requirement

30 Jan 2016 10:00:00 HouseID="001";LightID="A" TotalPower="1"
30 Jan 2016 10:01:00 HouseID="001" LightID="B" TotalPower="5"
30 Jan 2016 10:02:00 HouseID="001" LightID="C" TotalPower="5"
30 Jan 2016 10:03:00 HouseID="001";LightID="A" TotalPower="1"
30 Jan 2016 10:04:00 HouseID="001" LightID="B" TotalPower="10"
30 Jan 2016 10:05:00 HouseID="001" LightID="C" TotalPower="10"
30 Jan 2016 10:06:00 HouseID="001";LightID="A" TotalPower="3"
30 Jan 2016 10:07:00 HouseID="001" LightID="B" TotalPower="20"
30 Jan 2016 10:08:00 HouseID="001" LightID="C" TotalPower="20"

30 Jan 2016 10:00:00 HouseID="002";LightID="A" TotalPower="1"
30 Jan 2016 10:01:00 HouseID="002" LightID="B" TotalPower="5"
30 Jan 2016 10:02:00 HouseID="002" LightID="C" TotalPower="5"
Happy Splunking!

View solution in original post

renjith_nair
SplunkTrust
SplunkTrust

Based on your explanation and my assumptions, the below should work for you

index="test" sourcetype="lights"|table HouseID LightID TotalPower
|bucket span=1d _time
|stats sum(TotalPower) as Total by LightID,HouseID
|eventstats sum(Total) as HouseTotal,avg(Total) as AvgTotal by HouseID
|sort HouseID

My sample events are below. Please adjust the search to match your exact requirement

30 Jan 2016 10:00:00 HouseID="001";LightID="A" TotalPower="1"
30 Jan 2016 10:01:00 HouseID="001" LightID="B" TotalPower="5"
30 Jan 2016 10:02:00 HouseID="001" LightID="C" TotalPower="5"
30 Jan 2016 10:03:00 HouseID="001";LightID="A" TotalPower="1"
30 Jan 2016 10:04:00 HouseID="001" LightID="B" TotalPower="10"
30 Jan 2016 10:05:00 HouseID="001" LightID="C" TotalPower="10"
30 Jan 2016 10:06:00 HouseID="001";LightID="A" TotalPower="3"
30 Jan 2016 10:07:00 HouseID="001" LightID="B" TotalPower="20"
30 Jan 2016 10:08:00 HouseID="001" LightID="C" TotalPower="20"

30 Jan 2016 10:00:00 HouseID="002";LightID="A" TotalPower="1"
30 Jan 2016 10:01:00 HouseID="002" LightID="B" TotalPower="5"
30 Jan 2016 10:02:00 HouseID="002" LightID="C" TotalPower="5"
Happy Splunking!
Get Updates on the Splunk Community!

Routing logs with Splunk OTel Collector for Kubernetes

The Splunk Distribution of the OpenTelemetry (OTel) Collector is a product that provides a way to ingest ...

Welcome to the Splunk Community!

(view in My Videos) We're so glad you're here! The Splunk Community is place to connect, learn, give back, and ...

Tech Talk | Elevating Digital Service Excellence: The Synergy of Splunk RUM & APM

Elevating Digital Service Excellence: The Synergy of Real User Monitoring and Application Performance ...