I have been working on this the last few days, but I am having trouble figuring it out. I'm looking for some pointers on a good way to approach this.
I am trying to set up an alert that will identify when a particular unit within a group is an outlier (say<20%) of the entire group as an average. I hate to use a metaphor instead of showing event data, but I can't do that. A good metaphor is light bulbs in a house.
Each light bulb is reporting event data every 5 minutes, and these events contain the following information:
A typical application will be that a house has 10 light bulbs in it, all reporting the above data to Splunk. I can watch the power consumed by each light over time and see how long that light has been turned on. I can look at all of the lights at once and see which lights are on the most/use the most energy, etc.
All lights should have similar usage throughout the day, so I would expect the average power of all lights used in the house to be similar to any specific light (IE tight grouping of data). If by the end of the day, one of the lights has only used <=20% of the average power of all lights - I want to be alerted so that I can look into that light and make sure it is working.
I have many houses reporting data, but I only care about averaging within the individual houses - not across them.
So some "sample" data for a house:
EVENT_0:
HouseID: 001
LightID: A
TotalPower: 1W
EVENT_1:
HouseID: 001
LightID: B
TotalPower: 5W
EVENT_2:
HouseID: 001
LightID: C
TotalPower: 5W
EVENT_3:
HouseID: 001
LightID: A
TotalPower: 2W
EVENT_4:
HouseID: 001
LightID: B
TotalPower: 10W
EVENT_5:
HouseID: 001
LightID: C
TotalPower: 10W
EVENT_6:
HouseID: 001
LightID: A
TotalPower: 3W
EVENT_7:
HouseID: 001
LightID: B
TotalPower: 20W
EVENT_8:
HouseID: 001
LightID: C
TotalPower: 20W
Now extrapolate this out for a day, and you would see that "LightID: A" is significantly underpacing Light B and C.
How would I go about finding the average "TotalPower" for the entire house => Latest total power value of(A + B +C/3)= 3W +20W+20W/3=~14W and compare each individual lights power for the day to this average?
Note that I only need to look at the latest "TotalPower" at the end of the day as it is an accumulating field.
I have tried messing with some of the stddev stats and calculating upper/lower bounds, but can only seem to make each lightbulb compare to itself - yielding a stdev of 0 for everything.
I think the easiest way to do it is to find the average for the house - then compare it to the average of each lightbulb. But using the BY modifier seems to basically find the average of each lightbulb and compare it to itself. Once again giving me a 0 or 1.
Is "Eventstats" the right way to generate my initial house-wise average before pipelining and then comparing each lightbulb individually?
I appreciate any insight on the right way to approach this. I have taught myself a lot of Splunk in the past month or two - but I am truly confused on this one.
Thanks and let me know if I can clarify anything!
Based on your explanation and my assumptions, the below should work for you
index="test" sourcetype="lights"|table HouseID LightID TotalPower
|bucket span=1d _time
|stats sum(TotalPower) as Total by LightID,HouseID
|eventstats sum(Total) as HouseTotal,avg(Total) as AvgTotal by HouseID
|sort HouseID
My sample events are below. Please adjust the search to match your exact requirement
30 Jan 2016 10:00:00 HouseID="001";LightID="A" TotalPower="1"
30 Jan 2016 10:01:00 HouseID="001" LightID="B" TotalPower="5"
30 Jan 2016 10:02:00 HouseID="001" LightID="C" TotalPower="5"
30 Jan 2016 10:03:00 HouseID="001";LightID="A" TotalPower="1"
30 Jan 2016 10:04:00 HouseID="001" LightID="B" TotalPower="10"
30 Jan 2016 10:05:00 HouseID="001" LightID="C" TotalPower="10"
30 Jan 2016 10:06:00 HouseID="001";LightID="A" TotalPower="3"
30 Jan 2016 10:07:00 HouseID="001" LightID="B" TotalPower="20"
30 Jan 2016 10:08:00 HouseID="001" LightID="C" TotalPower="20"
30 Jan 2016 10:00:00 HouseID="002";LightID="A" TotalPower="1"
30 Jan 2016 10:01:00 HouseID="002" LightID="B" TotalPower="5"
30 Jan 2016 10:02:00 HouseID="002" LightID="C" TotalPower="5"
Based on your explanation and my assumptions, the below should work for you
index="test" sourcetype="lights"|table HouseID LightID TotalPower
|bucket span=1d _time
|stats sum(TotalPower) as Total by LightID,HouseID
|eventstats sum(Total) as HouseTotal,avg(Total) as AvgTotal by HouseID
|sort HouseID
My sample events are below. Please adjust the search to match your exact requirement
30 Jan 2016 10:00:00 HouseID="001";LightID="A" TotalPower="1"
30 Jan 2016 10:01:00 HouseID="001" LightID="B" TotalPower="5"
30 Jan 2016 10:02:00 HouseID="001" LightID="C" TotalPower="5"
30 Jan 2016 10:03:00 HouseID="001";LightID="A" TotalPower="1"
30 Jan 2016 10:04:00 HouseID="001" LightID="B" TotalPower="10"
30 Jan 2016 10:05:00 HouseID="001" LightID="C" TotalPower="10"
30 Jan 2016 10:06:00 HouseID="001";LightID="A" TotalPower="3"
30 Jan 2016 10:07:00 HouseID="001" LightID="B" TotalPower="20"
30 Jan 2016 10:08:00 HouseID="001" LightID="C" TotalPower="20"
30 Jan 2016 10:00:00 HouseID="002";LightID="A" TotalPower="1"
30 Jan 2016 10:01:00 HouseID="002" LightID="B" TotalPower="5"
30 Jan 2016 10:02:00 HouseID="002" LightID="C" TotalPower="5"