The data: The simplest example is signal strength per station attached to wireless access point. Every 5 minutes, signal strength from all attached stations is recorded. Use 3db buckets.
I can create a histogram of ALL signal strengths:
index=* sourcetype=my-csv | chart count over Signal span=3
AND I can create a histogram that breaks up the data by StationName (each bin has one bar per StationName):
index=* sourcetype=my-csv |chart count over Signal span=3 by StationName
Both of these work fine, but what I need is to NORMALIZE the data for each station.
Right now, the bar height is the total count of signal strengths that fall into that bucket. That is fine, except some stations have more connect time than others and the ABSOLUTE count is actually unimportant, it is the RELATIVE count I am after.
What I need to do is divide each of the "count" metrics for a given station by the total of all counts for that station (NOT the total count overall, just the total for that station). But I have not been able to figure out how to do this.
Example data:
Name Signal count Name Signal count
station1 -28 2 station2 -28 3
station1 -29 9 station2 -29 8
station1 -30 33 station2 -30 8
station1 -31 73 station2 -31 11
station1 -32 189 station2 -32 7
station1 -33 604 station2 -33 15
station1 -34 504 station2 -34 16
station1 -35 23 station2 -35 3
station1 -36 10 station2 -36 1
station1 -37 13 station2 -37 1
station1 -38 3 station2 -38 1
station1 -39 4 station2 -39 2
station1 -40 2 station2 -40 1
station1 -41 1 station2 -41 1
station1 -42 1 station2 -42 5
sum of "count" (total measurements) for station1=1471
sum of "count" (total measurements) for station2=83
What I want in the histogram is something like this where each "count" above is divided by the total for its station:
Name Signal %ofTotal Name Signal %ofTotal
station1 -28 0.1% station2 -28 3.6%
station1 -29 0.6% station2 -29 9.6%
station1 -30 2.2% station2 -30 9.6%
station1 -31 5.0% station2 -31 13.3%
station1 -32 12.8% station2 -32 8.4%
station1 -33 41.1% station2 -33 18.1%
station1 -34 34.3% station2 -34 19.3%
station1 -35 1.6% station2 -35 3.6%
station1 -36 0.7% station2 -36 1.2%
station1 -37 0.9% station2 -37 1.2%
station1 -38 0.2% station2 -38 1.2%
station1 -39 0.3% station2 -39 2.4%
station1 -40 0.1% station2 -40 1.2%
station1 -41 0.1% station2 -41 1.2%
station1 -42 0.1% station2 -42 6.0%
I am relatively new to Splunk....I have searched but will feel duly humbled if someone posts a link with a "did you see this example of exactly what you are trying to do?" 🙂
index=* sourcetype=my-csv | chart count over Signal | eventstats sum(count) as TotalStationCount by Name | eval PercofCount = (count/TotalStationCount)*100
index=* sourcetype=my-csv | chart count over Signal | eventstats sum(count) as TotalStationCount by Name | eval PercofCount = (count/TotalStationCount)*100
I am going to accept this answer as it got me closer to my goal, but I had to do some major reworking to get it to work.
The query above does not apper to give me any different results that what I get without the eventstats commands.
However, if I do the eventstats BEFORE the chart command and chart the correct value, I get what I was looking for.
Here is my final query:
index=* sourcetype=my-csv |
stats count(Signal) as bySignal by StationName,Signal |
eventstats sum(bySignal) as SumofCount by StationName |
eval PercofCount = (bySignal/SumofCount)*100 |
chart sum(PercofCount) over Signal span=3 by StationName
If you do a "table StationName,Signal,SumofCount,PercofCount" instead of the chart command you will see what is happening.
One note: "Signal" is not continuous, but is a range of whole integers from 0 to -101.
Basically, eventstats added the "SumofCount" metric to each line of the base table (where it is count per Signal value) just as advertised. Then you can do the eval statement to get the percentage for each Signal value for each StationName, and finally chart the SUM of the PercofCount for Signal binned in 3db bins.
Thanks!
Eventstats is useful for this scenario
Your query would be something like
<base query here> | eventstats sum(count) as SumofCount by Signal | eval PercofCount = (count/SumofCount)*100