I'm working to deploy Splunk in an HPC environment and am trying to set up some metrics queries that I didn't see in the Splunk for *nix app. Specifically, I'd like to have a timechart that shows CPU utilization per day for the month, where the units are CPU-hours (i.e., 1 CPU with 8 cores has 192 CPU-hours per day). I'm pretty sure I need to use streamstats to get the daily values, but I'm having trouble figuring out how to get the data into the units I want.
Thanks for your insight.
Edit #1: Here's a better example of the metric I'm trying to get.
Sample system: 16 total cores
CPU-Hours per day = 16(cores)*24(hours) = 384
So if cpu.sh gives you the PercentIdle for each core at that instant, you'd need to take the
time that has passed since the last measurement for a core, multiply that by the current
percent used (100 - PercentIdle), and divide by 100.
Example:
event 1: core 0 97%Idle, core 1 98%Idle time 00:00:00
event 2: core 0 97%Idle, core 1 97%Idle time 00:01:00
for core 0:
(60sec*3%)/100 = 1.8 CPU-Seconds = .0005 CPU-Hours
You'd add these values up in 1 day time spans.
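To sanity-check the arithmetic outside of Splunk, here is a rough Python sketch of the per-core calculation described above. The function name cpu_hours and the (timestamp, PercentIdle) sample layout are just made up for illustration; they are not anything cpu.sh or Splunk provides. It assumes each interval is charged at the PercentIdle reported by the later of the two events.

```python
def cpu_hours(samples):
    """Sum CPU-hours used by one core.

    samples: list of (timestamp_seconds, percent_idle) tuples in time order.
    This layout is a hypothetical stand-in for the cpu.sh events.
    """
    total_cpu_seconds = 0.0
    # Pair each event with the one before it to get the elapsed time.
    for (prev_time, _), (time, percent_idle) in zip(samples, samples[1:]):
        elapsed = time - prev_time
        percent_used = 100 - percent_idle
        total_cpu_seconds += elapsed * percent_used / 100
    return total_cpu_seconds / 3600  # seconds -> hours


# The two events from the example, 60 seconds apart.
core0 = [(0, 97), (60, 97)]
core1 = [(0, 98), (60, 97)]

core0_hours = cpu_hours(core0)  # (60 sec * 3%) / 100 = 1.8 CPU-seconds
core1_hours = cpu_hours(core1)

# Daily capacity of the sample system: 16 cores * 24 hours.
capacity_per_day = 16 * 24

print(core0_hours, core1_hours, capacity_per_day)
```

Feeding it a full day of events per core and adding up the results across cores would give the daily CPU-hours figure I'm after.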
Edit #2:
I'd be okay with an answer like this (but one that actually works; this one is broken). I think I was making this more complicated than it needed to be:
index=os sourcetype=cpu host=x | eval percent_used=(100-pctIdle)|stats avg(percent_used) AS cpu_avg_used by CPU | timechart span=1d sum(((cpu_avg_used*24)/100))
Still new to writing queries.