I'm trying to create a table of availabilities (percent uptime) for a given service for a set of hosts. My desired output is a simple 2-column table of "Host" and "Availability (%)", like the one below:
Host | Availability |
my-db-1 | 100% |
my-db-2 | 97.5% |
my-db-3 | 100% |
my-db-4 | 72.2% |
rhnsd Availabilities
I have a query I currently use to get the availability of a service for a single host, but I'd like to scale it up to produce the output above. It assumes ps.sh runs every 1800 seconds, counts the events found over the search time range (info_max_time-info_min_time), and divides that count by the number of 1800-second intervals that fit in that range, with extra conditions for when no host matches or the availability exceeds 100. The query is as follows:
index=os host="my-db-1.mydomain.net" sourcetype=ps rhnsd | stats count, distinct_count(host) as hostcount | addinfo | eval availability=if(hostcount=0,0,if(count>=(info_max_time-info_min_time)/1800,100,count/((info_max_time-info_min_time)/1800))*100) | table availability
Or if there's a much easier way to accomplish this that I don't know about, I'm all ears. Any help is greatly appreciated.
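For reference, the arithmetic described above can be checked outside Splunk. This is only a sketch in Python, not SPL: the function name `availability` and the guard for an empty time window are assumptions added for illustration, while the 1800-second interval and the cap at 100 come from the description.

```python
def availability(count, earliest, latest, interval=1800):
    """Percent availability: events seen vs. events expected, capped at 100."""
    # Number of polling intervals that fit in the search window
    expected = (latest - earliest) / interval
    if expected <= 0:
        # Assumption: treat an empty or inverted window as 0% rather than dividing by zero
        return 0.0
    return min(100.0, count / expected * 100)


# A 24-hour window holds 48 intervals of 1800 seconds
window_start, window_end = 0, 24 * 3600
print(availability(48, window_start, window_end))  # 100.0
print(availability(36, window_start, window_end))  # 75.0
print(availability(50, window_start, window_end))  # 100.0 (capped)
```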
Give this a try:
Edit: fixed the malformed if condition.
index=os host IN ("my-db-1.mydomain.net",..other hosts..) sourcetype=ps rhnsd | stats count by host | addinfo | eval availability=if(count>=(info_max_time-info_min_time)/1800,100,count/((info_max_time-info_min_time)/1800)*100) | table host availability
Please note that any host that is not reporting data will not appear in the results, not even with an availability of 0. To report availability for hosts that may not be reporting at all, create a lookup table listing those hosts (a single column named "host" will do) and use something like this:
index=os [| inputlookup yourhostlookup.csv | table host ] sourcetype=ps rhnsd | stats count by host | addinfo | eval availability=if(count>=(info_max_time-info_min_time)/1800,100,count/((info_max_time-info_min_time)/1800)*100) | table host availability
| append [| inputlookup yourhostlookup.csv | table host | eval availability=0]
| stats max(availability) as availability by host
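The append-then-max step can be hard to picture, so here is a rough Python sketch of the same idea (not SPL). The function name `fill_missing_hosts` and the sample values are hypothetical; the logic mirrors appending a 0 row for every host in the lookup and then keeping the maximum per host.

```python
def fill_missing_hosts(measured, known_hosts):
    """Return availability per host, with 0 for known hosts that reported nothing."""
    # Measured rows first, then one 0-availability row per known host
    # (the role played by the appended lookup rows)
    rows = list(measured.items()) + [(host, 0.0) for host in known_hosts]
    result = {}
    for host, value in rows:
        # Keep the largest value seen per host, like stats max(...) by host
        result[host] = max(value, result.get(host, 0.0))
    return result


measured = {"my-db-1": 100.0, "my-db-2": 97.5}  # hypothetical measured values
known = ["my-db-1", "my-db-2", "my-db-3"]       # hypothetical lookup contents
print(fill_missing_hosts(measured, known))
# {'my-db-1': 100.0, 'my-db-2': 97.5, 'my-db-3': 0.0}
```

A host that did report keeps its measured value because it is never below the appended 0.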
Thanks! After fixing the malformed if-statement it worked! If you correct yours I can Accept as Solution:
if(count>=(info_max_time-info_min_time)/1800,100,count/((info_max_time-info_min_time)/1800)*100)
Correction made to the answer.
index=os host IN ("my-db-1.mydomain.net","my-db-2.mydomain.net","my-db-3.mydomain.net","my-db-4.mydomain.net") sourcetype=ps rhnsd
| stats count, distinct_count(host) as hostcount
| addinfo
| eval availability=if(hostcount=0,0,if(count>=(info_max_time-info_min_time)/1800,100,count/((info_max_time-info_min_time)/1800))*100)
| table host, availability
Hmmm, this isn't working for me. It's giving me a single row output in Statistics, whereby the host is blank and the availability is a single incorrect value.
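A likely cause, comparing this with the answer earlier in the thread: the stats here has no "by host" clause, so all events collapse into a single row that carries no host field, and the *100 still sits outside the inner if(...). The aggregation difference can be sketched in Python (not SPL); the event list is made up for illustration.

```python
from collections import Counter

# Hypothetical host values from matching events
events = ["my-db-1", "my-db-1", "my-db-2"]

# Like: stats count, distinct_count(host) as hostcount  -> one row, no host column
overall = {"count": len(events), "hostcount": len(set(events))}

# Like: stats count by host  -> one row per host
per_host = dict(Counter(events))

print(overall)   # {'count': 3, 'hostcount': 2}
print(per_host)  # {'my-db-1': 2, 'my-db-2': 1}
```

If that is the cause, switching to "stats count by host" and using the corrected if(...) from the thread should give one row per host.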