My data looks like:
SiteID, Date, Time, DeviceID, Alarm
If a device is down for 10 minutes, I see an initial message and then one message every minute until the device is back up, so 11 messages in total. No message is sent when the device comes back up; the messages simply stop, so the gap to the next message is greater than 1 minute.
How can I calculate the downtime duration and the number of downtime occurrences for each month?
The easiest way is probably the transaction command, because its maxpause parameter can easily implement your criterion of "more than 1 minute between events means two different outages".
... | transaction SiteID DeviceID maxpause="60"
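will give you a result set where each row is a distinct outage. transaction also adds a duration field (seconds between the first and last event of the outage) and an eventcount field to each row. If the messages do not arrive exactly 60 seconds apart, you may need a slightly larger maxpause, e.g. 90.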
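To make the grouping rule concrete outside of Splunk, here is a minimal Python sketch of the same idea: events for the same site and device belong to one outage as long as consecutive events are no more than 60 seconds apart. The function name group_outages, the tuple layout and the sample timestamps are illustrative assumptions, not part of Splunk or of the original data.

```python
from datetime import datetime, timedelta

def group_outages(events, max_pause=timedelta(seconds=60)):
    """Group (site_id, device_id, timestamp) tuples into outages.

    A new outage starts whenever the gap to the previous event for the
    same site/device pair is larger than max_pause.
    """
    outages = []
    open_outage = {}  # (site, device) -> index of its latest outage
    for site, device, ts in sorted(events, key=lambda e: e[2]):
        key = (site, device)
        idx = open_outage.get(key)
        if idx is not None and ts - outages[idx]["end"] <= max_pause:
            # Close enough to the previous message: same outage continues.
            outages[idx]["end"] = ts
            outages[idx]["eventcount"] += 1
        else:
            # First message for this device, or the gap was too long.
            open_outage[key] = len(outages)
            outages.append({"site": site, "device": device,
                            "start": ts, "end": ts, "eventcount": 1})
    for outage in outages:
        outage["duration"] = (outage["end"] - outage["start"]).total_seconds()
    return outages

# Hypothetical sample: 11 messages one minute apart, then a lone message later.
t0 = datetime(2024, 1, 5, 8, 0, 0)
events = [("S1", "D1", t0 + timedelta(minutes=i)) for i in range(11)]
events.append(("S1", "D1", t0 + timedelta(minutes=30)))
outages = group_outages(events)
```

With this sample, the 11 messages form one outage of 600 seconds, and the isolated message forms a second outage with a duration of 0, which is also how a single-message outage would look in the transaction output.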
From there, you can get the average and total downtime duration and the number of occurrences per month (keyed by year and month, so the same month in different years stays separate) with:
... | transaction SiteID DeviceID maxpause="60" | eval month=strftime(_time,"%Y-%m") | stats avg(duration) sum(duration) count by month
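As a rough illustration of what that per-month aggregation computes, here is a small Python sketch. It assumes each outage is already reduced to a (start time, duration in seconds) pair and is assigned to the month in which it started, which matches transaction setting _time to the earliest event. The function name monthly_summary and the sample values are made up for the example.

```python
from collections import defaultdict
from datetime import datetime

def monthly_summary(outages):
    """Summarise (start, duration_seconds) pairs per calendar month."""
    buckets = defaultdict(list)
    for start, duration in outages:
        # Key on year and month so January 2024 and January 2025 stay apart.
        buckets[start.strftime("%Y-%m")].append(duration)
    return {
        month: {"count": len(durations),
                "total": sum(durations),
                "avg": sum(durations) / len(durations)}
        for month, durations in sorted(buckets.items())
    }

# Hypothetical outages: two in January, one in February.
sample = [
    (datetime(2024, 1, 5, 8, 0), 600.0),
    (datetime(2024, 1, 20, 14, 30), 120.0),
    (datetime(2024, 2, 2, 9, 15), 300.0),
]
summary = monthly_summary(sample)
```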
Or a timechart showing max outage duration over time, split by site:
... | transaction SiteID DeviceID maxpause="60" | timechart max(duration) by SiteID
Or, for fun, a frequency distribution of all outage lengths, split by site, with the duration values bucketed at a granularity of 15 seconds:
... | transaction SiteID DeviceID maxpause="60" | bin duration span=15 | chart count over duration by SiteID
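The bucketing step can be sketched in Python as well. This is only an approximation of the idea behind bin with span=15 followed by chart count: each duration is rounded down to the start of its 15-second bucket and counted per site. The function name duration_histogram and the sample data are hypothetical.

```python
from collections import Counter

def duration_histogram(outages, span=15):
    """Count outages per (site, bucket_start), with buckets `span` seconds wide."""
    counts = Counter()
    for site, duration in outages:
        # Round the duration down to the lower edge of its bucket.
        bucket_start = int(duration // span) * span
        counts[(site, bucket_start)] += 1
    return counts

# Hypothetical (site, duration_seconds) pairs.
sample = [("S1", 0.0), ("S1", 12.0), ("S1", 600.0), ("S2", 61.0), ("S2", 74.9)]
hist = duration_histogram(sample)
```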