Splunk Search

Duration / count of downtime occurrences

Engager

Hi,
My data looks like:

SiteID,Date,Time,DeviceID,Alarm

1234,01/01/2013,10:01,1,True

1234,01/01/2013,10:02,1,True

...

If a device is down for 10 minutes, I see an initial message and then one message every minute until the device comes back up, so 11 messages in total. No message is sent while the device is up, so a gap of more than 1 minute between messages means the device was up in between.

How can I calculate the downtime duration and the number of downtime occurrences for each month?

Regards

Martin


SplunkTrust

The easiest way is probably the transaction command, because its maxpause parameter directly implements your criterion that "more than 1 minute between events means two different outages".

This search:

... | transaction SiteID DeviceID maxpause="60"

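will give you a result set where each row is a distinct outage. The transaction command also adds a duration field to each row: the number of seconds between the first and last event of that outage.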
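If you want to sanity-check the grouping outside Splunk, here is a minimal Python sketch of the same idea. It is not how transaction is implemented, just the "gap of more than 60 seconds starts a new outage" rule applied to (SiteID, DeviceID, timestamp) tuples. The function name group_outages and the dictionary keys are my own choices for illustration; duration and eventcount are named after the fields transaction adds.

```python
from datetime import datetime, timedelta


def group_outages(events, max_pause=timedelta(seconds=60)):
    """Group (site_id, device_id, timestamp) events into outages.

    A new outage starts whenever the gap to the previous event for the
    same site/device pair is greater than max_pause.
    """
    outages = []
    open_outage = {}  # (site, device) -> index of its latest outage
    for site, device, ts in sorted(events):
        key = (site, device)
        idx = open_outage.get(key)
        if idx is not None and ts - outages[idx]["end"] <= max_pause:
            # Still within the allowed pause: extend the current outage.
            outages[idx]["end"] = ts
            outages[idx]["eventcount"] += 1
        else:
            # First event for this device, or the gap was too long.
            outages.append({"site": site, "device": device,
                            "start": ts, "end": ts, "eventcount": 1})
            open_outage[key] = len(outages) - 1
    for outage in outages:
        outage["duration"] = (outage["end"] - outage["start"]).total_seconds()
    return outages


# Hypothetical sample: a 10-minute outage (11 messages), then a 2-minute one.
start = datetime(2013, 1, 1, 10, 0)
sample = [("1234", "1", start + timedelta(minutes=i)) for i in range(11)]
sample += [("1234", "1", start + timedelta(minutes=30 + i)) for i in range(3)]
for outage in group_outages(sample):
    print(outage["start"], outage["duration"], outage["eventcount"])
```

With the sample above, the 10-minute outage comes out as one row with a duration of 600 seconds, matching the 11-message case described in the question.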

From there, you can get the average downtime duration and the number of occurrences per month with:

... | transaction SiteID DeviceID maxpause="60" | eval month=strftime(_time,"%Y-%m") | stats avg(duration) count by month
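The per-month aggregation can be sketched in Python as well, purely as an illustration of what the stats step computes. The function name monthly_stats and the (start, duration) input shape are hypothetical. The month key here includes the year, which is an assumption on my part so that January 2013 and January 2014 stay separate. The sketch also returns a total per month, which in the search would correspond to adding sum(duration) to the stats call, in case "downtime duration" in the question means the total rather than the average.

```python
from collections import defaultdict
from datetime import datetime


def monthly_stats(outages):
    """Aggregate (start_datetime, duration_seconds) pairs per calendar month."""
    by_month = defaultdict(list)
    for start, duration in outages:
        # Key on year and month (assumption: data may span several years).
        by_month[start.strftime("%Y-%m")].append(duration)
    return {
        month: {
            "count": len(durations),
            "avg_duration": sum(durations) / len(durations),
            "total_duration": sum(durations),
        }
        for month, durations in sorted(by_month.items())
    }


# Hypothetical outages: two in January 2013, one in February 2013.
sample = [
    (datetime(2013, 1, 1, 10, 0), 600.0),
    (datetime(2013, 1, 15, 8, 30), 120.0),
    (datetime(2013, 2, 3, 23, 0), 300.0),
]
for month, stats in monthly_stats(sample).items():
    print(month, stats)
```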

Or a timechart showing max outage duration over time, split by site:

... | transaction SiteID DeviceID maxpause="60" | timechart max(duration) by SiteID

Or, for fun, a frequency distribution of all outage lengths, split by site, with the duration values bucketed at a granularity of 15 seconds:

... | transaction SiteID DeviceID maxpause="60" | bin duration span=15 | chart count over duration by SiteID
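For reference, the bucketing step amounts to rounding each duration down to a multiple of 15 and counting per site. A small Python sketch of that, assuming (site, duration) pairs as input; the function name duration_histogram is made up, and the buckets are labelled here by their lower bound, which is not necessarily how Splunk labels them.

```python
from collections import Counter


def duration_histogram(outages, span=15):
    """Count outages per (site, bucket), where bucket is the lower bound
    of the span-second interval that the duration falls into."""
    counts = Counter()
    for site, duration in outages:
        bucket = int(duration // span) * span
        counts[(site, bucket)] += 1
    return counts


# Hypothetical outages as (SiteID, duration in seconds).
sample = [("1234", 600), ("1234", 605), ("1234", 120), ("5678", 59)]
for (site, bucket), count in sorted(duration_histogram(sample).items()):
    print(site, bucket, count)
```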


Engager

Thank you very much.

The above works very nicely.
