Downtime duration

Motivator

I am looking to build a Splunk query that analyzes historical instances of downtime.

I am doing this with a service state, which has four values: Starting, Running, Stopping, and Stopped.
I poll the service every 60 seconds for its status. I am looking for the duration of the 'down' instances, per day.

For example: the services are in a running state now, but they were down for 30 minutes at a particular time.

I have a query in the works, but I'm not sure if it's the right fit:

sourcetype=WMI:Service Name=Email2db | streamstats current=false last(State) as last_service_status last(_time) as time_of_change by Name | where State!="last_service_status" | eval outage=now()-time_of_change | eval duration=strftime(outage, "%H:%M") | rename State as current_service_status | table time_of_change, Name, last_service_status, current_service_status, duration

All I'm looking to map is a 'state change' event and the duration that the service remained in that state. If the service went down twice, three hours apart, the suits would like to see those individual instances in something as simple as three fields. Those fields are:

Outage Start, Outage Stop, Duration.
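
To make the target output concrete, here is a minimal Python sketch of the logic I'm after, not a Splunk query. It assumes that any state other than Running counts as 'down', that an outage ends at the first Running poll after it, and that an outage still open at the end of the data is skipped. The function name `find_outages` and the sample polls are made up for illustration.

```python
from datetime import datetime, timedelta

POLL_INTERVAL = timedelta(seconds=60)  # polling period stated above


def find_outages(samples, up_state="Running"):
    """Collapse (timestamp, state) polls into (start, stop, duration) tuples.

    Assumption: every state other than `up_state` counts as down.
    An outage starts at the first non-running poll and stops at the
    next running poll. An outage still open at the end is not reported.
    """
    outages = []
    start = None
    for ts, state in sorted(samples):
        if state != up_state and start is None:
            start = ts  # first poll that saw the service not running
        elif state == up_state and start is not None:
            outages.append((start, ts, ts - start))
            start = None
    return outages


# Hypothetical polls, one per minute, starting at 09:00.
t0 = datetime(2015, 1, 1, 9, 0)
states = ["Running", "Stopping", "Stopped", "Stopped",
          "Starting", "Running", "Running"]
samples = [(t0 + i * POLL_INTERVAL, s) for i, s in enumerate(states)]

for start, stop, duration in find_outages(samples):
    print(start, stop, duration)
```

With 60-second polling, each start and stop time is only accurate to within one polling interval.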


Re: Downtime duration

SplunkTrust

Try this (add a rename at the end as needed):

your base search  | streamstats current=false last(State) as last_service_status last(_time) as time_of_down by Name,host | where State!=last_service_status AND NOT State="Down" | streamstats current=false last(_time) as time_of_up by Name,host  | where isnotnull(time_of_up) | eval duration=time_of_up - time_of_down | convert ctime(time_of_*) | table host, Name, time_of_*,duration

Re: Downtime duration

Motivator

I had to find out where it was failing, and it looks like it's at the | where segment of the query. If I remove a piece of it, I get results, but they are useless, as there is no duration:

sourcetype=WMI:Service Name=Email2db | streamstats current=false last(State) as last_service_status last(_time) as time_of_down by Name,host | where State!="Down" | streamstats current=false last(_time) as time_of_up by Name,host | where isnotnull(time_of_up) | eval duration=time_of_up - time_of_down | convert ctime(time_of_*) | table host, Name, time_of_*,duration


Re: Downtime duration

SplunkTrust

Could you post some sample events?
