Splunk Search

Downtime duration

tmarlette
Motivator

I am looking to build a Splunk query that analyzes historical instances of downtime.

I am doing this with a service state, which has four values: Starting, Running, Stopping, and Stopped.
I poll this service every 60 seconds for its status. I am looking for the duration of the 'down' instances, per day.

For example: the service is in a running state now, but it was down for 30 minutes at some earlier time.

I have a query in the works, but I'm not sure if it's the right fit:

sourcetype=WMI:Service Name=Email2db | streamstats current=false last(State) as last_service_status last(_time) as time_of_change by Name | where State!="last_service_status" | eval outage=now()-time_of_change | eval duration=strftime(outage, "%H:%M") | rename State as current_service_status | table time_of_change, Name, last_service_status, current_service_status, duration

All I'm looking to map is a 'state change' event and the duration that the service remained in that state. If the service went down twice, three hours apart, the suits would like to see those individual instances in something as simple as three fields. Those fields are:

Outage Start, Outage Stop, Duration.


somesoni2
Revered Legend

Try this (add rename at the end per your need)

your base search  | streamstats current=false last(State) as last_service_status last(_time) as time_of_down by Name,host | where State!=last_service_status AND NOT State="Down" | streamstats current=false last(_time) as time_of_up by Name,host  | where isnotnull(time_of_up) | eval duration=time_of_up - time_of_down | convert ctime(time_of_*) | table host, Name, time_of_*,duration

somesoni2
Revered Legend

Could you post some sample events?


tmarlette
Motivator

I had to find out where it was failing, and it looks like it is at the | where segment of the query. If I remove a piece of it, I get results, but they are useless, as there is no duration:

sourcetype=WMI:Service Name=Email2db | streamstats current=false last(State) as last_service_status last(_time) as time_of_down by Name,host | where State!="Down" | streamstats current=false last(_time) as time_of_up by Name,host | where isnotnull(time_of_up) | eval duration=time_of_up - time_of_down | convert ctime(time_of_*) | table host, Name, time_of_*,duration
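
For reference, here is a minimal sketch of another way to get the three requested fields. It is not taken from either query above and rests on a few assumptions: any State other than Running counts as down, events arrive roughly every 60 seconds as described in the question, and State values are spelled exactly as listed (the eval comparison is case-sensitive). The names is_down, prev_down, run_start, run_id, outage_start, last_down_poll and outage_stop are made up for illustration.

```
sourcetype=WMI:Service Name=Email2db
| sort 0 _time
| eval is_down=if(State=="Running", 0, 1)
| streamstats current=false last(is_down) as prev_down by host, Name
| eval run_start=if(isnull(prev_down) OR is_down!=prev_down, 1, 0)
| streamstats sum(run_start) as run_id by host, Name
| where is_down=1
| stats min(_time) as outage_start max(_time) as last_down_poll by host, Name, run_id
| eval outage_stop=last_down_poll + 60
| eval Duration=tostring(outage_stop - outage_start, "duration")
| convert ctime(outage_start) ctime(outage_stop)
| rename outage_start as "Outage Start" outage_stop as "Outage Stop"
| table host, Name, "Outage Start", "Outage Stop", Duration
```

The idea is to sort events oldest first, flag each poll as up or down, and start a new run_id whenever that flag flips, so every unbroken stretch of down polls becomes one row. Because the service is only polled once a minute, the stop time here is estimated as the last down poll plus 60 seconds, so each duration is accurate only to about one poll interval. An outage that is still in progress at the end of the search window, or one that spans midnight, would need extra handling.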
