How to calculate downtime based on the amount of requests an application server processes?

Norling80 · ‎06-01-2015

Hi guys. I want to calculate downtime based on the number of requests that an application server processes. The downtime is calculated according to the following rules.

1. Choose a time span from 30 minutes before to 30 minutes after the actual downtime.
2. Calculate the average number of events over the top 20 results, i.e. the 20 minutes with the most processed requests.
3. Classify as downtime every minute whose request count is at or below 80% of the average from step 2.

Below is an example of the result I want to calculate downtime on:

yannK · ‎06-13-2015

Here is my method: compute the 80th percentile of the per-minute request counts, then mark every minute as up or down by comparing its count against that value.

index=_internal source=*web* req_time=*
| bucket _time span=1m | stats count by _time
| eventstats perc80(count) AS maxperc80
| eval status=if(count < maxperc80, "down", "up")

You probably want to add some kind of count of consecutive "down" minutes and exclude the outliers.
Then sum up the "down" minutes:

| stats count by status
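
One way to count consecutive "down" minutes is sketched below. Treat it as an illustration rather than a tested search: the field names prev_status, changed and run_id and the 3-minute minimum are assumptions introduced here. It also swaps bucket/stats for timechart, so that minutes with no events at all show up with a count of 0 instead of being dropped.

```
index=_internal source=*web* req_time=*
| timechart span=1m count
| eventstats perc80(count) AS maxperc80
| eval status=if(count < maxperc80, "down", "up")
| streamstats current=f last(status) AS prev_status
| eval changed=if(status==prev_status, 0, 1)
| streamstats sum(changed) AS run_id
| stats min(_time) AS start count AS minutes by run_id status
| where status="down" AND minutes>=3
| eval start=strftime(start, "%Y-%m-%d %H:%M")
```

Each remaining row is one uninterrupted run of "down" minutes with its start time and length. Raising or lowering the minutes threshold is how short outliers would be excluded.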

Archana21 · ‎06-22-2015

... | sort 20 -count | stats avg(count)
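
A minimal sketch of steps 2 and 3 from the question, assuming the same placeholder base search as the answer above and a time range picked according to step 1. The field names rank, top20_count and top20_avg are made up for this example. Note that the threshold here is 80% of the top-20 average, which is not the same thing as the 80th percentile.

```
index=_internal source=*web* req_time=*
| timechart span=1m count
| sort 0 -count
| streamstats count AS rank
| eval top20_count=if(rank<=20, count, null())
| eventstats avg(top20_count) AS top20_avg
| eval status=if(count <= 0.8 * top20_avg, "down", "up")
| sort 0 _time
| stats count AS minutes by status
```

The eventstats step keeps every per-minute row while attaching the average of the 20 busiest minutes, so each minute can still be classified afterwards. Drop the final stats line to see the status of each minute instead of the totals.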

Norling80 · ‎06-22-2015

Hi, one more thing: how do we add step 2 above to the search, so that it takes the average of the top 20 results?

woodcock · ‎06-08-2015

I know this is not what you are asking but, based on your example which shows an obvious 100% (full vs. partial) outage, why would you not use something like this:

... | reverse | streamstats current=f latest(_time) AS prevEventTime latest(_raw) AS prevEvent | eval downtime = _time - prevEventTime | where downtime > 100
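
If the total matters more than the individual gaps, the same idea can be summarized as sketched here. This is an untested illustration: the field names prev_time, gap, outages and total_downtime_seconds are assumptions, and the 100-second threshold is simply carried over from the search above. The explicit sort makes the result independent of the order in which events are returned, at the cost of sorting the whole result set.

```
... | sort 0 _time
| streamstats current=f window=1 last(_time) AS prev_time
| eval gap = _time - prev_time
| where gap > 100
| stats count AS outages sum(gap) AS total_downtime_seconds
```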

Norling80 · ‎06-08-2015

Thanks for your input. I already have something similar in place; however, step 2 above is an important part of the search for calculating the downtime properly.

