Splunk Enterprise

How to calculate the availability of an API?

sunny_87
New Member

Hello Splunkees,

I have a requirement to calculate the availability (uptime percentage) of some critical APIs. We ingest the API logs into Splunk, and they tell us about throughput, latency, and HTTP status codes.

Is there a way to calculate the availability of an API using these metrics? I mean something like calculating the success and failure rates and then deriving a number that says how available my API is.

Does anyone have a basic query that can calculate that?

I have created something like the query below to calculate the success and failure rates:

index=myapp_prod sourcetype="service_log" MyCriticalAPI Status=200 
| timechart span=15m count as SuccessRequest 
| appendcols 
    [ search index=myapp_prod sourcetype="service_log" MyCriticalAPI NOT Status=200
    | timechart span=15m count as FailedRequest] 
| eval Total = SuccessRequest + FailedRequest
| eval successRate = round(((SuccessRequest/Total) * 100),2) 
| eval failureRate = round(((FailedRequest/Total) * 100),2)

richgalloway
SplunkTrust

There's a difference between "availability" and "success rate".  The sample logs appear to show only responses, from which we can calculate the success rate.  You've done that.

Availability, however, must also account for every request that a user sent and the server never received, and those requests leave nothing in the logs.  That includes times when the application was down, the network was too congested, and so on.  We can't compute that from the data we have now.

If the application logs its own state, then we can compute uptime from how long it spends in any state other than "Up" or "Ready".
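
As a rough sketch of that idea, suppose the application writes one event each time its state changes. The `app_state_log` sourcetype and the `state` field are hypothetical names, not something shown in this thread. Under that assumption, something like the following could estimate an uptime percentage:

```spl
index=myapp_prod sourcetype="app_state_log" state=*
| sort 0 _time
| streamstats current=f window=1 last(_time) as prev_time last(state) as prev_state
| where isnotnull(prev_state)
| eval duration = _time - prev_time
| stats sum(duration) as total_secs, sum(eval(if(prev_state="Up" OR prev_state="Ready", duration, 0))) as up_secs
| eval uptime_pct = round(up_secs * 100 / total_secs, 3)
```

Each interval between two consecutive state events is credited to the state that was in effect at its start. The time between the last state event and the end of the search window is not counted, so treat the result as an approximation.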

---
If this reply helps you, Karma would be appreciated.

sunny_87
New Member

Hi @richgalloway, thank you for the detailed response. That makes total sense, and it's why I wanted to ask the question.

I was thinking of doing something like this: calculate the success rate and failure rate, and if, say, the success rate for the last 15 minutes comes out at 99% and the failure rate at 1%, can we say that the service/API was up for 99% of the time? I know it's not exactly the uptime, because the service may be up but failing for some other reason, such as network issues.

Can we calculate something like this? Based on the success percentage, can we estimate its uptime? It's not an exact calculation, but it will at least give us something. We could also include average response time in the calculation.

What's your view?


richgalloway
SplunkTrust

The only assumption we can make about uptime based on success rate is that whenever a status code is returned, the application must be up.  The converse is not true: if no status code is returned, then we don't know whether the application is down or whether no request was received.
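
To illustrate that assumption, here is a minimal sketch that counts the one-minute buckets in which at least one status code was logged. The 1-minute span is an arbitrary choice, and the result field names are only illustrative.

```spl
index=myapp_prod sourcetype="service_log" MyCriticalAPI Status=*
| timechart span=1m count as Responses
| stats count as TotalMinutes, count(eval(Responses > 0)) as MinutesWithResponse
| eval ResponsePct = round(MinutesWithResponse * 100 / TotalMinutes, 2)
```

A minute with no responses may simply mean that nobody called the API, so ResponsePct is at best a lower bound on uptime at this granularity, not a measurement of it.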


sunny_87
New Member

Thanks @richgalloway. How do I proceed and get that single percentage, based on the assumption about status codes?

Here is what the status code values look like for the last 24 hours:

Values    Count     %
200       64,260    99.272%
400       471       0.728%

richgalloway
SplunkTrust

Here's one possible query.

index=myapp_prod sourcetype="service_log" MyCriticalAPI Status=*
| eventstats count as Total
| stats count as Count, values(Total) as Total by Status
| eval Pct = Count * 100 / Total
| table Status Count Pct
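
If a single number is wanted rather than one row per status code, a variant along these lines might work. It is only a sketch: it assumes that Status=200 is the sole definition of success, and the SuccessPct field name is just illustrative.

```spl
index=myapp_prod sourcetype="service_log" MyCriticalAPI Status=*
| stats count as Total, count(eval(Status=200)) as Success
| eval SuccessPct = round(Success * 100 / Total, 3)
```

With the 24-hour counts posted above (64,260 responses with status 200 out of 64,731 in total), this would give a SuccessPct of 99.272. That figure is a success rate for requests that reached the server, not a true availability number.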