Splunk AppDynamics

Calculate uptime/downtime per day

Maria_Garcia
Path Finder

How can I calculate uptime/downtime based on transaction response times?

If the interface is considered down when 5 consecutive requests are not answered within a total of 30 seconds, how do I create a rule with this condition?

The time part is clear, but what about the number of attempts? How do I specify that value in a rule condition?

Thanks 


millerep
Contributor

If I understand your question correctly, you're trying to set up a health rule that triggers if your agent doesn't report in 5 times in a row over a 30-second period. Since the agents only report in every minute, I don't think this will be possible unless AppD adds a "ping" feature that lets users set their own intervals (a feature I'd love to have too, for better Server HealthCheck Monitoring). Your best bet may be to set your health rule to look at the past 1 minute's worth of data and report 1 minute after that if the Agent Availability % drops below 1 or has no data. This won't give you a 30-second response time but a 120-second one, so you'll have to decide whether that extra 90 seconds is acceptable.


Maria_Garcia
Path Finder

Thanks Eric,

Sorry, let me explain better.

We have 3 business transactions that need to be executed in order to complete a login.

The requirement is:

- if there are 5 failed login attempts, the application is considered down

- an attempt is considered failed if the business transaction takes more than 30 seconds

I hope this clarifies the issue.
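As a minimal sketch of the requirement above (plain Python, not an AppDynamics feature or API), the check could look like this. It assumes the failed attempts must be consecutive, as in the original question, and that the duration of each login attempt in seconds is already available; `is_down` and the constant names are hypothetical.

```python
from typing import Iterable

FAILURE_THRESHOLD_S = 30.0   # an attempt slower than this counts as failed
FAILURES_FOR_DOWNTIME = 5    # this many failed attempts in a row means downtime


def is_down(durations_s: Iterable[float]) -> bool:
    """Return True if the sequence contains 5 consecutive failed attempts."""
    streak = 0
    for duration in durations_s:
        # A fast attempt resets the streak; a slow one extends it.
        streak = streak + 1 if duration > FAILURE_THRESHOLD_S else 0
        if streak >= FAILURES_FOR_DOWNTIME:
            return True
    return False
```

For example, `is_down([31, 35, 40, 33, 32])` is True, while a single fast attempt in the middle of the sequence resets the count.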

Thanks


Anand_Yadav
Explorer

Question: We have 3 business transactions that need to be executed in order to complete a login.

The requirement is:

- if there are 5 failed login attempts, the application is considered down

- an attempt is considered failed if the business transaction takes more than 30 seconds

Answer: I am assuming that you need this from real user monitoring, not from synthetic monitoring (it is easily available from synthetic).

You can define a threshold of 30 secs at the application level for slow or very slow transactions.

Now group all three transactions as "Login" and create a dashboard for the group. Select the "metric" widget, select the particular application as the data source, then select the metric category "custom".

Now select the metric expression option from the "Select a Metric" field.

Now select two metrics from the Login business transaction group, as follows:
V2: slow calls metric - Value
V1: Average Response Time (ms) - Count

Use the formula below; it will give you the Availability % for the Login BT group:
 100 - (100*({v2}/({v1}+{v2})))

Thanks

Maria_Garcia
Path Finder

Hi,

Sorry, but I do not understand why V2 (slow calls metric - Value) is added to V1 (Average Response Time (ms) - Count).

You would be adding the number of slow calls to the number of times the agent collected the metric. But those slow calls should already be included in the V1 count... am I right?
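A small worked example may make the double counting concrete. This is a sketch in plain Python with made-up numbers; the function names are hypothetical and nothing here calls AppDynamics. It assumes V1 already counts every call, including the slow ones.

```python
def availability_pct(slow_calls: int, total_calls: int) -> float:
    """Availability % when total_calls already includes the slow calls."""
    if total_calls <= 0:
        raise ValueError("total_calls must be positive")
    return 100 - (100 * (slow_calls / total_calls))


def availability_pct_double_counted(slow_calls: int, total_calls: int) -> float:
    """Same formula with slow calls added to the denominator a second time."""
    if total_calls <= 0:
        raise ValueError("total_calls must be positive")
    return 100 - (100 * (slow_calls / (total_calls + slow_calls)))


# Made-up numbers: 200 calls in total, 50 of them slow.
plain = availability_pct(50, 200)
inflated = availability_pct_double_counted(50, 200)
```

With these numbers the first formula gives 75% and the second about 80%, so adding V2 to the denominator overstates availability.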

Thanks 


Anand_Yadav
Explorer

Correct, my bad.

thanks for correcting it 🙂


Maria_Garcia
Path Finder

Thanks,

The information was very useful, but it is not exactly what we need.

We have 3 transactions that make up the Login action. What we need to calculate is the uptime of this service.

Downtime is declared if the following condition occurs more than 5 times:

- the sum of the response times of those 3 transactions is higher than 30 seconds

Could you please help us with this?

Thanks

Anand_Yadav
Explorer

The formula would be:

100 - (100*({v2}/{v1}))

where V2 is the "very slow calls" value and V1 is the total number of calls (the Average Response Time count, or Sum(Calls per Minute)).

Question: the sum of the response time of those 3 transactions is higher than 30 secs.

Answer: As I mentioned earlier, create a business transaction group, then create a dashboard and a health rule for it.

Select the BT group in both the health rule and the dashboard. You can use this formula in the dashboard to get the Availability %: 100 - (100*({v2}/{v1}))

Then create a health rule for the BT group on Sum(Avg Response Time) > 30 secs, and select 5 out of 10 mins (or whatever you want) in the health rule.
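To show what a "5 out of 10 mins" condition means, here is a rough Python sketch. It is not how the Controller evaluates health rules internally; it only assumes you have one summed response time value per minute, in milliseconds, for the BT group. `rule_violated` and its parameters are hypothetical names.

```python
from typing import Sequence


def rule_violated(
    per_minute_sum_ms: Sequence[float],
    threshold_ms: float = 30_000,   # 30 secs expressed in milliseconds
    window: int = 10,               # look at the last 10 one-minute samples
    min_violations: int = 5,        # trigger when at least 5 exceed the threshold
) -> bool:
    """Return True if enough of the most recent samples exceed the threshold."""
    recent = list(per_minute_sum_ms)[-window:]
    violations = sum(1 for value in recent if value > threshold_ms)
    return violations >= min_violations
```

Note that the violating minutes do not have to be consecutive in this kind of condition, which differs from the "5 consecutive requests" wording of the original question.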

If you want to discuss more, please call me.

Thanks

+91 9540642389

^ Edited by @Ryan.Paredez Be sure to read the entire thread for more context about the problem and solution. 


Maria_Garcia
Path Finder
Thanks a lot. Just to confirm: is the response time metric displayed for a group the sum of all the response times, or is it an average? Thanks



Anand_Yadav
Explorer

Yes. It will be the sum of all three average response times.

You can also validate it by adding them manually for a given time interval.

Sum: the total accumulated value for the metric over the selected time period.

Thanks


Maria_Garcia
Path Finder

Thanks,

Then we should obtain the same result whichever of the following we use to create a rule (given that the service is composed of 3 business transactions):

A) metric expression

{value(average response time BT1)}+{value(average response time BT2)}+{value(average response time BT3)}

B) group metrics (including the 3 BTs in a group)

value(average response time group)

Am I right?
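Whether A and B agree depends on how the group value is aggregated, which is worth validating manually against the Controller. The Python sketch below uses made-up per-BT averages and hypothetical names to show why it matters: a sum and a mean of the same three values can land on opposite sides of the 30-second threshold.

```python
from statistics import mean

THRESHOLD_MS = 30_000  # 30 seconds in milliseconds

# Made-up average response times (ms) for the three business transactions.
bt_avg_ms = {"BT1": 12_000, "BT2": 9_000, "BT3": 11_000}


def summed_response_ms(avgs: dict) -> float:
    """Option A: add the three per-BT averages together."""
    return sum(avgs.values())


def mean_response_ms(avgs: dict) -> float:
    """What a group metric would show if it averaged instead of summed."""
    return mean(avgs.values())


sum_breaches = summed_response_ms(bt_avg_ms) > THRESHOLD_MS   # sum is 32000 ms
mean_breaches = mean_response_ms(bt_avg_ms) > THRESHOLD_MS    # mean is well under 30000 ms
```

With these numbers the summed value breaches the threshold and the mean does not, so a rule built on B only matches A if the group value really is a sum.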

Thanks


Michael_Batchel
New Member

Regarding the daily uptime: I am interested in creating a report that spans 30 days, but I need the report to show the uptime for each day, for example:

1st  100% uptime

2nd 100% uptime

3rd 99% uptime

etc.

Then maybe an average uptime at the end of the month, if possible.

I need this for a certain URL. I am sure it starts with a dashboard; I just need a little guidance.
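If the raw availability samples for the URL can be exported, the per-day breakdown could be computed outside the dashboard. The following is a minimal Python sketch under that assumption: it takes hypothetical (timestamp, is_up) samples and does not use any AppDynamics API; `daily_uptime` and `monthly_average` are illustrative names.

```python
from collections import defaultdict
from datetime import date, datetime
from typing import Dict, Iterable, Tuple


def daily_uptime(samples: Iterable[Tuple[datetime, bool]]) -> Dict[date, float]:
    """Return the uptime percentage per calendar day, sorted by day."""
    up = defaultdict(int)
    total = defaultdict(int)
    for timestamp, is_up in samples:
        day = timestamp.date()
        total[day] += 1
        up[day] += int(is_up)
    return {day: 100.0 * up[day] / total[day] for day in sorted(total)}


def monthly_average(daily: Dict[date, float]) -> float:
    """Unweighted average of the daily percentages."""
    if not daily:
        raise ValueError("no daily values to average")
    return sum(daily.values()) / len(daily)
```

The monthly figure here is a plain average of the daily percentages; if days have different sample counts, a weighted average over all samples may be the better choice.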


Maria_Garcia
Path Finder

Hi,

Please kindly confirm how to proceed.

Thanks


Maria_Garcia
Path Finder

Which would then be the right expression to use?

 100 - (100*({v2}/{v1}))


Maria_Garcia
Path Finder

Thanks Anand,

We cannot use real user monitoring; it is not available to us.

Regarding the formula, could you please clarify it?

 V2 slow calls metric - Value

 V1 Average Response Time (ms) - Count


millerep
Contributor

Ahh, that's a bit more complicated. I'd have to think about the first one, but off the top of my head you may be able to open your Application in the Controller GUI, go to "Configuration" in the left-hand menu, click "Instrumentation", then "Error Detection", and maybe define some sort of error using a redirect or an HTTP code (maybe 405?). I'm not sure how your app is designed or how your "login" call works, though, so I'd have to play with it to see. (Error detection documentation can be found here: https://docs.appdynamics.com/display/PRO45/Error+Detection)

As for your second issue, you can do that fairly easily by creating a Health Rule for that application using the "Business Transaction Performance (load, response time, slow calls, etc)" Health Rule Type, selecting the exact Business Transaction you want, and creating a "Critical Condition" on response time > 30 seconds (that is, 30000 if the metric is Average Response Time (ms)). Then just create a Policy and Action to trigger on that rule. Also, depending on how your application is set up, you may be able to do a 2-for-1 by adding multiple conditions to the Health Rule, set to use an "Any" condition, to trigger off your 5 failed login attempts too, but that will probably use an "errors per minute" metric instead.
