AppD Archive

Error detected - how to get the call tree

CommunityUser
Splunk Employee

Hello,

I've captured one error in a Business Transaction, and AppDynamics Pro showed me the exact call and the error message. However, I can't find a way to get the full execution tree of the erroneous transaction.

I thought that when errors occur the transaction is captured, or did I misunderstand that?

Attached is a screenshot of the error transaction.

Reinhard

CommunityUser
Splunk Employee

Any Ideas?

Arun_Dasetty
Super Champion

Hi Reinhard,

Thanks for writing in. You should see the error stack trace on the "Error details" screen for the error. Regarding the missing call graph section: this can happen when the collected snapshot is a partial snapshot. The agent does not collect a call graph for every error snapshot, to avoid performance overhead, but if the error repeats a few times you should see some snapshots for the same error that do include a call graph.

We request you to drill down, as shown in the attached screenshots, on the Troubleshoot -> Errors screen: search for the error transaction by its error stack trace (see step1.png and step2.png) and double-click an "Error transaction snapshot" that has the "full or partial" call graph symbol highlighted on this screen.

Refer to the docs: http://docs.appdynamics.com/display/PRO13S/Troubleshoot+Errors

Please let us know whether drilling down from the Troubleshoot -> Errors dashboard helps you find error snapshots with call graphs.

Thanks,

Arun

CommunityUser
Splunk Employee

Thanks for the answer!

In that case I only had 3 errors detected, and unfortunately none of them had a stack trace captured.

So I have to rely on a large number of errors occurring in order to get full details?

Reinhard

Arun_Dasetty
Super Champion

Hi Reinhard,

Thanks for posting back. We understand your concern, and yes, your understanding is right: the agent ships with optimized settings, so it only collects additional data when there is a high rate of errors or slow requests in the JVM/server.
Let us know if the following suggestions help:
1) By default the agent starts a diagnostic session (snapshots collected during a diagnostic session have full call graphs) when more than 10% of requests are slow or in error.
  - You can try decreasing that value from 10% to something lower for the BT that has the issue, or at the global level, as shown in the screenshot (a sketch of the trigger logic follows this list). Please be aware that this adds overhead when there is a high rate of slow/error requests; we suggest it here because there is not much load in your case.

2) Enable the "aggressive snapshot collection" checkbox under the Configure -> Instrumentation -> Call Graph Settings screen, as shown in the attached screenshot.
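To make the trade-off in option 1 concrete, here is a minimal Java sketch of the idea only, not AppDynamics agent code: a diagnostic session (with full call graphs) starts once the share of slow or error requests crosses the configured percentage threshold, 10% by default. All class and method names here are hypothetical.

// Hypothetical illustration only; the real agent logic and names differ.
public class DiagnosticSessionTrigger {

    // Start a diagnostic session when more than this percentage of requests
    // in the evaluation window are slow or in error (controller default: 10%).
    private final double thresholdPercent;

    public DiagnosticSessionTrigger(double thresholdPercent) {
        this.thresholdPercent = thresholdPercent;
    }

    /** Returns true when full-call-graph snapshots should be collected. */
    public boolean shouldStartDiagnosticSession(long totalRequests, long slowOrErrorRequests) {
        if (totalRequests == 0) {
            return false;
        }
        double percent = 100.0 * slowOrErrorRequests / totalRequests;
        return percent > thresholdPercent;
    }

    public static void main(String[] args) {
        // Default 10% threshold vs. a lowered 5% threshold for a low-traffic BT.
        DiagnosticSessionTrigger defaults = new DiagnosticSessionTrigger(10.0);
        DiagnosticSessionTrigger lowered = new DiagnosticSessionTrigger(5.0);

        // 3 slow/error requests out of 40 = 7.5% -> only the lowered threshold triggers.
        System.out.println(defaults.shouldStartDiagnosticSession(40, 3)); // false
        System.out.println(lowered.shouldStartDiagnosticSession(40, 3));  // true
    }
}

Lowering the threshold makes the agent collect full call graphs sooner, at the cost of extra overhead when slow/error traffic is high, which is the trade-off discussed in the rest of this thread.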

Thanks,

Arun

CommunityUser
Splunk Employee

Arun,

these options seem counterproductive to me...

Say I have an application running under higher load and suddenly there are errors. Both of these options would add overhead in order to capture the errors reliably (and all of them). So if I run in production and AppD then detects errors, with these options set it will automatically slow the application down by adding overhead just to capture the errors?

That doesn't seem very reliable for diagnosing a high-load production environment.

Are there any whitepapers on how other customers handle this? Full visibility while still maintaining low overhead?

Reinhard

Arun_Dasetty
Super Champion

Hi Reinhard,

Thanks for posting back; we understand your concern. To confirm: you do not need to enable the aggressive snapshot collection checkbox or lower the diagnostic threshold for errors (default value 10%) that we suggested earlier. We suggested those options because the load in your case is very low, and in such cases the overhead is not significant.

To address the scenario where there is a high rate of errors: you should see error snapshots with full stack traces collected on the Troubleshoot -> Errors screen, in the Exceptions and Error Transactions sections.

Our agent and controller default settings are configured with production environments in mind, which is why the aggressive settings are disabled by default.

We see customers use health rules on the error count metric to get emails/alerts when a health rule is violated; let us know if that helps:

http://docs.appdynamics.com/display/PRO13S/Configure+Health+Rules

http://docs.appdynamics.com/display/PRO13S/Configure+Email+Digests

Example: an error health rule with a critical/warning condition such as errors per minute > 1 for the last 5 minutes; if errors occur, you should get an alert.
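For reference, and only as a rough sketch, the same errors-per-minute signal that such a health rule evaluates can also be read from the controller's metric-data REST endpoint. This is a read-only call against the controller, so it adds no agent-side overhead. Check the REST API documentation for your controller version; the host, credentials, application, tier, and BT names below are all placeholders.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class ErrorsPerMinuteCheck {

    public static void main(String[] args) throws Exception {
        // Placeholder controller host, application and metric path; adjust to your setup.
        String controller = "http://controller.example.com:8090";
        String application = "MyApp";
        String metricPath = "Business Transaction Performance|Business Transactions|MyTier|MyBT|Errors per Minute";

        // Metric data for the last 5 minutes, returned as JSON.
        String url = controller + "/controller/rest/applications/"
                + URLEncoder.encode(application, StandardCharsets.UTF_8.name())
                + "/metric-data?metric-path=" + URLEncoder.encode(metricPath, StandardCharsets.UTF_8.name())
                + "&time-range-type=BEFORE_NOW&duration-in-mins=5&output=JSON";

        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        // Basic auth, typically in the form user@account:password (placeholder credentials).
        String credentials = Base64.getEncoder()
                .encodeToString("user@customer1:password".getBytes(StandardCharsets.UTF_8));
        conn.setRequestProperty("Authorization", "Basic " + credentials);

        try (BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()))) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line); // raw JSON containing the per-minute error counts
            }
        }
    }
}

In practice you would configure the health rule and email digest in the controller UI as described in the docs above; a call like this is just a quick way to confirm that error metrics are being reported at all.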

 
We also request you to confirm whether a diagnostic session is triggered with the default settings (the snapshots collected during a diagnostic session contain full call graphs). A diagnostic session triggers even at a low absolute error rate in a scenario like the following:
- say servlet1 has had 4 calls per minute for the past 2 hours, with the default diagnostic session settings

- say that due to some issue the error count rises to 1 call per minute; then you should see a diagnostic session triggered, with error snapshots collected with full stack traces (the worked numbers follow below). Let us know if you see any issues.
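Spelling out the arithmetic for this hypothetical scenario: 1 erroring call per minute out of 4 calls per minute is a 25% error rate, well above the default 10% trigger, so a diagnostic session should start even though the absolute number of errors is small. A tiny self-contained check, with placeholder numbers taken from the scenario above:

public class Servlet1ScenarioCheck {
    public static void main(String[] args) {
        // Placeholder numbers from the scenario: 4 calls/min, 1 of them in error.
        double callsPerMinute = 4.0;
        double errorsPerMinute = 1.0;
        double errorPercent = 100.0 * errorsPerMinute / callsPerMinute; // 25.0

        double defaultTriggerPercent = 10.0; // default diagnostic session threshold
        System.out.printf("error rate %.1f%% > %.1f%% threshold: %b%n",
                errorPercent, defaultTriggerPercent, errorPercent > defaultTriggerPercent);
        // Prints: error rate 25.0% > 10.0% threshold: true
    }
}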

Please let us know if that clarifies your query. We also request you to check for any error snapshots collected in the Error Transactions section of the Errors screen.

Thanks,

Arun
