Hello,
I've captured one error in a Business Transaction, and AppDynamics Pro showed me the exact call and also the error message. However, I can't find a way to get the full execution tree of the erroneous transaction.
I thought that if there are errors the transaction is captured, or did I misunderstand that?
Attached a screenshot of the error transaction.
Reinhard
Any ideas?
Hi Reinhard,
Thanks for writing in. You should see the error stack trace on the "Error details" screen for the error. As for the missing call graph section, this can happen when the collected snapshot is a partial snapshot. To avoid performance overhead, the agent does not collect a call graph for every error snapshot. However, if the error repeated a few times, you should see a few snapshots with a call graph collected for the same error.
Please drill down as shown in the attached screenshots under the Troubleshoot -> Errors screen: search for the error transaction by error stack trace as shown in step1.png and step2.png, then double-click an "Error transaction snapshot" that has the "full or partial" call graph symbol highlighted on this screen.
Refer to the docs: http://docs.appdynamics.com/display/PRO13S/Troubleshoot+Errors
Please let us know whether drilling down from the Troubleshoot -> Errors dashboard helps you find error snapshots with call graphs.
Thanks,
Arun
Thanks for the answer!
In that case I only had 3 errors detected, and unfortunately none of those had a stack trace captured.
So I must rely on lots of errors happening in order to get full details?
Reinhard
Hi Reinhard,
Thanks for posting back. We understand your concern, and yes, your understanding is right. The agent ships with optimized settings, so it collects additional data only when there is a high rate of errors or slow requests in the JVM/server.
Let us know if the following suggestions help:
1) By default, the agent starts a diagnostic session (snapshots collected during a diagnostic session have full call graphs) if more than 10% of requests are slow or in error.
- You can try decreasing the value from 10% to a lower value, either for the BT that has the issue or at the global level, as shown in the screenshot. Please be aware that this adds overhead if there is a high rate of slow/error requests; we suggest this option because there is not much load in your case.
2) Enable the aggressive snapshot collection checkbox under Configure -> Instrumentation -> Call Graph Settings, as shown in the attached screenshot.
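To make the default threshold in 1) concrete, here is a minimal Python sketch of the decision as described in this thread. It is not agent code: the function name `should_start_diagnostic_session`, its parameters, and the way requests are counted are assumptions for illustration. Only the 10% default comes from the description above.

```python
def should_start_diagnostic_session(total_requests, slow_or_error_requests,
                                    threshold_percent=10.0):
    """Return True when the share of slow/error requests exceeds the threshold.

    Hypothetical helper: the real agent's counting window and logic are not
    described in this thread; only the 10% default is taken from it.
    """
    if total_requests <= 0:
        return False  # no traffic, nothing to diagnose
    percent = 100.0 * slow_or_error_requests / total_requests
    return percent > threshold_percent


# 3 errors out of 100 requests is 3%, below the default threshold
print(should_start_diagnostic_session(100, 3))
# the same traffic with the threshold lowered to 2% would start a session
print(should_start_diagnostic_session(100, 3, threshold_percent=2.0))
```

The second call shows why lowering the value helps at low load: a handful of errors is enough to cross a smaller threshold, at the cost of more sessions when error rates are high.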
Thanks,
Arun
Arun,
these options seem counterproductive to me...
Say I have an application that runs with higher load and suddenly there are errors. Both of these options would cause overhead in order to capture the errors safely (and all of them). So if I run in production and AppD detects errors, it will, with these options set, automatically slow down by adding overhead just to capture the errors?
That doesn't seem very reliable for diagnosis in a high-load prod environment.
Are there any whitepapers on how other customers handle this? Full visibility while still maintaining low overhead?
Reinhard
Hi Reinhard,
Thanks for posting back. We understand your concern. To confirm, you do not need to enable the aggressive snapshot collection checkbox or lower the diagnostic session setting for errors (default value 10%). We suggested those earlier only because the load in your case is very low, and in such cases the overhead is not considerable.
For the scenario where there is a high rate of errors, you should see error snapshots with full stack traces collected on the Troubleshoot -> Errors screen, under the Exceptions and Error Transactions sections.
Our agent and controller default settings are configured with production environments in mind, hence the aggressive settings are disabled by default.
We see customers use health rules on the error count metric to get email alerts when a health rule is violated. Let us know if that helps:
http://docs.appdynamics.com/display/PRO13S/Configure+Health+Rules
http://docs.appdynamics.com/display/PRO13S/Configure+Email+Digests
Ex: an error health rule with a critical/warning condition such as errors/min > 1 for the last 5 minutes; if there are errors, you should get an alert.
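As a rough illustration of how such a condition could be evaluated, here is a small Python sketch. It is not the controller's implementation: the class `ErrorRateRule` and its methods are hypothetical, and treating "errors/min > 1 for the last 5 minutes" as an average over the window is an assumption. The real health rule evaluation may differ.

```python
from collections import deque


class ErrorRateRule:
    """Hypothetical stand-in for a health rule on errors per minute."""

    def __init__(self, threshold_per_min=1.0, window_minutes=5):
        self.threshold_per_min = threshold_per_min
        # keep only the most recent `window_minutes` per-minute error counts
        self.counts = deque(maxlen=window_minutes)

    def record_minute(self, error_count):
        """Store the number of errors observed in one minute."""
        self.counts.append(error_count)

    def is_violated(self):
        """True when the average errors/min in the window exceeds the threshold."""
        if not self.counts:
            return False
        average = sum(self.counts) / len(self.counts)
        return average > self.threshold_per_min


rule = ErrorRateRule()
for errors in [0, 0, 3, 4, 2]:  # errors seen in each of the last 5 minutes
    rule.record_minute(errors)
print(rule.is_violated())  # average is 9 / 5 = 1.8 errors/min
```

In this sketch the alert would fire on the average rate alone, regardless of load, which is why a rule like this can notify you about errors without changing snapshot collection settings.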
We also request that you confirm whether a diagnostic session is triggered with the default settings (snapshots collected in a diagnostic session contain full call graphs). A diagnostic session should trigger even at a low error rate in a scenario like the following:
- Say servlet1 has had 4 calls per minute for the past 2 hours, with the default diagnostic session settings.
- Say that, due to some issue, the error count rises to 1 call/min. Then you should see a diagnostic session triggered, with error snapshots collected with full stack traces. Let us know if you see any issues.
Please let us know if that clarifies your query. We also request that you check for any error snapshots collected under the Error Transactions section of the Errors screen.
Thanks,
Arun