
number of errors per min

CommunityUser
Splunk Employee

Hi,

I've got the issue that the number of errors per minute at the application level is much lower than at the tier level. Shouldn't it be higher?

For example, the number of errors per minute for the application is 35, and for one specific tier it is 155. How can that be? Which errors are counted at the application level and which at the tier level?

Thanks for your help.

Regards 

Wasim


CommunityUser
Splunk Employee

Errors per minute is an average. Other tiers in your application probably have lower error counts, so when the value is averaged across the tiers, the result is a lower number.

Errors are counted the same across the application. Error detection is described here: https://docs.appdynamics.com/display/PRO40/Configure+Error+Detection


CommunityUser
Splunk Employee

Thank you for your answer. Are all the values for errors per minute (Obs, Sum, Count, and so on) at the application level averages?

And is the errors per minute value at the tier level also an average of the values from the underlying nodes?

Regards,

Wasim


CommunityUser
Splunk Employee

Hi, it depends on the metric. Some are averages, some are sums, and so on. Metrics are described here: https://docs.appdynamics.com/display/PRO40/Metrics+Reference

For example, quoting from that page specifically about what you can see in the Metric Browser:

"For most types of metrics in the browser, you can click any of the points in the graph to view more information about the metric observed at that point in time. The information shown includes the metric identifier, date and time of the observation, along with any of the following values relevant to the metric:

  • Obs (observed value): the average of all data points seen for that interval. For the Percentile Metric for the App Agent for Java, this is the percentile value. For a cluster or a time rollup, this represents the weighted average across nodes or over time. 
  • Min: the minimum data point value seen for that interval
  • Max: the maximum data point value seen for that interval
  • Sum: the sum of all data point values seen for that interval. For the Percentile Metric for the App Agent for Java, this is the product of the percentile value multiplied by the Count.
  • Count: Number of times different the metric was reported."
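
To make the quoted definitions concrete, here is a minimal Python sketch of how the five values relate to the raw data points in one interval, and how a rollup across nodes might combine them. It is not AppDynamics code. The names `MetricSummary`, `summarize`, and `rollup` are hypothetical, and weighting the rolled-up Obs by Count is an assumption about what "weighted average" means in the quote.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class MetricSummary:
    """Hypothetical container for the values shown for one graph point."""
    obs: float      # average of the data points in the interval
    minimum: float  # smallest data point
    maximum: float  # largest data point
    total: float    # "Sum" in the Metric Browser
    count: int      # number of data points reported


def summarize(points: Sequence[float]) -> MetricSummary:
    """Summarize the raw data points seen in a single interval."""
    if not points:
        raise ValueError("at least one data point is required")
    total = sum(points)
    return MetricSummary(
        obs=total / len(points),
        minimum=min(points),
        maximum=max(points),
        total=total,
        count=len(points),
    )


def rollup(summaries: Sequence[MetricSummary]) -> MetricSummary:
    """Combine per-node (or per-minute) summaries into one.

    Assumption: Obs is weighted by Count, so a node that reported
    more data points contributes more to the combined average.
    """
    if not summaries:
        raise ValueError("at least one summary is required")
    total = sum(s.total for s in summaries)
    count = sum(s.count for s in summaries)
    return MetricSummary(
        obs=total / count,
        minimum=min(s.minimum for s in summaries),
        maximum=max(s.maximum for s in summaries),
        total=total,
        count=count,
    )


node_a = summarize([10, 10, 10])  # three data points of 10
node_b = summarize([30])          # one data point of 30
tier = rollup([node_a, node_b])
print(tier.obs, tier.total, tier.count)
```

Under these assumptions the rolled-up Obs is 15 rather than the plain mean of 10 and 30, because node A reported three data points and node B only one, while Sum (60) and Count (4) simply add up.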

Values also depend on the time range you are using. This is described here: https://docs.appdynamics.com/display/PRO40/Metric+Data+Display+in+Graphs


CommunityUser
Splunk Employee

Hi, thanks for your reply. The question I am trying to answer is a different one. Suppose I have one application consisting of multiple tiers, each with two nodes. Is the errors per minute metric at the tier level an average of the values from the underlying nodes? And is the errors per minute metric at the application level the average of the values from the underlying tiers?

For example, if node A has 10 errors per minute and node B has 30 errors per minute, would the errors per minute value for the associated tier be 20? Is it the same at the application level? And does this scheme work for all the metrics in the Metric Browser, such as number of slow calls?

That was about the value at exactly one point in time. If I have a custom dashboard with a metric field showing the Sum value of errors per minute at the application level for a time range of 15 minutes, is this value the average of all the Sum values (in the same time range) from the underlying tiers?

Regards

Wasim


Arun_Dasetty
Super Champion

Hi Wasim,

 

Your observation is right. There are two different cases here. First, errors per minute at the BT level will not match the tier level, because errors such as HTTP 500 server errors are not part of a BT but are counted for the tier or node. Second, errors can be raised in continuing tiers, and the error count at continuing tiers does not add up into the app-level errors per minute. This is standard behavior across systems and is by design. In a similar case we see 132 errors at the continuing tier but 45 errors at the app level, which matches the originating tier (the tier that has the BTs).

We suspect this could be the scenario in your case. Refer to the screenshots below for a similar case and check whether they answer your question about app-level and tier-level aggregation of metrics. In the case below, the ecommerce tier is the originating tier, which makes JMS (async) calls to the other tier (inventory/order server). Downstream tier errors are not added to the app level because the app-level metrics represent the BT counts (errors, ART, calls per minute) for the business transactions that were discovered, and BTs are not discovered for the downstream tier as part of the correlation feature.

image.png

image.png
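
The second case can be sketched in a few lines of Python. This is only a simplified model of the behavior described above, not how the Controller actually computes the metric. The `Tier` class, the `originates_bts` flag, and the tier name "downstream" are hypothetical, and the numbers are the 45 and 132 errors mentioned above.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Tier:
    name: str
    errors_per_min: int
    originates_bts: bool  # hypothetical flag: does this tier originate the BTs?


def app_level_errors(tiers: Iterable[Tier]) -> int:
    """Count only errors from tiers that originate business transactions.

    Assumption: errors raised in continuing (downstream) tiers stay
    visible at the tier level but are not added to the app-level value.
    """
    return sum(t.errors_per_min for t in tiers if t.originates_bts)


tiers = [
    Tier("ecommerce", errors_per_min=45, originates_bts=True),
    Tier("downstream", errors_per_min=132, originates_bts=False),
]

print(app_level_errors(tiers))                # app-level value
print(max(t.errors_per_min for t in tiers))   # highest tier-level value
```

In this model the app-level value is 45 while one tier shows 132, which is the same pattern as an application reporting fewer errors per minute than one of its tiers.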

Regards,

Arun
