How do I troubleshoot the Events Service?

Georgiy_Chigric

For Internal AppDynamics Audiences

What methods can I use to iterate Events Service settings?

 

Is the Events Service dropping events?

If the Events Service is dropping events, determine why:

  • Is Events Service CPU-bound, memory-bound, I/O-bound, or some combination of these?
  • Under what loads is the Events Service losing events?

    Events Service is losing events...   Then...
    ----------------------------------   -------------------------------------------------------
    Essentially all the time             The T-shirt size chosen is clearly too small
    During peak load times               Determine whether or not losing some events is tolerable

Sometimes the analytical value of the aggregate of events matters more than any single event. In these cases, dropping some events may be fine.
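
One way to tell the two cases in the table apart is to compare, per time window, how many events were sent against how many were stored. The sketch below is only an illustration: the `Window` record, its fields, and the `always_fraction` cutoff are hypothetical, and how you obtain the counts depends on your deployment.

```python
from dataclasses import dataclass


@dataclass
class Window:
    received: int  # events sent to the Events Service in this window (assumed input)
    stored: int    # events actually persisted in this window (assumed input)
    peak: bool     # True if the window falls in a known peak-load period


def classify_drops(windows, always_fraction=0.9):
    """Return 'none', 'always', 'peak_only', or 'intermittent'.

    always_fraction is a placeholder cutoff for "essentially all the time".
    """
    dropping = [w for w in windows if w.stored < w.received]
    if not dropping:
        return "none"
    if len(dropping) / len(windows) >= always_fraction:
        return "always"
    if all(w.peak for w in dropping):
        return "peak_only"
    return "intermittent"


def drop_rate(windows):
    """Overall fraction of received events that were not stored."""
    received = sum(w.received for w in windows)
    if received == 0:
        return 0.0
    dropped = sum(max(w.received - w.stored, 0) for w in windows)
    return dropped / received
```

A result of 'always' points at a T-shirt size that is too small. A result of 'peak_only' together with a low `drop_rate` is the case where you decide whether the loss is tolerable.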

How can I use the KPIs to troubleshoot?

What are the KPIs telling you? Review them to see:
  • Is CPU utilization frequently high?
  • Is memory usage too high?
  • Is garbage collection happening too frequently?
  • Is I/O inadequate?
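
The checklist above can be sketched as a check over sampled KPI values. Everything named here is an assumption for illustration: the KPI keys, the limits, and the `sustained_fraction` cutoff are placeholders to replace with values that suit your deployment, not AppDynamics recommendations.

```python
# Placeholder limits; tune these for your own environment.
EXAMPLE_LIMITS = {
    "cpu_pct": 85,      # CPU utilization, percent
    "heap_pct": 80,     # memory (heap) usage, percent
    "gc_per_min": 10,   # garbage collections per minute
    "io_wait_pct": 20,  # time spent waiting on I/O, percent
}


def find_bottlenecks(samples, limits=EXAMPLE_LIMITS, sustained_fraction=0.5):
    """Return the KPIs that exceed their limit in at least
    sustained_fraction of the samples that report them."""
    bottlenecks = []
    for kpi, limit in limits.items():
        values = [s[kpi] for s in samples if kpi in s]
        if not values:
            continue  # no data for this KPI, so no verdict
        over = sum(1 for v in values if v > limit)
        if over / len(values) >= sustained_fraction:
            bottlenecks.append(kpi)
    return bottlenecks
```

Requiring a sustained breach, rather than a single high sample, keeps a brief spike from being read as a CPU-bound or memory-bound service.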

What is the best scaling response to a deficiency?

Is the best response to a deficiency to scale vertically or to scale horizontally? If you are CPU-bound, can you simply add CPU capacity to the existing nodes? If you cannot, you can scale horizontally by adding more nodes.

This sort of reasoning applies to deficiencies in any KPI or criterion.

The answer is not always to scale up—you may discover that you are over-provisioned. In that case, you can scale down.
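
The reasoning in this section can be written down as a small decision function. This is a minimal sketch with hypothetical names and wording. It assumes you have already identified which KPIs are deficient and whether the existing nodes can be resized.

```python
def scaling_response(bottlenecks, over_provisioned=False, can_scale_vertically=True):
    """Suggest a scaling action.

    bottlenecks: names of deficient KPIs (hypothetical labels such as "cpu").
    over_provisioned: True if the KPIs show sustained headroom.
    can_scale_vertically: True if the existing nodes can be given more resources.
    """
    if bottlenecks:
        resources = ", ".join(bottlenecks)
        if can_scale_vertically:
            return f"scale up: add capacity for {resources} on existing nodes"
        return f"scale out: add nodes to relieve {resources}"
    if over_provisioned:
        return "scale down: reduce node size or node count"
    return "no change"
```

For example, `scaling_response(["cpu"], can_scale_vertically=False)` suggests scaling out, which matches the CPU-bound case described above.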

What changes are happening over time?

Sizing must be an iterative process. The sizing that you come up with initially might be right over the longer term, or it might not. Try to get a sense of how similarly or differently traffic is behaving as time goes on.
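
One simple way to get that sense is to compare the average event rate of a recent period against the period the original sizing was based on. The sketch below assumes you can export event rates as plain lists of numbers. The function names and the 20% tolerance are illustrative placeholders.

```python
from statistics import mean


def traffic_drift(baseline, recent):
    """Relative change in mean event rate from baseline to recent.

    0.3 means traffic is 30% higher than the baseline period.
    """
    base = mean(baseline)
    if base == 0:
        raise ValueError("baseline mean is zero; relative drift is undefined")
    return (mean(recent) - base) / base


def needs_resizing(baseline, recent, tolerance=0.2):
    """True if traffic moved by more than tolerance in either direction."""
    return abs(traffic_drift(baseline, recent)) > tolerance
```

Because the check is symmetric, it flags shrinking traffic as well as growing traffic, which is the case where scaling down may be appropriate.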

How does the infrastructure of a given deployment affect performance?

Bear in mind that the sizing estimates you obtain from this series of articles are based on testing one particular set of infrastructure—namely, EC2 instances—which may differ in many ways from the infrastructure found in on-prem deployments.

In the field, you may encounter virtual machines or bare metal, each of which may behave differently even if the specs are superficially similar. Likewise, one deployment might be on AWS while another is on GCP, and different clouds behave differently.
