AppD Archive

App Dynamics agent not logging properly?

CommunityUser
Splunk Employee
Splunk Employee

Our dedicated server hosting provider is telling us that the App Dynamics agent may not be properly configured and need some insight on how to resolve the issues their support team identified below:

First, I don't believe appdynamics php agent is logging correctly. Since increasing the verbosity of the php error logs, I"ve seen the following repeated many times:

[24-Oct-2014 12:09:18] WARNING: [pool dermproc] child 17078 said into stderr: "log4cxx: Could not create an Appender. Reported error follows."

[24-Oct-2014 12:09:18] WARNING: [pool dermproc] child 17078 said into stderr: "log4cxx: IO Exception : status code = 13"

[24-Oct-2014 12:09:18] WARNING: [pool dermproc] child 17078 said into stderr: "log4cxx: No appender named [main] could be found."

[24-Oct-2014 12:09:18] WARNING: [pool dermproc] child 17078 said into stderr: "log4cxx: No appender could be found for logger (agent.AgentKernel)."

[24-Oct-2014 12:09:18] WARNING: [pool dermproc] child 17078 said into stderr: "log4cxx: Please initialize the log4cxx system properly."

log4cxx is a software maintained by the apache foundation described as follows:

"Log4cxx is a fast and flexible framework for logging application messages. It is the C++ port of the log4j Java logging library. "

The config file for the appdynamics php agent is located at /etc/php.d/appdynamics_agent.ini and its contents are as follows:

[AppDynamics Agent]

extension = appdynamics_agent.so

agent.log4cxx_config = /usr/lib/appdynamics-php5/php/conf/appdynamics_agent_log4cxx.xml

agent.php_agent_root = /usr/lib/appdynamics-php5

The above configuration tells appdynamics php agent to load the log4cxx configuration from /usr/lib/appdynamics-php5/php/conf/appdynamics_agent_log4cxx.xml which has the following contents:

<?xml version="1.0" encoding="UTF-8" ?>

<log4j:configuration xmlns:log4j="http://jakarta.apache.org/log4j/" debug="false">

<appender name="main" class="org.apache.log4j.RollingFileAppender">

 <param name="file" value="/agent.log"/>  <param name="HeaderOnlyInNewFile" value="true"/>  <param name="MaxFileSize" value="20MB"/>  <param name="MaxBackupIndex" value="5"/>  <layout class="org.apache.log4j.PatternLayout">

  <param name="ConversionPattern" value="%d{yyyy-MM-dd HH:mm:ss.SSS z} %-5p %X{pid} [%c{2}] %m%n" />

  <param name="HeaderPattern" value="AppDynamics PHP Agent %X{version} PHP-%X{phpVersion} %X{pid}%n" />  </layout> </appender>

<root>

  <priority value="info" />

  <appender-ref ref="main"/>

</root>

</log4j:configuration>

This indicates that the php agent cannot log as the software is initializing the log4cxx framework incorrectly. This could be due to the config file being incorrect, or an issue with the software itself. You may want to contact appdynamics regarding this to see what needs to change to get it to log properly.

However that brings us to the much more important findings: based on the data gathered recently by myself and a few collegues, we believe that the AppDynamics php agent and preferrably the monitoring service altogether should be disabled on the server.

Here's what we've found to come to this conclusion:

Often tracing the system calls for php processes during these issues shows the following:

--- SIGQUIT (Quit) @ 0 (0) ---

close(0) = 0

socket(PF_FILE, SOCK_STREAM, 0) = 0

rt_sigreturn(0x1) = 202

futex(0x7fd3195d1890, FUTEX_WAIT_PRIVATE, 1, NULL) = ? ERESTARTSYS (To be restarted)

--- SIGQUIT (Quit) @ 0 (0) ---

close(0) = 0

socket(PF_FILE, SOCK_STREAM, 0) = 0

rt_sigreturn(0x1) = 202

futex(0x7fd3195d1890, FUTEX_WAIT_PRIVATE, 1, NULL) = ? ERESTARTSYS (To be restarted)

These calls are repeated in a cycle that never ends.

Using gdb to get a stack trace of this process shows the following:

#0 0x00007fd3193386e2 in __unregister_atfork () from /lib64/libc.so.6

#1 0x

00007fd311dc7336 in ?? () from /usr/lib64/php/modules/appdynamics_agent.so

#2  0x0000000000000064 in ?? ()

#3  0x0000000000000000 in ?? ()

The "futex" call seen in the strace of the process is to implement locking in linux to avoid situations where multiple processes are trying to alter the exact same data at the same time and similar. In this case the process is waiting for a certain value to be returned at a certain address so it can continue. In the gdb output the "unregister_atfork" function is being called which has to do with process forking and is referencing the appdynamics_agent.so module.

During the issue today I took a look at the java appdynamics proxy process and it was issuing many system calls like the following:

17110 1414168899.049899 futex(0x7ffad0000ab4, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1414168899, 99880000}, ffffffff) = -1 ETIMEDOUT (Connection timed out) <0.050109>

17110 1414168899.100095 futex(0x7ffb400bfe28, FUTEX_WAKE_PRIVATE, 1) = 0 <0.000035>

17110 1414168899.100175 futex(0x7ffad0000ab4, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1414168899, 150156000}, ffffffff) = -1 ETIMEDOUT (Connection timed out) <0.050107>

17110 1414168899.150369 futex(0x7ffb400bfe28, FUTEX_WAKE_PRIVATE, 1) = 0 <0.000054>

17110 1414168899.150469 futex(0x7ffad0000ab4, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1414168899, 200451000}, ffffffff) = -1 ETIMEDOUT (Connection timed out) <0.050162>

17110 1414168899.200752 futex(0x7ffb400bfe28, FUTEX_WAKE_PRIVATE, 1) = 0 <0.000076>

17110 1414168899.200978 futex(0x7ffad0000ab4, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1414168899, 250899000}, ffffffff) = -1 ETIMEDOUT (Connection timed out) <0.050101>

17110 1414168899.251208 futex(0x7ffb400bfe28, FUTEX_WAKE_PRIVATE, 1) = 0 <0.000083>

17110 1414168899.251412 futex(0x7ffad0000ab4, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1414168899, 301360000}, ffffffff) = -1 ETIMEDOUT (Connection timed out) <0.050128>

17110 1414168899.301669 futex(0x7ffb400bfe28, FUTEX_WAKE_PRIVATE, 1) = 0 <0.000081>

17110 1414168899.301873 futex(0x7ffad0000ab4, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1414168899, 351822000}, ffffffff <unfinished ...>

17114 1414168899.309486 <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out) <10.000116>

17114 1414168899.309540 futex(0x7ffb402079e0, FUTEX_WAKE_PRIVATE, 1) = 0 <0.000026>

17114 1414168899.309670 futex(0x7ffac0001694, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1414168909, 309642000}, ffffffff <unfinished ...>

Again you'll see the "futex" call but this time the process is issuing "FUTEX_WAKE" for half of the calls which is meant to wake processes in a futex wait state.

I believe this is related to what we've seen so far where connecting to the php-fpm socket returns no data because all the processes become stuck in a futex wait state. This seems to be related to the appdynamics module.

Now that I've disabled the appdynamics php module, I've like to see if this issue continues. If this is to be re-enabled at any time, I would recommend contacting appdynamics to figure out what needs to be done to get the php agent to log correctly so we can see if there are any issues reported by it.

Thanks for your help and any suggestions on how to resolve the hosting providers concerns.

0 Karma
1 Solution

CommunityUser
Splunk Employee
Splunk Employee

Environment Details:

Product Agent version :                                3.8.2.0

Current Controller version:                          3.9.3.0 build 20

Application JVM/CLR Version:                    Java 8 Update 25

Current gent Type or Component:           PHP

Response received from App Dynamics Support regarding our logging issue:

Hey Fred,

Thanks for the detailed description of the issue. The first thing to note is the log4cxx.xml, which is trying to log to /agent.log. Since this root folder is protected from write by the php user, the log4cxx is not able to initialize, and you see the status code 13. This failure to initialize is very likely the cause of the recurring syscall cycle you are seeing. The PHP extension is trying to start itself up over and over, and failing to, based on the lack of permissions to start logging.

From the java proxy side, I suspect the futexes are related to the either the startup cycle from the PHP extension as well, or more generally for the zmq socket code. Does this answer your questions adequately? I'm happy to provide further technical detail to build more understanding of this failure mode.

For now you can edit the /agent.log to be /usr/lib/appdynamics-php5/logs/agent.log. Then verify that this directory is writeable by the php user. Then restart your web server to pick up the changes. Please do let me know how I can be of further assistance.

Note also that the root cause here is a bug in 3.8 line RPM installer which doesn't populate that config file correctly. The long term fix is to deploy the 3.9.x RPM.

Best regards,

Kyle

View solution in original post

0 Karma

CommunityUser
Splunk Employee
Splunk Employee

Hello Fred, 

Thank you for joining the AppDynamics Community. 

Please be advised that someone will assist you shortly. 

To further understand your dilemna, are you an AppDynamics Pro or Lite user? 

Thank you,

Cailtin  

0 Karma

CommunityUser
Splunk Employee
Splunk Employee

Environment Details:

Product Agent version :                                3.8.2.0

Current Controller version:                          3.9.3.0 build 20

Application JVM/CLR Version:                    Java 8 Update 25

Current gent Type or Component:           PHP

Response received from App Dynamics Support regarding our logging issue:

Hey Fred,

Thanks for the detailed description of the issue. The first thing to note is the log4cxx.xml, which is trying to log to /agent.log. Since this root folder is protected from write by the php user, the log4cxx is not able to initialize, and you see the status code 13. This failure to initialize is very likely the cause of the recurring syscall cycle you are seeing. The PHP extension is trying to start itself up over and over, and failing to, based on the lack of permissions to start logging.

From the java proxy side, I suspect the futexes are related to the either the startup cycle from the PHP extension as well, or more generally for the zmq socket code. Does this answer your questions adequately? I'm happy to provide further technical detail to build more understanding of this failure mode.

For now you can edit the /agent.log to be /usr/lib/appdynamics-php5/logs/agent.log. Then verify that this directory is writeable by the php user. Then restart your web server to pick up the changes. Please do let me know how I can be of further assistance.

Note also that the root cause here is a bug in 3.8 line RPM installer which doesn't populate that config file correctly. The long term fix is to deploy the 3.9.x RPM.

Best regards,

Kyle

0 Karma
Get Updates on the Splunk Community!

See just what you’ve been missing | Observability tracks at Splunk University

Looking to sharpen your observability skills so you can better understand how to collect and analyze data from ...

Weezer at .conf25? Say it ain’t so!

Hello Splunkers, The countdown to .conf25 is on-and we've just turned up the volume! We're thrilled to ...

How SC4S Makes Suricata Logs Ingestion Simple

Network security monitoring has become increasingly critical for organizations of all sizes. Splunk has ...