
WebSphere 8.5.5 JVM Instrumentation App - Has anyone got it working?

SplunkTrust

I'm testing monitoring of IBM WebSphere 8.5.5 with Splunk, preferably without a third-party Java agent such as New Relic, CA APM, or similar.

However, it is proving very difficult. I tested the JVM instrumentation agent with WebSphere 8.5.5 and initially hit a class issue, as per the answer "WebSphere 8.5.9 JVM fails to start with JVM Instrumentation Agent. How can I get an IBM Java 1.6 com..."

TL;DR: has anyone got the JVM instrumentation app working in WAS 8.5.5?

For those interested in the details, feel free to read on. My initial attempt was to remove org\apache\commons\logging\, and I also tried a few variations of this (for example, removing only the clashing Log4JLogger.class file).
This allows the JVM to start up as expected, but I do not seem to get any logging from the agent.

At the JVM level I'm using:

-javaagent:/tmp/splunkagent.jar=/tmp/splunkagent.properties

I also tried recompiling the code that Damien Dallimore has provided on GitHub, replacing the log4j references with:

import org.apache.commons.logging.Log; 
import org.apache.commons.logging.LogFactory; 

    private static Log logger = LogFactory.getLog(SplunkJavaAgent.class);

I also changed splunkagent.properties to:

#ERROR/INFO
agent.loggingLevel=INFO

I also tried setting this:

-javaagent:/home/was/splunkagent.jar=/tmp/splunkagent.properties

That results in:

[1/5/18 18:51:34:935 AEDT] 0000000f SplunkJavaAge E com.splunk.javaagent.SplunkJavaAgent run Error running properties file checker thread :Access denied ("java.io.FilePermission" "/tmp/splunkagent.properties" "read")

So it does log when something is wrong, but I cannot get it to log the rest of the time...

Later in the day I tried replacing all calls on logger with System.out.println and System.err.println, and this data finally appeared in the native_stdout/native_stderr log files. I'm further along now, but I will need to do more troubleshooting to get this working.
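A minimal sketch of that kind of stand-in logging, for anyone trying the same thing. AgentStdLog and its methods are hypothetical names, not part of the agent source, and the yyyy/MM/dd HH:mm:ss timestamp prefix is an assumption:

```java
import java.text.SimpleDateFormat;
import java.util.Date;

// Hypothetical helper, not part of the Splunk agent source: writes straight to
// stdout/stderr so messages do not depend on log4j or commons-logging.
class AgentStdLog {

    // Builds "<timestamp> <message>". A new SimpleDateFormat is created per call
    // because SimpleDateFormat is not thread-safe.
    static String format(Date when, String message) {
        return new SimpleDateFormat("yyyy/MM/dd HH:mm:ss").format(when) + " " + message;
    }

    static void info(String message) {
        System.out.println(format(new Date(), message));
    }

    static void error(String message, Throwable t) {
        System.err.println(format(new Date(), message + " : " + t));
    }

    public static void main(String[] args) {
        info("Splunk Java Agent starting");
        error("Error starting Splunk Java Agent", new IllegalStateException("example"));
    }
}
```

Calls such as logger.info(...) and logger.error(...) would then become AgentStdLog.info(...) and AgentStdLog.error(...).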

Has anyone got this working in WebSphere?

Also, FYI, I tested the JMX/WebSphere add-on. Beyond the lack of Splunk 7 support, the JMX add-on did not work with WebSphere SSL:

java.io.IOException: Failed to retrieve RMIServer stub: javax.naming.NamingException: Error during resolve [Root exception is org.omg.CORBA.TRANSIENT: initial and forwarded IOR inaccessible  vmcid: IBM  minor code: E07  completed: No]

Changing the "CSIv2 inbound communications" transport from SSL-required to SSL-supported fixes that issue. I then hit the issue that the JMX agent does not work with username/password, or at least not against a WebSphere server: even with valid credentials, I was advised that an "unauthenticated" user was attempting to retrieve MBean information.


Loves-to-Learn Lots

I then hit the issue that the JMX agent does not work with username/password, or at least not against a WebSphere server: even with valid credentials, I was advised that an "unauthenticated" user was attempting to retrieve MBean information.

 

What did you do to get around this? 


SplunkTrust

I moved on from that company a long time ago and did not get this working.


SplunkTrust

More work is still required. It would appear that the Java 2 security system is active when the JVM starts and is only disabled later in the JVM startup (assuming you have Java 2 security disabled).

This explains why the JVM agent could not access the filesystem. I tried a few combinations of adding a policy file (was.policy, app.policy) into the jar file to disable the required security checks, but I couldn't get that working.
I did get the following grant working:

grant codeBase "file:/tmp/splunkagent.jar" {
  permission java.security.AllPermission;
};

This goes in server.policy (profiles/.../properties/server.policy) and gets me past the first issue.

I also confirmed that messages such as:

2018/01/08 16:42:12Error running properties file checker thread :Access denied ("java.io.FilePermission" "/tmp/splunkagent.properties" "read")

are mostly harmless. Once Java 2 security does get disabled, the JVM allows the file to be read, and since the agent keeps trying to read it, the read eventually succeeds.
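The retry behaviour described above can be sketched roughly as follows. This is an illustration only, not the agent's actual code: PropertiesRetry, loadWithRetry, and its parameters are hypothetical names, and the code is kept Java 6 compatible on the assumption that the WAS JVM may be that old.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

// Hypothetical sketch, not the agent's code: keep retrying the read until the
// security manager stops rejecting it or the attempts run out.
class PropertiesRetry {

    static Properties loadWithRetry(String path, int maxAttempts, long delayMillis)
            throws IOException, InterruptedException {
        SecurityException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                InputStream in = new FileInputStream(path);
                try {
                    Properties props = new Properties();
                    props.load(in);
                    return props;
                } finally {
                    in.close();
                }
            } catch (SecurityException e) {
                // "Access denied" arrives as AccessControlException,
                // which is a subclass of SecurityException.
                last = e;
                System.err.println("Attempt " + attempt + " denied: " + e.getMessage());
                Thread.sleep(delayMillis);
            }
        }
        throw new IOException("Still denied after " + maxAttempts + " attempts", last);
    }
}
```

For example, PropertiesRetry.loadWithRetry("/tmp/splunkagent.properties", 30, 2000L) would try for roughly a minute before giving up.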

It also appears that the SystemOut/SystemErr logs only start after the JVM agent starts (at least from my point of view; I initially tried a 60 second delay in the Java agent, and that just delayed the app server from starting by 60 seconds).

Also, in the Splunk agent, if I set:

trace.jmx.configfiles=/tmp/jmx

it appears to load the file without an issue, but then I receive no data in Splunk. However, if I let it use the jmx.xml inside the jar file (which is identical), it works fine.

Finally, I also found that the Splunk app for WAS does things like:

<mbean domain="WebSphere" properties="*,type=ThreadPool" dumpAllAttributes="true" ></mbean>

However, I cannot seem to get "dumpAllAttributes" to work. I'll trim the extra spaces and see if that helps...

EDIT: In addition to the above, I had to add the argument -Djavax.management.builder.initial= to the JVM options; without it, the agent just did not work...

That argument stops this error:

2018/01/08 18:38:10Initialising transport : Failed to load MBeanServerBuilder class com.ibm.ws.management.PlatformMBeanServerBuilder: java.lang.ClassNotFoundException: com.ibm.ws.management.PlatformMBeanServerBuilder

However, that seems to prevent MBeans from being available (or at least it appears to prevent access to some MBeans)...


SplunkTrust

I did resolve the above issue by replacing the agent start-up code with this delayed start:

    try {
        agent = new SplunkJavaAgent();

        // Start the agent 60 seconds later on a timer thread instead of immediately.
        new java.util.Timer().schedule(
                new java.util.TimerTask() {
                    @Override
                    public void run() {
                        agent.startAgent(agentArgument, instrumentation);
                    }
                },
                60000 // delay in milliseconds
        );
    } catch (Throwable t) {
        System.err.println(getDateInfo() + "Error starting Splunk Java Agent : " + t.getMessage());
    }

Here the startAgent function contains the various methods that actually run the agent.
The next issue was that the parent thread would exit, so various other bits of the code that check parent.isAlive() exited as well. I removed the isAlive() check, and that allowed the JMX collection to run as expected.

However, for an unknown reason the MBeans are very limited when used via this method. I eventually managed to obtain about 6 MBeans; I even tried with properties/domain set to *, and this still resulted in minimal data.

I'm unsure why. When connecting remotely via JMX I can see approximately 50 MBeans, but I can only see a small number internally when I use this method. Something is wrong, but I'm unsure how to diagnose this further.
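One possible way to narrow this down is to check which MBeanServer instances exist inside the JVM and how many MBeans each one holds, in case the agent is querying a different server from the one the remote JMX connection reaches. That is only a guess and is not confirmed anywhere in this thread. A minimal sketch using only standard javax.management calls (MBeanServerSurvey and countByDomain are hypothetical names):

```java
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import javax.management.MBeanServer;
import javax.management.MBeanServerFactory;
import javax.management.ObjectName;

// Hypothetical diagnostic, not part of the agent: lists every MBeanServer in
// this JVM with a per-domain count of the MBeans registered on it.
class MBeanServerSurvey {

    // Counts registered MBeans per JMX domain on one MBeanServer.
    static Map<String, Integer> countByDomain(MBeanServer server) {
        Map<String, Integer> counts = new TreeMap<String, Integer>();
        for (ObjectName name : server.queryNames(null, null)) {
            Integer current = counts.get(name.getDomain());
            counts.put(name.getDomain(), current == null ? 1 : current + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Touch the platform server first so that it has been created.
        ManagementFactory.getPlatformMBeanServer();
        // Passing null asks for every MBeanServer registered in this JVM.
        List<MBeanServer> servers = MBeanServerFactory.findMBeanServer(null);
        for (MBeanServer server : servers) {
            System.out.println(server.getDefaultDomain() + " -> " + countByDomain(server));
        }
    }
}
```

Running the body of main from inside the delayed agent thread and comparing the counts with what the remote connection shows might indicate whether the WebSphere domain lives on a server the agent is not querying.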

This trick of sleeping for 60 seconds also resolved my Java 2 security issues, so that seems to work. I still don't see the logging working as expected, but there is no longer any issue reading the properties file...


SplunkTrust

I should add that I did disable Java 2 security, and I also disabled application security (just in case).
At this stage I suspect that something is preventing the JVM agent from reading the properties file, but I will do some more testing...
