Overall, I would say the nmon application is one of the better, or possibly the best, performance tools we have used for monitoring Linux and AIX servers.
However, we do have a minor issue: the forwarders consume a relatively large amount of CPU on some servers (relative in that the server has a very small CPU allocation and nmon uses a large share of it). We have servers that run on 0.1 cores of CPU, and without nmon they appear to run just fine.
After some initial tracing, I believe the cause is the nmon2csv script, which runs on the universal forwarder to translate the nmon data into multiple CSV files that are then indexed. I'm unsure whether it's the Perl/Python script or the actual ingestion of a large number of CSV files, but it does require some CPU.
On any server with a reasonable amount of CPU, e.g. over 1 core of entitled CPU, it's no issue, but on the smaller servers this can use 30-50% of the allocated CPU.
Determining whether it's Splunk ingesting the files or the nmon Perl script doesn't help much here. What I would like to do is offload the work so that a heavy forwarder takes care of running the nmon2csv script.
My initial idea was to trigger a script to run at the heavy forwarder when it saw the data as per https://answers.splunk.com/answers/114329/transform-log-file-or-field-at-index-time-using-script-pyt... . Since the nmon TA is doing:
invalid_cause = archive
unarchive_cmd = $SPLUNK_HOME/etc/apps/TA-nmon/bin/nmon2csv.sh --mode realtime
sourcetype = nmon_processing
I cannot run this on the remote heavy forwarder.
One option I noticed in the nmon documentation is to use a syslog forwarding solution to get the files over to a remote location. Since I'm already running the universal forwarder, I'm hoping there is another way to process the files remotely without using rsync or syslog to copy them around.
Any ideas?
Thanks
Hi!
Thanks for your interest in the Nmon Performance application. I am glad you find it great and useful.
To answer your question, I am very happy to inform you that the CPU overhead will be drastically reduced and constant with the upcoming release 1.3.0.
The new release is currently under testing for its qualification. The CPU footprint issue has been solved by implementing named pipes (FIFO files).
Nmon binaries will now write to a named pipe instead of regular files. A constantly running FIFO reader process will retrieve the new data and stream it to the nmon2csv parsers.
As the volume of data streamed at each iteration is very small and no longer increases over time, the CPU, I/O and memory cost is minimal and constant.
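To illustrate the general idea, here is a minimal Python sketch of a named pipe reader. It is not the TA-nmon implementation: the FIFO name, the sample lines and the `read_fifo` / `demo` helpers are all hypothetical, and a writer thread stands in for the nmon binary. It only shows that the reader receives each new line as it is written, so the work per iteration does not grow with the size of an ever-growing file.

```python
import os
import tempfile
import threading


def read_fifo(fifo_path, handle_line):
    """Read lines from a named pipe and pass each one to handle_line."""
    # Opening a FIFO for reading blocks until a writer opens the other end.
    with open(fifo_path, "r") as fifo:
        for line in fifo:  # the loop ends when the writer closes its end
            handle_line(line.rstrip("\n"))


def demo():
    """Run a stand-in writer and return the lines the reader received."""
    workdir = tempfile.mkdtemp()
    fifo_path = os.path.join(workdir, "nmon.fifo")  # hypothetical name
    os.mkfifo(fifo_path)  # POSIX only (Linux, AIX, Solaris)

    # Made-up sample lines, for illustration only.
    samples = ["CPU_ALL,T0001,12.5,3.1,0.4,84.0", "MEM,T0001,2048,512"]

    def writer():
        # Stand-in for the nmon binary writing to the pipe.
        with open(fifo_path, "w") as fifo:
            for sample in samples:
                fifo.write(sample + "\n")

    received = []
    thread = threading.Thread(target=writer)
    thread.start()
    # A real reader would hand each line to a parser instead of a list.
    read_fifo(fifo_path, received.append)
    thread.join()

    os.remove(fifo_path)
    os.rmdir(workdir)
    return received


if __name__ == "__main__":
    for line in demo():
        print(line)
```

In this sketch nothing is kept on disk: each line is consumed once and passed on, which is why the cost per iteration stays flat.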
This new feature and behavior will be available for AIX and Linux; Solaris support will follow.
The real-world tests currently running have already confirmed the stability and the large CPU footprint improvements.
I expect this new release to be available within the next few weeks, and you are more than welcome to participate in its validation:
https://github.com/guilhemmarchand/nmon-for-splunk/tree/testing/resources
If you deploy the testing release, you can kill the running nmon process after the upgrade so that the named pipe process starts immediately.
Besides this, the new release also implements nice new features:
So, it is just a question of a few weeks before the release is published 😉
Guilhem Marchand
I ran some tests with the new TA-nmon version, and I'm seeing approximately half the CPU used by the Splunk process on a single AIX machine compared to before!
Great work as always!
Thank you 😉
That's great news.
An update was made tonight to correct the last issues in the new release.
It is very likely to be ready for final qualification. Feel free to report any issue you observe.
Regards,
Guilhem