Splunk Nmon Application - Is there any nice way to remove the CPU load from the universal forwarders?

gjanders
SplunkTrust

Overall I would say the nmon application is one of the better or possibly the best performance tools we have used for monitoring Linux and AIX servers.

However, we do have a minor issue: the forwarders consume a relatively large amount of CPU on some servers. It is relative because those servers have a very small CPU allocation and nmon uses a large share of it. Some of our servers run on 0.1 cores of CPU, and without nmon they appear to run just fine.

After some initial tracing, I believe the cause is the nmon2csv script, which runs on the universal forwarder to translate the nmon data into multiple CSV files that are then indexed. I'm unsure whether it's the Perl/Python processing or the actual ingestion of a large number of CSV files, but it does require some CPU.
On any server with a reasonable amount of CPU, e.g. over 1 core of entitled CPU, it's no issue, but on the smaller servers it can use 30-50% of the allocated CPU just to do this.

Determining whether it's Splunk ingesting the files or the nmon Perl script doesn't help very much here. What I would have liked to do is offload the work so that a heavy forwarder takes care of running the nmon2csv script.

My initial idea was to trigger a script to run at the heavy forwarder when it saw the data as per https://answers.splunk.com/answers/114329/transform-log-file-or-field-at-index-time-using-script-pyt... . Since the nmon TA is doing:

invalid_cause = archive
unarchive_cmd = $SPLUNK_HOME/etc/apps/TA-nmon/bin/nmon2csv.sh --mode realtime
sourcetype = nmon_processing

I cannot run this on the remote heavy forwarder.

One option I noticed in the nmon documentation is to use a syslog forwarding solution to get the files over to a remote location. Since I'm already running the universal forwarder, I'm hoping there is another way to process the files remotely without using rsync or syslog to copy them around.

Any ideas?

Thanks

Alerts for Splunk Admins https://splunkbase.splunk.com/app/3796/
Version Control for Splunk https://splunkbase.splunk.com/app/4355/
1 Solution

guilmxm
Influencer

Hi!

Thank you for your interest in the Nmon Performance application, and I am glad you find it great and useful.

To answer your question, I am very happy to inform you that the CPU overhead will be drastically reduced and constant with the upcoming release 1.3.0.

The new release is currently under testing for its qualification. The CPU footprint issue has been solved by implementing named pipes (fifo files).
Nmon binaries will now write to a named pipe instead of regular files, and a constantly running fifo reader process will retrieve the new data and stream it to the nmon2csv parsers.
As the volume of data streamed at each iteration is very small and no longer increases over time, the CPU, I/O and memory cost is minimal and constant.
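To illustrate the general idea of a fifo reader (this is only a minimal sketch, not the TA-nmon implementation), the following Python example creates a named pipe, lets a writer thread stand in for the nmon binary, and hands each line to a callback as it arrives. The path `nmon.fifo`, the function names and the sample records are all made up for the illustration, and `os.mkfifo` is only available on Unix-like systems.

```python
import os
import tempfile
import threading


def read_fifo(fifo_path, handle_line):
    """Block on a named pipe and pass each line to handle_line as it arrives."""
    count = 0
    # Opening a FIFO for reading blocks until a writer opens the other end.
    with open(fifo_path, "r") as fifo:
        for line in fifo:  # iteration ends once the writer closes the pipe
            handle_line(line.rstrip("\n"))
            count += 1
    return count


def demo():
    """Write a few sample records into a FIFO and return what the reader saw."""
    workdir = tempfile.mkdtemp()
    fifo_path = os.path.join(workdir, "nmon.fifo")  # hypothetical path
    os.mkfifo(fifo_path)

    # Made-up records, only loosely shaped like nmon output.
    sample = ["CPU_ALL,T0001,1.0,0.5,0.0,98.5", "MEM,T0001,2048,1024"]

    def writer():
        # Stands in for the nmon binary writing to the pipe.
        with open(fifo_path, "w") as fifo:
            for record in sample:
                fifo.write(record + "\n")

    received = []
    thread = threading.Thread(target=writer)
    thread.start()
    read_fifo(fifo_path, received.append)  # a real parser would go here
    thread.join()

    os.remove(fifo_path)
    os.rmdir(workdir)
    return received


if __name__ == "__main__":
    for record in demo():
        print(record)
```

Because the reader only ever sees the lines written since the last read, the work per iteration stays small, unlike re-parsing a regular file that keeps growing.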

This new feature and behavior will be available for AIX and Linux; Solaris support will follow.

The real-world tests currently running have already confirmed the stability and the large CPU footprint improvement.

I expect this new release to be available within the next few weeks, and you are more than welcome to participate in its validation:

https://github.com/guilhemmarchand/nmon-for-splunk/tree/testing/resources

If you deploy the testing release, you can kill the running nmon process after the upgrade so that the named pipe process starts immediately.

Besides this, the new release also implements nice new features:

  • the list of key performance monitors to be parsed is now stored in an external JSON file, which allows people to customize it in an upgrade-persistent fashion
  • the new release implements the Nmon external feature, which lets you very easily extend Nmon data with anything that matters to you (command output, shell / Perl / Python script, external API calls... whatever you want)
  • some minor issue corrections
  • the option to generate the performance data in JSON format instead of legacy CSV, if you care more about saving storage at the index level than about licensing cost and best performance (about 50% more license cost, 20% less storage cost)
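As a rough illustration of the first point (an external JSON file that lists the monitors to parse), here is a minimal Python sketch. The JSON layout, the key name `monitors` and the function names are assumptions made for this example; the actual file shipped with the TA may be named and structured differently.

```python
import json

# Hypothetical config content; the real file in TA-nmon may look different.
CONFIG_TEXT = '{"monitors": ["CPU_ALL", "MEM", "DISKBUSY"]}'


def load_monitors(config_text):
    """Return the set of monitor names listed in the JSON config."""
    return set(json.loads(config_text)["monitors"])


def filter_records(lines, monitors):
    """Keep only records whose first comma-separated field is a configured monitor."""
    return [line for line in lines if line.split(",", 1)[0] in monitors]


if __name__ == "__main__":
    records = ["CPU_ALL,T0001,1.0", "NET,T0001,2.0", "MEM,T0001,2048"]
    for kept in filter_records(records, load_monitors(CONFIG_TEXT)):
        print(kept)
```

Keeping the list outside the parser code is what makes a customization survive an upgrade: only the JSON file has to be preserved.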

So, it is just a matter of a few weeks before the release is published 😉

Guilhem Marchand

gjanders
SplunkTrust

I ran some tests and, with the new TA-nmon version, I'm seeing approximately half the CPU used by the Splunk process on a single AIX machine compared to before!

Great work as always!

Alerts for Splunk Admins https://splunkbase.splunk.com/app/3796/
Version Control for Splunk https://splunkbase.splunk.com/app/4355/

guilmxm
Influencer

Thank you 😉

That's great news.
An update was made tonight to correct the last issues in the new release.

It is very likely ready for final qualification. Feel free to report any issue you observe.

Regards,

Guilhem
