Monitoring Splunk

high CPU when idle

Michael
Contributor

I'm running Splunk 4.3.3 on RHEL 6, on a 12-core, 32 GB RAM server that is relatively idle (it runs backups at night). Splunk currently has ONE input (/var/log/), forwards everything, and keeps no local copy. We have an enterprise license; this instance acts as a license slave.

I'm sitting here with a shell running 'top' and a browser window on the 'Data inputs' page. I had to disable all the inputs to get splunkd to STOP consuming 100% CPU. When I toggle the /var/log data input to 'enable', CPU goes to 100%. When I disable it, CPU drops to minimal (0 or 0.3%). I've gone back and forth like this to test the cause and effect.
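For anyone repeating this enable/disable test, the readings from 'top' can be turned into numbers you can paste into a support case. The following is only a minimal sketch, assuming a Linux box with /proc and Python 3.6+; the function names are hypothetical and nothing here is part of Splunk. It samples a process's CPU ticks from /proc/<pid>/stat twice and converts the difference to a percentage, where 100 means one full core.

```python
import os
import time


def parse_cpu_ticks(stat_line: str) -> int:
    """Return utime + stime (in clock ticks) from a /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may contain spaces,
    # so split on the last ')' before splitting the remaining fields.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state); utime and stime are fields 14 and 15.
    return int(rest[11]) + int(rest[12])


def cpu_percent(ticks_before: int, ticks_after: int, seconds: float, hz: int) -> float:
    """CPU use over the interval; 100.0 means one full core."""
    return 100.0 * (ticks_after - ticks_before) / (hz * seconds)


def sample(pid: int, seconds: float = 5.0) -> float:
    """Measure one process's CPU percentage over `seconds` (Linux only)."""
    hz = os.sysconf("SC_CLK_TCK")  # clock ticks per second
    path = f"/proc/{pid}/stat"
    with open(path) as f:
        before = parse_cpu_ticks(f.read())
    time.sleep(seconds)
    with open(path) as f:
        after = parse_cpu_ticks(f.read())
    return cpu_percent(before, after, seconds, hz)
```

Calling `sample(pid)` with the splunkd process ID once with the input enabled and once with it disabled would give two figures to compare.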

There's nothing special in /var/log/ -- in fact there's no new activity going on at all. The logs under /opt/splunk/var/log/splunk/ are quiet except for the occasional INFO entry from metrics.log. Even when a directory input is enabled (and CPU goes to 100+) the worst thing logged was an occasional WARN that said something to the effect of an invalid file in the directory because it was binary.

I've seen this on other systems, but attributed it to optimizations or just busy machines; neither explanation applies here.

Ideas?
Thanks,


Michael
Contributor

Solved it by the old Windows trick: uninstalling and re-installing. Corrupt something, somewhere?

I opened a case. After a condescending reply from their tech support, telling me I was digesting .gz files and such, they pointed me to the online documentation on how to edit the inputs.conf file. Admittedly, I did have it misconfigured initially, but I corrected that days ago. They overlooked the fact that I had disabled all inputs during testing, and could enable/disable the /var/log input on the local machine to reproduce the problem each time -- standard stuff in /var/log -- no .gz files. I also confirmed (and sent them a copy to show) that the inputs.conf file only had this and other /var/log sources in it (again, all disabled during testing).
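Confirming which monitor inputs are actually enabled in a given inputs.conf can be scripted. What follows is a minimal sketch in Python, not Splunk tooling: the function name `enabled_monitors` is hypothetical, and it assumes the file is simple enough for Python's `configparser` to read. It also looks at one file only, whereas Splunk merges inputs.conf from several directories, so treat the result as a rough check.

```python
import configparser


def enabled_monitors(conf_text):
    """Return the paths of [monitor://...] stanzas that are not disabled."""
    # interpolation=None keeps characters such as '%' and '$' literal;
    # strict=False tolerates a stanza that appears more than once.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    prefix = "monitor://"
    paths = []
    for section in parser.sections():
        if not section.startswith(prefix):
            continue
        # A stanza with no "disabled" setting is treated as enabled here.
        if parser.getboolean(section, "disabled", fallback=False):
            continue
        paths.append(section[len(prefix):])
    return paths


# Hypothetical file contents, for illustration only.
example = """
[monitor:///var/log]
disabled = false

[monitor:///var/log/messages]
disabled = true
"""
print(enabled_monitors(example))
```

For the merged view that Splunk itself uses, `splunk btool inputs list` is the usual tool rather than a script like this.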

Go figure...anywho, fixed now.


RicoSuave
Builder

It would be best to file a case for this one and upload a diag so we may look at your logs, among other things.


Michael
Contributor

It was just updated and rebooted this morning...

(that was a good answer though!)


jonuwz
Influencer

What's the uptime on the box? If it hasn't been rebooted since the leap second was added and you use ntp, that will cause very high splunkd CPU usage. Google for "leap second linux kernel"; there's a simple fix: stop ntp and manually set the date.
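The uptime check can be done mechanically. This is a minimal sketch in Python, under one loud assumption: that "the leap second addition" means the one inserted at the end of 30 June 2012 (UTC), which the post does not state. The function names are hypothetical. It derives the boot time from /proc/uptime and reports whether the box has been up since before that moment.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the leap second meant here is the one inserted at the
# end of 30 June 2012 UTC, i.e. just before this instant.
LEAP_SECOND = datetime(2012, 7, 1, tzinfo=timezone.utc)


def booted_before_leap_second(uptime_seconds, now):
    """True if the machine has been running since before the leap second."""
    boot_time = now - timedelta(seconds=uptime_seconds)
    return boot_time < LEAP_SECOND


def read_uptime_seconds(path="/proc/uptime"):
    """Read the uptime in seconds (the first field of /proc/uptime, Linux only)."""
    with open(path) as f:
        return float(f.read().split()[0])


if __name__ == "__main__":
    up = read_uptime_seconds()
    affected = booted_before_leap_second(up, datetime.now(timezone.utc))
    print("booted before the leap second" if affected else "booted after the leap second")
```

If the box did boot before the leap second, the commonly circulated form of the fix is to stop the ntp service, run `date -s "$(date)"` as root, and start ntp again; check your distribution's advisory before relying on that.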
