After upgrading to 6.5.0, why is there a runaway s...

rgiles · ‎10-26-2016

After upgrading to 6.5.0 from 6.4.3 on RHEL5 x86_64-bit, we're noticing a single runway splunkd process chewing up an entire CPU. It appears to be doing "nothing", according to strace:

open("/etc/fstab", O_RDONLY) = 5
fstat(5, {st_mode=S_IFREG|0644, st_ize=641, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2aebab606000
lseek(5, 0, SEEK_SET) = 0
read (5, "/dev/sda1 ... (contents of fstab") ... = 643
read (5, "", 4096) = 0
close(5)
munmap(0x2aebab606000, 4096) = 0
stat("/dev", {st_mode=S_IFDIR|0755, st_size=4080, ...}) = 0
stat("/", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
statfs("/dev", {f_type=0x1021994, f_bsize=4096, f_blocks=2054514, f_bfree=2054485, f_bavail=2054485, f_files=2054514, f_ffree=2054227, f_fsid={0, 0}, f_namelen=255, f_frsize=4096}) = 0
statfs("/dev", {f_type=0x1021994, f_bsize=4096, f_blocks=2054514, f_bfree=2054485, f_bavail=2054485, f_files=2054514, f_ffree=2054227, f_fsid={0, 0}, f_namelen=255, f_frsize=4096}) = 0
open("/etc/fstab", O_RDONLY) = 5
fstat(5, {st_mode=S_IFREG|0644, st_ize=641, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2aebab606000
lseek(5, 0, SEEK_SET) = 0
read (5, "/dev/sda1 ... (contents of fstab") ... = 643
read (5, "", 4096) = 0
close(5)
...

...and over and over again. Haven't noticed anything in any of the Splunk logs (/opt/splunk/var/log/splunk/*) spiraling out of control.

Splunk Universal Forwarder 6.5.0 on RHEL5 x86_64 doesn't appear to exhibit this behavior.

Any ideas out there?

nandersson_splu · ‎03-14-2017

If you are seeing the $SPLUNK_HOME/bin/splunkd instrument-resource-usage -p 8089 is taking the CPU.
This maybe is related to SPL-133720 "splunkd instrument-resource-usage process uses one full CPU core after upgrade to 6.5.1 on Centos 5"

This is planned to be fixed in 6.5.3, this is still subject to change by splunk engineering.

chriswilkes33 · ‎11-30-2016

Did you try running strace on the process?

internet_team · ‎03-09-2017

Hello,
we are facing the same issue with
- splunk-6.5.2-67571ef4b87d-linux-2.6-x86_64-manifest
- Red Hat Enterprise Linux Server release 5.11 (Tikanga)
1 cpu is fully loaded with Splunk process, and strace also show that pid loop on reading /etc/fstab

geoeldsul · ‎11-10-2016

having the same issue with identical setup i.e. Splunk 6.5 on a RHEL 5.8 64 Bit System.

geoeldsul · ‎11-30-2016

Mine was not actually a single runaway process. In using top I forgot about the ability to expand it to show each individual CPU. After doing this I found no one CPU being used to its max. The high cpu usage as a value showing the total of multiple CPUs.

After upgrading to 6.5.0, why is there a runaway splunkd process using up an entire CPU?

Can’t make it to .conf25? Join us online!

Can’t Make It to Boston? Stream .conf25 and Learn with Haya Husain

Splunk Lantern’s Guide to The Most Popular .conf25 Sessions

Unlock What’s Next: The Splunk Cloud Platform at .conf25

Are you a member of the Splunk Community?

After upgrading to 6.5.0, why is there a runaway splunkd process using up an entire CPU?

Can’t make it to .conf25? Join us online!

Can’t Make It to Boston? Stream .conf25 and Learn with Haya Husain

Splunk Lantern’s Guide to The Most Popular .conf25 Sessions

Unlock What’s Next: The Splunk Cloud Platform at .conf25