After upgrading from 6.4.3 to 6.5.0 on RHEL 5 x86_64, we're noticing a single runaway splunkd process chewing up an entire CPU. According to strace, it appears to be doing "nothing":
open("/etc/fstab", O_RDONLY) = 5
fstat(5, {st_mode=S_IFREG|0644, st_size=641, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2aebab606000
lseek(5, 0, SEEK_SET) = 0
read(5, "/dev/sda1 ... (contents of fstab)"...) = 643
read (5, "", 4096) = 0
close(5)
munmap(0x2aebab606000, 4096) = 0
stat("/dev", {st_mode=S_IFDIR|0755, st_size=4080, ...}) = 0
stat("/", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
statfs("/dev", {f_type=0x1021994, f_bsize=4096, f_blocks=2054514, f_bfree=2054485, f_bavail=2054485, f_files=2054514, f_ffree=2054227, f_fsid={0, 0}, f_namelen=255, f_frsize=4096}) = 0
statfs("/dev", {f_type=0x1021994, f_bsize=4096, f_blocks=2054514, f_bfree=2054485, f_bavail=2054485, f_files=2054514, f_ffree=2054227, f_fsid={0, 0}, f_namelen=255, f_frsize=4096}) = 0
open("/etc/fstab", O_RDONLY) = 5
fstat(5, {st_mode=S_IFREG|0644, st_size=641, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2aebab606000
lseek(5, 0, SEEK_SET) = 0
read(5, "/dev/sda1 ... (contents of fstab)"...) = 643
read (5, "", 4096) = 0
close(5)
...
...and over and over again. Haven't noticed anything in any of the Splunk logs (/opt/splunk/var/log/splunk/*) spiraling out of control.
Splunk Universal Forwarder 6.5.0 on RHEL5 x86_64 doesn't appear to exhibit this behavior.
Any ideas out there?
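To pin down which splunkd process is the culprit before reaching for strace, something like this helps (a sketch using standard procps `ps`; nothing Splunk-specific assumed):

```shell
# List processes sorted by CPU usage, highest first. The runaway splunkd
# (or one of its child processes) should sit at or near the top.
ps -eo pid,ppid,pcpu,args --sort=-pcpu | head -6
```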
If you are seeing that $SPLUNK_HOME/bin/splunkd instrument-resource-usage -p 8089 is the process taking the CPU, this may be related to SPL-133720, "splunkd instrument-resource-usage process uses one full CPU core after upgrade to 6.5.1 on CentOS 5".
The fix is currently planned for 6.5.3, though this is still subject to change by Splunk engineering.
Did you try running strace on the process?
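If the raw call-by-call output is too noisy, strace's `-c` flag tallies syscall counts and time instead. A sketch (SUSPECT_PID is a placeholder for whatever PID top reports; `timeout` just bounds the sampling window):

```shell
# Summarize syscalls of the suspect process for ~10 seconds.
# -c : print a summary table (count/time per syscall) instead of each call
# -p : attach to an existing PID (detaches cleanly on exit)
timeout 10 strace -c -p "$SUSPECT_PID"
```

A tight open/read/close loop on /etc/fstab like the one above would show up as very high counts for open, read, and statfs.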
Hello,
we are facing the same issue with
- splunk-6.5.2-67571ef4b87d-linux-2.6-x86_64-manifest
- Red Hat Enterprise Linux Server release 5.11 (Tikanga)
One CPU is fully loaded by the splunkd process, and strace likewise shows that PID looping on reads of /etc/fstab.
Having the same issue with an identical setup, i.e. Splunk 6.5 on a RHEL 5.8 64-bit system.
Mine was not actually a single runaway process. When using top, I had forgotten that it can be expanded to show each individual CPU. After doing this, I found that no single CPU was maxed out; the high CPU figure was actually the total across multiple CPUs.
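For reference, the per-CPU view is toggled with the `1` key in interactive top; the same per-core counters can also be read non-interactively from /proc/stat (sketch):

```shell
# Prints the aggregate "cpu" line followed by one line per core
# ("cpu0", "cpu1", ...). If the aggregate looks high but no single
# cpuN line is saturated, the load is spread across cores -- the
# situation described above.
grep '^cpu' /proc/stat
```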