Getting Data In

After upgrading to 6.5.0, why is there a runaway splunkd process using up an entire CPU?

rgiles
Engager

After upgrading from 6.4.3 to 6.5.0 on RHEL5 x86_64, we're noticing a single runaway splunkd process chewing up an entire CPU. It appears to be doing "nothing", according to strace:

open("/etc/fstab", O_RDONLY) = 5
fstat(5, {st_mode=S_IFREG|0644, st_size=641, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2aebab606000
lseek(5, 0, SEEK_SET) = 0
read(5, "/dev/sda1 ... (contents of fstab") ... = 643
read(5, "", 4096) = 0
close(5)
munmap(0x2aebab606000, 4096) = 0
stat("/dev", {st_mode=S_IFDIR|0755, st_size=4080, ...}) = 0
stat("/", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
statfs("/dev", {f_type=0x1021994, f_bsize=4096, f_blocks=2054514, f_bfree=2054485, f_bavail=2054485, f_files=2054514, f_ffree=2054227, f_fsid={0, 0}, f_namelen=255, f_frsize=4096}) = 0
statfs("/dev", {f_type=0x1021994, f_bsize=4096, f_blocks=2054514, f_bfree=2054485, f_bavail=2054485, f_files=2054514, f_ffree=2054227, f_fsid={0, 0}, f_namelen=255, f_frsize=4096}) = 0
open("/etc/fstab", O_RDONLY) = 5
fstat(5, {st_mode=S_IFREG|0644, st_size=641, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2aebab606000
lseek(5, 0, SEEK_SET) = 0
read(5, "/dev/sda1 ... (contents of fstab") ... = 643
read(5, "", 4096) = 0
close(5)
...

...and over and over again. We haven't noticed anything in any of the Splunk logs (/opt/splunk/var/log/splunk/*) spiraling out of control.
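For what it's worth, the syscall pattern above looks like a filesystem-usage poll: read /etc/fstab for the list of mount points, then statfs each one. A minimal Python sketch of that pattern (just an illustration of what the loop appears to be doing, not Splunk's actual code):

```python
import os

def poll_mount_usage(fstab_path="/etc/fstab"):
    """Mimic the observed syscall loop: open/read fstab, then statfs each mount."""
    usage = {}
    with open(fstab_path) as f:  # the open/read/close sequence seen in strace
        for line in f:
            fields = line.split()
            # skip comments, blank lines, and malformed entries
            if not fields or fields[0].startswith("#") or len(fields) < 2:
                continue
            mount_point = fields[1]
            try:
                st = os.statvfs(mount_point)  # the statfs(2) calls seen in strace
            except OSError:
                continue
            usage[mount_point] = {
                "total_bytes": st.f_blocks * st.f_frsize,
                "free_bytes": st.f_bavail * st.f_frsize,
            }
    return usage

if __name__ == "__main__":
    for mount, stats in poll_mount_usage().items():
        print(mount, stats)
```

Run once, this is harmless; the problem in the strace above is that something keeps repeating it in a tight loop.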

Splunk Universal Forwarder 6.5.0 on RHEL5 x86_64 doesn't appear to exhibit this behavior.

Any ideas out there?

nandersson_splu
Splunk Employee

If you are seeing that the $SPLUNK_HOME/bin/splunkd instrument-resource-usage -p 8089 process is the one taking the CPU, this may be related to SPL-133720, "splunkd instrument-resource-usage process uses one full CPU core after upgrade to 6.5.1 on CentOS 5".

A fix is planned for 6.5.3, though this is still subject to change by Splunk engineering.
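To check whether the busy PID really is the instrument-resource-usage child (and not the main splunkd), you can read the per-process CPU counters straight from Linux procfs. A sketch, assuming a Linux host with /proc mounted (field positions per proc(5); the process-name filter "splunkd" is just an example):

```python
import os

def cmdline(pid):
    """Return the argv of a process as a list of strings (Linux procfs)."""
    with open(f"/proc/{pid}/cmdline", "rb") as f:
        return [arg.decode(errors="replace") for arg in f.read().split(b"\0") if arg]

def cpu_ticks(pid):
    """Return cumulative user+system CPU ticks from /proc/<pid>/stat."""
    with open(f"/proc/{pid}/stat") as f:
        stat = f.read()
    # comm is parenthesized and may contain spaces, so split after the last ')'
    fields = stat.rsplit(")", 1)[1].split()
    utime, stime = int(fields[11]), int(fields[12])  # per proc(5)
    return utime + stime

def busiest(name="splunkd"):
    """List (ticks, pid, cmdline) for processes matching `name`, busiest first."""
    hits = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        pid = int(entry)
        try:
            argv = cmdline(pid)
            if argv and name in argv[0]:
                hits.append((cpu_ticks(pid), pid, " ".join(argv)))
        except OSError:
            continue  # process exited while we were scanning
    return sorted(hits, reverse=True)

if __name__ == "__main__":
    for ticks, pid, cmd in busiest():
        print(f"{ticks:>10} ticks  pid {pid}  {cmd}")
```

If the top entry's command line contains instrument-resource-usage, you're likely hitting the bug above.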


chriswilkes33
Explorer

Did you try running strace on the process?


internet_team
Explorer

Hello,
we are facing the same issue with:
- splunk-6.5.2-67571ef4b87d-linux-2.6-x86_64-manifest
- Red Hat Enterprise Linux Server release 5.11 (Tikanga)
One CPU is fully loaded by the Splunk process, and strace also shows that the PID loops on reading /etc/fstab.


geoeldsul
Explorer

Having the same issue with an identical setup, i.e. Splunk 6.5 on a RHEL 5.8 64-bit system.


geoeldsul
Explorer

Mine was not actually a single runaway process. When using top, I forgot about the ability to expand it (by pressing 1) to show each individual CPU. After doing that, I found no single CPU maxed out; the high CPU usage was a total across multiple CPUs.
