Getting Data In

splunkd Process at 100%, parsingQueue at 1000, how do I determine where the issue lies?

stephanbuys
Path Finder

As per another topic on "answers" I executed the following search:

index=_internal source=metrics.log group=queue | timechart perc95(current_size) by name

This confirms that my parsingQueue is almost always at 1000, which would probably explain why one splunkd process is constantly using 100% of one of my four CPUs.

I am also seeing the following sequence of messages in splunkd.log every 300 ms. It might be a coincidence, or it might be the cause.

02-22-2011 19:08:59.772 ERROR TcpInputProc - Received unexpected 68021378 byte message! from hostname=txxxxxxxxxx, ip=10.xxxxxxxx, port=45384

02-22-2011 19:08:59.772 INFO  TcpInputProc - Hostname=txxxxxxxxxxxx closed connection

02-22-2011 19:08:59.855 INFO  TcpInputProc - Connection in cooked mode from txxxxxxxxxxxx

02-22-2011 19:08:59.913 INFO  TcpInputProc - Valid signature found

02-22-2011 19:08:59.913 INFO  TcpInputProc - Connection accepted from txxxxxxxxxxx

Is it possible that some input from a forwarder keeps getting reprocessed?

Any pointers truly welcome.

1 Solution

jrodman
Splunk Employee

The TcpInputProc errors you are seeing indicate mangled or invalid input on a splunktcp input. It might not be Splunk at all, but something else connecting to that socket. If so, you could quiesce the source program or firewall off its access.
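
To see which senders are producing these messages, a search along the following lines may help. It is only a sketch: the rex pattern is written against the error line quoted in the question, and the field names src_host, src_ip and src_port are made up here for illustration.

```
index=_internal source=*splunkd.log* TcpInputProc "Received unexpected"
| rex "hostname=(?<src_host>[^,]+), ip=(?<src_ip>[^,]+), port=(?<src_port>\d+)"
| stats count by src_host, src_ip
```

If a single host accounts for nearly all of the events, that is the machine to check for an old forwarder or a non-Splunk program talking to the splunktcp port.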

Alternatively, it might be a fairly old 4.0.x/3.4.x forwarder that is doing bad things with heartbeats. If it is a Splunk forwarder, make sure it is running a relatively current version.

Splunk using 100% CPU is not so odd if it has work to do. If it is falling behind, it may be useful to look at CPU time by processor in metrics.log to see where most of the time is being spent.
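
A possible starting point for the per-processor view is a search like the one below. It is a sketch rather than a definitive recipe: it assumes metrics.log carries group=pipeline events with processor and cpu_seconds fields, so look at a few raw events first to confirm the field names on your version. The 5-minute span is an arbitrary choice.

```
index=_internal source=*metrics.log group=pipeline
| timechart span=5m sum(cpu_seconds) by processor
```

The processors with the largest share of CPU seconds are the ones to investigate first.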

Indexing can fall behind because of bottlenecks in disk write speed or because of CPU exhaustion. I'd use system tools (top, iostat) to get an idea of which applies, then dig in further along those lines.

This will probably become a support case, but if you want to get started on your own, try links like:

http://www.splunk.com/wiki/Community:PerformanceTroubleshooting

http://www.splunk.com/wiki/Deploy:Troubleshooting

jrodman
Splunk Employee

Glad to hear it is fixed! Sorry, it is tricky to handle investigation cases in Splunk Answers.

stephanbuys
Path Finder

Your hints helped us identify the aggqueue and parsingqueue and the culprits. This answer from Gerald helped us fix it:
http://answers.splunk.com/questions/1142/the-aggqueue-and-parsingqueue-consistently-full-blocked-how...
