
High CPU Usage of one indexer cluster peer node

klowk
Path Finder

Hi community,

One of our indexer cluster peer nodes shows very high CPU consumption by the splunkd process; see the attached screenshot.

[Screenshot: splunkd CPU consumption on the affected peer node]

We have an indexer cluster that consists of two peer nodes and one master server. Only one of the peer nodes has this performance issue.

A few Universal Forwarders and Heavy Forwarders sometimes get the following timeout from the peer node with the high load:

08-06-2020 15:02:47.001 +0200 WARN TcpOutputProc - Cooked connection to ip=<ip> timed out

I have already looked at the Monitoring Console. In the CPU Usage by Process Class graph I see 117 percent for splunkd server.

[Screenshot: Monitoring Console, CPU Usage by Process Class]

But what does splunkd server mean here?
Where can I get further information to solve this issue?

Thanks for your advice.

Kind regards
Kathrin


shivanshu1593
Builder

Looks like your indexers are getting overloaded. You may want to consider adding another set of Indexers to distribute the load evenly.

For the CPU utilisation, check the following:

1. Check the status of the indexers' queues. Most likely they are blocked due to the overload. If so, check the indexing rate and the amount of data coming in, along with the data quality. All of these factors contribute to high CPU utilisation (a quick queue check is sketched after this list).

2. Check the values of nproc and nofile (ulimit -u and ulimit -n) and consider raising them. Please post the output here so that we can have a look as well.

3. Right next to the CPU graph, check the memory graph as well. What does it tell you? Is the memory mainly consumed by searches or by something else? If it's searches, you've got a lot of searches running at the same time, overwhelming your servers.

4. Are the IOPS of your servers sufficient? The incoming load versus how much the server can handle, parse and write to disk also contributes to CPU utilisation.

5. Check whether real-time searches are being run in your environment. If yes, please get rid of them. They never end and take a heavy toll on your indexers (see the sketch after this list for finding them).

6. Under Search > Search - Instance in the Monitoring Console, look at the top 20 memory-consuming searches. Check whether they have been running for a long time and whether they use correct syntax. Anything with index=* and All time as the time range also contributes to this consumption (I ended up disabling real-time searches for everyone and the All time range for everyone except admins). A sample search is included after this list.

7. What are the error messages in splunkd.log? You may want to look at them as well (a quick way to summarise them is shown after this list).
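For point 1, a quick way to spot blocked queues on the busy peer is to search its metrics.log in the _internal index. This is only a sketch; <peer> is a placeholder for the hostname of the affected indexer:

index=_internal host=<peer> source=*metrics.log* group=queue blocked=true | timechart span=10m count by name

If parsingQueue, typingQueue or indexQueue show up constantly, the box is struggling to parse and write data rather than just serving searches.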
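For point 5, one way to check for currently running real-time searches is the search jobs REST endpoint. This is a sketch run from a search head, and it assumes the isRealTimeSearch job property is exposed on your version (real-time search IDs also start with rt_, which makes them easy to spot):

| rest /services/search/jobs | search isRealTimeSearch=1 | table sid author label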
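For point 6, the Monitoring Console panels are built on the _introspection index, so you can also pull the heaviest searches on the affected peer directly. A sketch, assuming introspection data is enabled (it is by default) and again using <peer> as a placeholder:

index=_introspection host=<peer> sourcetype=splunk_resource_usage component=PerProcess data.search_props.sid=* | stats max(data.mem_used) as mem_used_mb max(data.pct_cpu) as pct_cpu by data.search_props.sid data.search_props.user | sort - mem_used_mb | head 20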
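For point 7, you can summarise the errors from that peer without opening the file, since splunkd.log is indexed in _internal:

index=_internal host=<peer> source=*splunkd.log* log_level=ERROR | stats count by component | sort - count

That usually narrows down which subsystem is struggling before you start reading individual messages.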

Hope this helps,

S
