I am frequently receiving high CPU usage alerts, with usage over 99% on all 3 indexers.
I am unable to run any search. It shows "Waiting for queued job to start".
Please help me here. How can I check the issue and resolve it?
First, some quick questions:
If your hardware configuration is adequate and no app has special requirements, check whether some heavy scheduled searches are keeping your system busy.
For example, if you're using real-time searches, each one takes a CPU for as long as it is running, so a few real-time searches with one or two subsearches each can fill your system.
Also, are you using searches with the transaction or join commands? They are very expensive in terms of resources.
You can check the running searches, as @richgalloway said, using the Monitoring Console [Settings -- Resource Usage -- CPU Usage: Instance].
We also had high CPU usage on our indexers in a test environment, and queries took a very long time (45 min).
In the server monitoring we noticed that the CPU ready time was between 1 and 5 ms. The CPU usage of the server was 100%.
The server had 8 vCPUs.
Reducing the number of vCPUs to 2 per server brought the CPU ready time down.
This gave a large performance boost on the indexers: the run time of the same query dropped to 5 min.
So my advice, if you are running in a virtualized environment: play with the number of vCPUs to get the optimal performance. I know they say a minimum of 16 vCPUs, but if you have high ready times it is worth a try.
Look at the number of MHz the server uses, divide it by the clock speed of your cores, and set that number of vCPUs to begin with.
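As a rough illustration of that rule of thumb, here is a minimal Python sketch. The function name and the sample numbers are hypothetical, and rounding up to a whole vCPU (with a floor of 1) is my own assumption, not something stated above.

```python
import math


def initial_vcpu_count(used_mhz: float, core_speed_mhz: float) -> int:
    """Starting vCPU count: observed CPU demand divided by per-core clock speed.

    Rounding up and the minimum of 1 vCPU are assumptions made for this sketch.
    """
    if used_mhz < 0 or core_speed_mhz <= 0:
        raise ValueError("used_mhz must be >= 0 and core_speed_mhz must be > 0")
    return max(1, math.ceil(used_mhz / core_speed_mhz))


if __name__ == "__main__":
    # Hypothetical figures: the VM consumes about 4200 MHz on 2600 MHz cores.
    print(initial_vcpu_count(4200, 2600))
```

Treat the result only as a starting point: watch the CPU ready time after the change and adjust the vCPU count from there.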
@richgalloway Thanks for your reply.
Checked the Monitoring Console (Settings->Monitoring Console->Indexing->Performance->Indexing Performance: Advanced)
But it's not showing any details.
It shows "No results found". Are there any alternative ways to find a solution?
To find the root cause, on your indexers go to Monitoring Console -> Resource Usage -> Resource Usage: Instance.
Under the snapshots section, you'll find two very important graphs: Physical memory usage by process class and CPU usage by process class.
You'd want to look at the CPU usage graphs and see what's hogging the CPU. If it says Search, that means you have to look at people running resource-intensive searches in your environment. For that, you'd want to go to Search -> Search Activity: Instance and check everything out from there.
You can always create alerts off the audit data to find violators who are running long-running searches, too many searches, etc. If it's not a search issue, please contact Splunk support, as it may be a case of a memory leak.
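To give an idea of what such a check could look like outside of Splunk, here is a small Python sketch that flags users from exported audit records. It assumes each record is a dict with `user` and `total_run_time` (seconds) fields; that export shape, the function name, and both thresholds are assumptions for illustration, not a documented interface.

```python
from collections import defaultdict


def find_heavy_users(events, max_runtime_s=600.0, max_searches=50):
    """Flag users with a very long search or an unusually high search count.

    `events` is an iterable of dicts with `user` and `total_run_time` keys
    (an assumed export shape). Thresholds are arbitrary example values.
    """
    counts = defaultdict(int)
    longest = defaultdict(float)
    for event in events:
        user = event["user"]
        counts[user] += 1
        longest[user] = max(longest[user], float(event["total_run_time"]))

    flagged = {}
    for user, count in counts.items():
        reasons = []
        if longest[user] > max_runtime_s:
            reasons.append("long-running search")
        if count > max_searches:
            reasons.append("too many searches")
        if reasons:
            flagged[user] = reasons
    return flagged


if __name__ == "__main__":
    # Hypothetical sample records, not real audit data.
    sample = [
        {"user": "alice", "total_run_time": "2700"},
        {"user": "bob", "total_run_time": "12.5"},
    ]
    print(find_heavy_users(sample))
```

In practice the same logic would normally live in a scheduled Splunk alert rather than in an external script; the sketch only shows the two conditions being checked.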
Here's the article that you can also go through for this:
Hope this helps,
Note: If this helped, please mark it as an accepted answer.