Deployment Architecture

Where to reallocate CPU cores, SH or IDX?

DaClyde
Contributor

Our Splunk environment is chronically under-resourced, so we see a lot of this message:

  • [umechujf,umechujs] Configuration initialization for D:\Splunk\etc took longer than expected (10797ms) when dispatching a search with search ID _MTI4NDg3MjQ0MDExNzAwNUBtaWw_MTI4NDg3MjQ0MDExNzAwNUBtaWw__t2monitor__ErrorCount_1694617402.9293. This usually indicates problems with underlying storage performance.
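To gauge how often these delays occur and how long they last, the warning can be parsed into its parts. The following is a minimal sketch that assumes the message always follows the wording quoted above; the function name `parse_init_delay` and the regex group names are illustrative and not part of Splunk.

```python
import re

# Pattern for the warning text quoted above. The group names (path, ms, sid)
# are assumptions chosen for this sketch.
WARNING_RE = re.compile(
    r"Configuration initialization for (?P<path>.+?) took longer than expected "
    r"\((?P<ms>\d+)ms\) when dispatching a search with search ID "
    r"(?P<sid>\S+?)\.(?:\s|$)"
)


def parse_init_delay(message):
    """Return (path, delay_seconds, search_id), or None if the text does not match."""
    match = WARNING_RE.search(message)
    if match is None:
        return None
    delay_seconds = int(match.group("ms")) / 1000.0
    return match.group("path"), delay_seconds, match.group("sid")
```

Applied to the message above, this reports a delay of 10.797 seconds for the `ErrorCount` search dispatched from `D:\Splunk\etc`.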

It is our understanding that the core issue here is not so much storage, but processor availability.  Basically Splunk had to wait about 10.8 seconds for the specified pool of processors to be available before it could run the search.  We are running a single SH and single IDX.  Both are configured for 10 CPU cores.  Also, this is a VM environment, so those are shared resources.  I know, basically all of the things Splunk advises against (did I mention also running Windows?).  No, we can't address the overall resource situation right now.

Somewhere the idea came up that reducing the quantity of cores might help improve processor availability, so if Splunk were only waiting for 4 or 8 cores, it would at least get to the point of beginning the search with less initial delay as it would have to wait for a smaller pool of cores to be available first.

So our question is, which server is most responsible for the delay, the SH or the IDX?  Which would be the better candidate for reducing the number of available cores?

 

1 Solution

isoutamo
SplunkTrust

Hi

Here are a couple of older answers worth considering when running Splunk on VMware.

You must check IOPS at both the ESXi host level and the VM level. Also ensure that the vCPU count is less than the core count of the smallest socket.
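The vCPU rule above can be written as a one-line check. This is a minimal sketch: the function name and the host socket sizes in the usage note are hypothetical, since the thread does not state the ESXi host's socket layout.

```python
def vcpus_below_smallest_socket(vcpu_count, cores_per_socket):
    """Return True if the VM's vCPU count is less than the core count of
    the smallest physical socket on the host.

    cores_per_socket is a list with one entry per host socket (hypothetical
    input; read the real values from the ESXi host's hardware summary).
    """
    if not cores_per_socket:
        raise ValueError("need at least one socket")
    return vcpu_count < min(cores_per_socket)
```

For the 10-vCPU servers described in the question, `vcpus_below_smallest_socket(10, [8, 8])` returns `False` on an assumed host with two 8-core sockets, while `vcpus_below_smallest_socket(10, [16, 16])` returns `True`.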

r. Ismo

DaClyde
Contributor

Thank you, I will have our team look into these and see if there is anything we can salvage of our current system.  I feel like Goofy from Disney's Jack & the Beanstalk, slicing bread so thin it is transparent.

0 Karma

gcusello
SplunkTrust

Hi @DaClyde ,

I love that cartoon!

Good for you, see you next time!

Let us know if we can help further; otherwise, please accept one of the answers for the benefit of other Community members.

Ciao and happy splunking

Giuseppe

P.S.: Karma Points are appreciated by all the contributors 😉

0 Karma

gcusello
SplunkTrust

Hi @DaClyde ,

The required reference hardware is 12 CPUs and 12 GB RAM for each of the two servers if you don't have a Premium App (ES or ITSI), though the actual need depends on the number of users and scheduled searches. These resources must be dedicated, not shared.

In addition, the bottleneck of any Splunk infrastructure is storage performance: Splunk requires at least 800 IOPS.
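As a rough sketch, a server can be compared against the figures mentioned above. The minimums are taken from this reply as written, not verified against Splunk's documentation, and the `REFERENCE` table and `shortfalls` function are illustrative names.

```python
# Minimums as stated in this thread (an assumption for this sketch, not an
# authoritative copy of Splunk's reference hardware specification).
REFERENCE = {"cpu_cores": 12, "ram_gb": 12, "iops": 800}


def shortfalls(server):
    """Return {resource: (actual, required)} for every resource that is
    below the reference value. Missing keys are treated as 0."""
    return {
        name: (server.get(name, 0), required)
        for name, required in REFERENCE.items()
        if server.get(name, 0) < required
    }
```

With the 10 cores from the question and hypothetical RAM and IOPS values, `shortfalls({"cpu_cores": 10, "ram_gb": 16, "iops": 1200})` returns `{"cpu_cores": (10, 12)}`. Note that this compares raw counts only; it does not capture the requirement that the resources be dedicated rather than shared.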

You can analyze indexing and search performance using the Monitoring Console app.

You could then convert any real-time searches into scheduled searches.

Ciao.

Giuseppe
