Solved: How do I tell Splunk to use more CPU cores?

kamermans · ‎07-15-2014

I have Splunk 6 Enterprise installed on a system with 2x 10-core 3GHz Xeons, 128GB RAM and a 6x SSD RAID-10. When I run searches I notice that Splunk seems to use no more than 6 CPU cores despite having 40 CPU cores in total. This is particularly troubling when I do heavy CPU-bound operations like rex and iplocation. I've noticed this even when searching across a large time range, like 3 months. During the search, I/O is very low (relative to the SSD RAID) with no IOWAIT, and RAM usage is low as well. The only bottleneck I see is that the cores in use are at 100%.

Is there some way to tell Splunk to use more CPU cores? My understanding of the default is that it will already be quite aggressive in multithreading, but perhaps there is some hard-coded upper limit?

Here's the view from top during an iplocation-bound query:

top - 23:38:07 up  7:48,  4 users,  load average: 6.36, 6.44, 5.63
Tasks: 425 total,   1 running, 424 sleeping,   0 stopped,   0 zombie
%Cpu(s): 14.1 us,  2.5 sy,  0.0 ni, 83.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:  13198851+total, 12997244+used,  2016076 free,    82992 buffers
KiB Swap: 13416960+total,   105568 used, 13406403+free. 12418592+cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 4836 root      20   0 1334636 637356  15148 S 556.3  0.5 874:09.35 splunkd
38847 root      20   0 2018372 567512  37936 S 101.0  0.4  25:32.94 splunkd
 5620 root      20   0 1904932 152960   4888 S   6.6  0.1   8:03.83 python

Here's the actual query I'm testing:

* | iplocation client_ip | geostats count

A system load of 6.36 might normally be a good indication of a taxing job in progress, but with 40 CPUs, that's only 15.9% CPU usage for this machine.
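For anyone checking the arithmetic, here is a minimal Python sketch of that calculation. It treats load average as a rough proxy for busy CPUs, which is only an approximation, and the function name `load_fraction` is just an illustrative helper, not anything from Splunk.

```python
def load_fraction(load_average: float, logical_cpus: int) -> float:
    """Return the load average as a fraction of the available logical CPUs."""
    if logical_cpus <= 0:
        raise ValueError("logical_cpus must be positive")
    return load_average / logical_cpus


# Figures taken from the top output above: load 6.36 on 40 logical CPUs.
fraction = load_fraction(6.36, 40)
print(f"{fraction:.1%}")  # 15.9%
```

On a live Linux box the same inputs could come from `os.getloadavg()[0]` and `os.cpu_count()`.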

Here's a view of htop showing the cores in use and what appears to be 6 splunkd threads (I suppose that's the problem):

1  [|||||||         30.1%]    11 [||||||||        31.0%]     21 [                 0.0%]    31 [                 0.0%]
2  [|                0.6%]    12 [                 0.0%]     22 [                 0.0%]    32 [                 0.0%]
3  [||||||||||||||||92.2%]    13 [||||||||||||||||92.9%]     23 [||               2.6%]    33 [                 0.0%]
4  [                 0.0%]    14 [                 0.0%]     24 [                 0.0%]    34 [||               1.3%]
5  [||||||||||||||||94.1%]    15 [||||||||||||||||94.1%]     25 [                 0.0%]    35 [                 0.0%]
6  [                 0.0%]    16 [                 0.0%]     26 [                 0.0%]    36 [                 0.0%]
7  [|                1.3%]    17 [||||||||||||||| 64.9%]     27 [                 0.0%]    37 [                 0.0%]
8  [                 0.0%]    18 [                 0.0%]     28 [                 0.0%]    38 [                 0.0%]
9  [                 0.0%]    19 [||||||||||||||| 66.9%]     29 [|                0.6%]    39 [                 0.0%]
10 [                 0.0%]    20 [|||||||||||||||100.0%]     30 [                 0.0%]    40 [                 0.0%]
Mem[||||||||||||||||||||||||||||||||||||||5654/128895MB]     Tasks: 43, 84 thr; 8 running
Swp[|                                      103/131024MB]     Load average: 6.30 6.40 5.89
                                                             Uptime: 07:55:06
├─ splunkd -p 8089 restart
├─ splunkd -p 8089 restart
└─ [splunkd pid=4836] splunkd -p 8089 restart [process-runner]
  ├─ [splunkd pid=4836] search --id=1405480372.228 --maxbuckets=0 --ttl=600 --maxout=500000 --maxtime=8640000 --lookups=
  │  ├─ [splunkd pid=4836] search --id=1405480372.228 --maxbuckets=0 --ttl=600 --maxout=500000 --maxtime=8640000 --looku
  │  ├─ [splunkd pid=4836] search --id=1405480372.228 --maxbuckets=0 --ttl=600 --maxout=500000 --maxtime=8640000 --looku
  │  ├─ [splunkd pid=4836] search --id=1405480372.228 --maxbuckets=0 --ttl=600 --maxout=500000 --maxtime=8640000 --looku
  │  ├─ [splunkd pid=4836] search --id=1405480372.228 --maxbuckets=0 --ttl=600 --maxout=500000 --maxtime=8640000 --looku
  │  ├─ [splunkd pid=4836] search --id=1405480372.228 --maxbuckets=0 --ttl=600 --maxout=500000 --maxtime=8640000 --looku
  │  └─ [splunkd pid=4836] search --id=1405480372.228 --maxbuckets=0 --ttl=600 --maxout=500000 --maxtime=8640000 --looku
  └─ /opt/splunk/bin/splunkd instrument-resource-usage

jtacy · ‎07-21-2014

Lucas K mentioned that you have headroom in your vertical search capacity and there's a way to take advantage of that. You have enough cores and a downright enviable number of IOPS to run multiple Splunk instances with. You should see near-linear performance gains as you add indexer instances to this host and this will make it even easier to transition to a traditional scale-out architecture later on.

I'd probably set up a search head and two indexers in a distributed search configuration to start; I've personally run multiple IPs on the same host to do this but you can just assign separate ports for management/receiver as well. Maybe even run a separate heavy forwarder instance to forward to the indexers while you're at it. Distributed architecture seems like a core concept of Splunk but I'm not aware of any "rules" against using the tgz installer and building vertically when you have that kind of hardware. Have fun!!

adepasquale · ‎01-31-2017

Could you elaborate on this? Are you suggesting running multiple instances on the same machine? With two indexers, do you have to choose which indexer each forwarder goes to? or do you give the forwarder both?

jtacy · ‎01-31-2017

Yes, I was suggesting running multiple instances on the same machine and giving the forwarder both indexers to connect to. However, times have changed with later versions of Splunk and I think this is now considered an anti-pattern because parallelization support is built in. Consider reviewing the .conf 2016 presentation "Harnessing Performance and Scalability with Parallelization".

Slides and recording: https://conf.splunk.com/sessions/2016-sessions.html
Documentation: https://docs.splunk.com/Documentation/Splunk/6.5.2/Capacity/Parallelization

adepasquale · ‎01-31-2017

Thank you for pulling me out of that rabbit hole.

kamermans · ‎07-22-2014

Thanks - this is a great suggestion 🙂

martin_mueller · ‎07-17-2014

There are similar configs in limits.conf, e.g. the maximum number of concurrent search jobs, but they factor in the number of available cores.

Things might be faster if you distributed your search over several search peers... depending on the type of search, amount of data, yada yada.
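As a rough illustration of how such a limit factors in the core count, the sketch below assumes the formula given in the limits.conf spec as I recall it (max_searches_per_cpu x number of CPUs + base_max_searches) and assumes defaults of 1 and 6. Treat both the formula and the defaults as assumptions and check the limits.conf.spec for your version. Note that this caps how many searches run at once; it does not make a single search use more cores.

```python
def max_concurrent_historical_searches(
    cpu_count: int,
    max_searches_per_cpu: int = 1,  # assumed default, verify in limits.conf.spec
    base_max_searches: int = 6,     # assumed default, verify in limits.conf.spec
) -> int:
    """Estimate the concurrent historical search limit from the CPU count."""
    if cpu_count <= 0:
        raise ValueError("cpu_count must be positive")
    return max_searches_per_cpu * cpu_count + base_max_searches


# With the 40 logical CPUs reported on this host:
print(max_concurrent_historical_searches(40))  # 46
```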

kamermans · ‎07-17-2014

That is a good point; I suppose 7 out of 20 isn't as bad. It's just that, as a developer, I know someone put something like "max_search_threads=6" somewhere, and if I increased that number it would certainly be faster.

martin_mueller · ‎07-16-2014

While your hyperthreading Xeons may indicate 40 CPUs, you only have 20 cores. 7 out of 20 still has room to grow, but it doesn't sound as bad as 7 out of 40 🙂
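If you want to confirm the logical versus physical count on a Linux host, a small sketch like the one below can parse the contents of /proc/cpuinfo. It assumes the usual x86 layout, where each processor block carries `physical id` and `core id` fields; `count_cpus` is a hypothetical helper name.

```python
def count_cpus(cpuinfo_text: str) -> tuple:
    """Return (logical_cpus, physical_cores) parsed from /proc/cpuinfo text."""
    logical = 0
    cores = set()
    # Each logical CPU is described by one blank-line-separated block.
    for block in cpuinfo_text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        if "processor" in fields:
            logical += 1
            # A physical core is identified by its socket and core id pair.
            if "physical id" in fields and "core id" in fields:
                cores.add((fields["physical id"], fields["core id"]))
    return logical, len(cores)


# Usage on a Linux host:
#   with open("/proc/cpuinfo") as f:
#       print(count_cpus(f.read()))
```

On a hyperthreaded machine the first number should be twice the second.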

Lucas_K · ‎07-15-2014

...continued
This is the better situation to be in. The reverse is much harder to fix. Even when load balancing using SNMP metrics, you can really struggle to get optimal search concurrency across separate search heads (I'm experiencing this issue right now).

The performance issue you are talking about in regards to specific commands like iplocation is that you are (as LGuinn already said) allocated 1 core per search. Unfortunately that is just how it works.

Lucas_K · ‎07-15-2014

Splunk doesn't work in the way you'd expect it to in terms of CPU utilisation.

It won't go and use 40 CPUs if available (I wish it would); rather, it allows a specific number of searches to run.

With how your server is provisioned, what you have is vertical search capacity. That is, you can run more searches (higher search concurrency) and have more logged-in people before you run into search contention/queuing.

lguinn2 · ‎07-15-2014

I believe that Splunk will use one core for each search, one core for each logged-in user, and some number of cores (I am no longer sure of the max) for indexing.

Is this machine a search head or an indexer or both?

Finally, this is why you should follow the Splunk machine sizing recommendations - to get more value for your money. I would have probably purchased several commodity-sized servers instead of one big one...

Hardware Capacity Planning

ryanf · ‎06-09-2025

"...Splunk will use one core for each search"

Yes, by default Splunk will use 1 core for each search, but can we adjust this limit so that, say, one search can use 2 or 3 cores?

PickleRick · ‎06-09-2025

Oooof, that's a golden shovel for you, Sir. 😉

But to the point - no. It's how Splunk works. It will allocate a single CPU for each search on a SH it's being run from as well as on each indexer taking part in the search. So the way to "add cores to the search" is to grow your env horizontally in the indexer layer _and_ write your searches so that they use that layer properly.

kamermans · ‎07-15-2014

The system is the search head and the indexer. Currently we are more constrained on physical space and power than on cost, plus our volume is low (under 5GB/day), so I figured a single system with super-fast CPU and I/O would be preferred in my case.

I took the Hardware Capacity Planning Questionnaire and each answer was "no", which indicates that a single machine is the recommended configuration in my case, and I basically took the recommended hardware specs and bumped them up a bit.

I'll keep your advice in mind, but I really don't want a Hadoop-style cluster 😞
