I have a 16 core server (HP DL580) with 32GB MEM and 2TB SAS Drives (RAID 10) capable of 800 IO/sec. I'm indexing about 75GB/day and supporting about 4 concurrent searches.
In a previous post, it seems that indexing maxes out at 3 or 4 cores: http://answers.splunk.com/questions/1874/how-to-take-advantage-of-a-multi-core-indexer
So I have 12 cores available for search, but only 4 concurrent searches.
Should I run 2 instances of Splunk on the box?
Generally, no. The server is more than capable of handling both your search load and your indexing load. Adding another instance might help you index more data, but if the current instance is already keeping up with everything you send it, there's really no benefit. Adding instances on a single machine, keeping hardware constant, will not give you more capacity for search, as a single instance is capable of consuming all machine resources for searches.
A server like this can be expected to comfortably sustain indexing rates of over 300 GB/day on a single instance, assuming little or no search activity.
Our experience has been that search activity is likely to grow well beyond what you are seeing now, once more people in your organization become aware of what they can do with this data in Splunk. We think it is unlikely that your input data will more than quadruple without a corresponding increase in search activity. If that growth does occur, you are going to need to accommodate it with an additional indexer. However, if your search activity will never be higher than it is today, and your indexing volume may increase more than four or five times, you could get more from this hardware by running multiple instances of Splunk on the machine. (Note that this cannot be done under Windows.) You should nevertheless be aware that adding another indexer will bring many other performance benefits.
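To make the headroom arithmetic concrete, here is a minimal sketch using the figures from this thread: 75 GB/day indexed today against roughly 300 GB/day sustained on one instance. The function name is made up for illustration, and the 300 GB/day figure assumes little or no search activity, so treat the result as an upper bound.

```python
def indexing_headroom(current_gb_per_day, sustained_gb_per_day):
    """Return how many times the current indexing volume fits into the
    sustained single-instance rate (both figures in GB/day)."""
    return sustained_gb_per_day / current_gb_per_day


# Figures quoted in this thread; the sustained rate assumes almost no search load.
factor = indexing_headroom(current_gb_per_day=75, sustained_gb_per_day=300)
print(f"Single-instance headroom: about {factor:.1f}x current volume")
```

With these numbers the factor is 4, which is where the "quadrupling" threshold comes from.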
In future, as machines with more than 24 cores become more widely available and used, it may become more likely that multiple instances will make more sense. By that time, however, there may have been changes made to the Splunk indexing architecture to render this moot.
I disagree. If you benchmark search performance with one instance versus two on the same box, two instances will consistently perform better than one for almost all tasks. In benchmarking multiple-instance performance, I have found increased performance up to 4 instances, after which returns begin to diminish.
I'll note that it's not really a goal to use up more cores. Usually, you want to make sure that your workload requirements are being met with the fewest resources possible. If they are, it should not be a concern that the machine is not using more CPU. If a machine is not meeting workload requirements, then it should first be ascertained which resource is short. If it is in fact CPU processing ability that is lacking, then we can worry about trying to provide more CPU to it, and then show concern if it is not using it.
When benchmarking search performance, it seems that one instance can rarely if ever utilize all resources on one box, assuming you are concerned first and foremost about the speed of any one search. This seems to be due to inefficiencies in parallel CPU and I/O operations. As I mentioned in my comment above, on 16 core boxes per-search performance seems to increase until 4 instances, after which it tends to decrease.
Looks like we have some conflicting opinions. I was recently lucky enough to get a 24 core server (HP DL585). What should I do with that?
Benchmark various densities of searches at varying levels of concurrency on a large data set, using one instance and then again with several.
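A rough sketch of what such a benchmark harness could look like, in Python. Everything here is an assumption for illustration: `run_search` is a hypothetical callable that you supply yourself, which submits one search to whichever instance you are testing and blocks until it finishes. Nothing below calls a real Splunk API.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def time_call(run_search, query):
    """Return the wall-clock seconds that one search takes."""
    start = time.perf_counter()
    run_search(query)
    return time.perf_counter() - start


def benchmark(run_search, queries, concurrency_levels, repeats=3):
    """Measure per-search latency at each concurrency level.

    run_search is a hypothetical, user-supplied callable that blocks
    until a single search completes.
    """
    results = {}
    for level in concurrency_levels:
        timings = []
        with ThreadPoolExecutor(max_workers=level) as pool:
            for _ in range(repeats):
                # Launch `level` searches at once, cycling through the query mix.
                batch = [queries[i % len(queries)] for i in range(level)]
                timings.extend(pool.map(lambda q: time_call(run_search, q), batch))
        results[level] = {
            "mean_s": statistics.mean(timings),
            "max_s": max(timings),
            "searches": len(timings),
        }
    return results
```

Run it once against the single-instance setup and once per multi-instance layout, with the same queries and the same data, then compare the per-search means at each concurrency level (1, 2, 4, 8 and so on). Watching I/O and CPU on the box during each run will show which resource runs short first.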