Getting Data In

How to improve indexing throughput when the replication queue is full?

hrawat
Splunk Employee


Here are the configs for on-prem customers who are willing to apply them to avoid adding more hardware cost.
In 9.4.0 and above, most of the indexing configs are automated, which is why they are dropped from the suggested list for 9.4.0.

Note: This assumes the replication queue is full on most of the indexers and, as a result, the indexing pipeline is also full, while the indexers still have plenty of idle CPU and I/O is not an issue.


On-prem Splunk version 9.4.0 and above

indexes.conf
[default]
maxMemMB = 100

server.conf
[queue]
autoAdjustQueue = true

Splunk version 9.1 to 9.3.x

indexes.conf
[default]
maxMemMB = 100
maxConcurrentOptimizes = 2
maxRunningProcessGroups = 32
processTrackerServiceInterval = 0

server.conf
[general]
parallelIngestionPipelines = 4
[queue=indexQueue]
maxSize = 500MB
[queue=parsingQueue]
maxSize = 500MB
[queue=httpInputQ]
maxSize = 500MB

maxMemMB: minimizes the creation of tsidx files as much as possible, at the cost of higher memory usage by the mothership (the main splunkd process).
maxConcurrentOptimizes: on the indexing (source) side it is internally 1 no matter what the setting is. On the replication target side, however, launching more splunk-optimize processes means pausing the receiver until each splunk-optimize process is launched. Reducing the value keeps the receiver doing indexing work rather than launching splunk-optimize processes. With 9.4.0, both the source (index processor) and the target (replication-in thread) internally auto-adjust it to 1.
maxRunningProcessGroups: allows more splunk-optimize processes to run concurrently. With 9.4.0, this is automatic.
processTrackerServiceInterval: runs splunk-optimize processes as soon as possible. With 9.4.0, you don't have to change it.
parallelIngestionPipelines: provides more receivers on the target side. With 9.4.0, you can enable auto-scaling of pipelines.
maxSize: keeps a huge batch ingestion by an HEC client from blocking the queues and receiving a 503. With 9.4.0 and autoAdjustQueue set to true, the queue is no longer a fixed size.


gjanders
SplunkTrust

A quick clarification on the 9.4.0 settings for server.conf. You mentioned:


On-prem Splunk version 9.4.0 and above

Indexes.conf
[default]

maxMemMB=100

Server.conf
[general]
autoAdjustQueue=true

The spec file for server.conf appears to show autoAdjustQueue under the [queue] stanza. Should it be under [queue]?

With the indexes.conf setting, does that number multiply out based on the number of indexes configured?
Should I be more cautious when having 1000 indexes configured vs having 100 indexes configured?
I'm unsure when the "max memory" usage might occur from that setting...

Thanks


hrawat
Splunk Employee

Yes, `maxMemMB=100` is applied to each index. You can set this config on high-volume indexes only instead of globally.
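
A minimal sketch of the per-index approach, assuming a hypothetical high-volume index named `web_proxy` (the name is only an illustration, not something from this thread). Instead of setting the value under [default], the indexes.conf fragment might look like this:

```ini
# indexes.conf
# Hypothetical index name, used for illustration only.
# Add the setting to the index's existing stanza; its path settings stay as they are.
[web_proxy]
maxMemMB = 100
```

Indexes without their own override keep whatever value they inherit from [default], so the extra memory use is limited to the indexes where the setting is applied.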


hrawat
Splunk Employee

Thanks for pointing out the mistake in the stanza. Yes, it has to be [queue].


kiran_panchavat
SplunkTrust

@hrawat 

Further Insights on the Suggestion Shared by @gcusello 

  • It is recommended that indexers are provisioned with 12 to 48 CPU cores, each running at 2 GHz or higher, to ensure optimal performance.

  • The disk subsystem should support at least 800 IOPS, ideally using SSDs for hot and warm buckets to handle the indexing workload efficiently.

https://docs.splunk.com/Documentation/Splunk/latest/Capacity/Referencehardware 

  • For environments still using traditional hard drives, prioritize models with higher rotational speeds, and lower average latency and seek times to maximize IOPS.

  • For further insights, refer to this guide on Analyzing I/O Performance in Linux.

  • Note that insufficient disk I/O is one of the most common performance bottlenecks in Splunk deployments. It is crucial to thoroughly review disk subsystem requirements during hardware planning.

  • If the indexer's CPU resources exceed those of the standard reference architecture, it may be beneficial to tune parallelization settings to further enhance performance for specific workloads.
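
As a hedged illustration of the parallelization tuning mentioned in the last point: the setting discussed earlier in this thread, parallelIngestionPipelines, belongs in the [general] stanza of server.conf. The value below is an assumption for illustration only, not a recommendation for any particular hardware:

```ini
# server.conf
[general]
# Illustrative value only. Each additional pipeline set consumes extra CPU and memory,
# so raise it only when the indexer has idle cores and I/O headroom.
parallelIngestionPipelines = 2
```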


gcusello
SplunkTrust

Hi @hrawat ,

Two quick questions:

  • How many CPUs do your indexers have?
  • What is the storage throughput on your indexers? In other words, do you have iowait or delayed-search issues?

The problem is probably related to insufficient processing capacity, so the easiest solution is to add some CPUs.

If instead the problem is the second one, the only solution is to replace the storage that doesn't provide sufficient IOPS: Splunk requires at least 800 IOPS.

Ciao.

Giuseppe

hrawat
Splunk Employee

Added a note to the original post that the indexers have no I/O issues and plenty of idle CPU.
