parallelIngestionPipelines = 2 is considered the optimal setting for most deployments. Raising it above 2 is technically feasible but generally not advised unless you proceed with caution and have confirmed that your infrastructure can support the additional load.
I tested with 4 (no higher) but experienced instability, especially during bursty loads and when additional apps were introduced. For this reason, I'm keeping the setting at 2, which has proven more stable in my environment.
In theory, setting it to 4 lets you ingest more data in parallel, but it carries a high risk of OOM and crashes. Splunk strongly recommends consulting Professional Services (PS) before going beyond 2.
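For reference, a minimal sketch of where this setting lives, assuming a default layout in which it is placed in server.conf (for example under $SPLUNK_HOME/etc/system/local/) and splunkd is restarted afterwards. The value shown is the one recommended above, not a tuned figure for any particular environment:

```
# server.conf
[general]
# Number of parallel ingestion pipeline sets; the default is 1
parallelIngestionPipelines = 2
```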
Regards,
Prewin
Splunk Enthusiast | Always happy to help! If this answer helped you, please consider marking it as the solution or giving a Karma. Thanks!
As usual - "it depends".
During normal indexing a single pipeline engages 4-6 CPUs. So if you have a host that does nothing but ingestion processing (a HF), you can raise the number of pipelines relatively harmlessly, and performance scales quite well (maybe not strictly linearly, but not much worse).
But on an indexer you have to remember two things:
1) You're still limited by the fact that all of that has to be written to disk at the end of the pipeline (so the performance improvement will be significantly less than linear).
2) Indexers typically spend most of their time searching, after all. So tying up CPUs with ingest processing leaves far fewer resources for searching. That might lead to long-running, delayed, or skipped searches.
So on a modern, reasonably sized box with a typical use case, 1 or 2 parallel ingestion pipelines do indeed seem to be the optimal setting. With a slightly atypical architecture (for example, a separate HF layer that does the heavy lifting while the indexers only receive the parsed data and write it to disk), you could consider raising the parameter further.
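To make the CPU trade-off above concrete, here is a rough back-of-the-envelope sketch in Python. It only uses the 4-6 CPUs per pipeline figure quoted above; the function name, the 24-core example host, and the idea of simply subtracting ingest cores from the total are illustrative assumptions, not anything Splunk provides or measures this way:

```python
def cores_left_for_search(total_cores, pipelines, cores_per_pipeline=(4, 6)):
    """Estimate how many cores remain for searching on an indexer.

    Hypothetical helper: assumes each ingestion pipeline engages between
    cores_per_pipeline[0] and cores_per_pipeline[1] cores under load.
    Returns (worst_case, best_case), each floored at zero.
    """
    low, high = cores_per_pipeline
    best_case = total_cores - pipelines * low    # pipelines use the fewest cores
    worst_case = total_cores - pipelines * high  # pipelines use the most cores
    return max(worst_case, 0), max(best_case, 0)


if __name__ == "__main__":
    # Example host size (24 cores) is an assumption for illustration only.
    for n in (1, 2, 4):
        worst, best = cores_left_for_search(24, n)
        print(f"{n} pipeline(s): {worst}-{best} cores left for searches")
```

On such a hypothetical 24-core indexer, going from 2 to 4 pipelines can leave no spare cores at all in the worst case, which is consistent with the slower or skipped searches described above. Disk write throughput is not modelled here.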
CPU bottleneck.
Hi @verbal_666 ,
I tried parallelIngestionPipelines = 4 but went back to 2: indexing was better than with 2, but searches became slower.
Ciao.
Giuseppe
Perfect 👍👍👍
That's what I wanted to know 👏👏👍
Many thanks 👍