Getting Data In

How many pipelines should I use on a forwarder?

davidpaper
Contributor

I'm trying to figure out how many pipelines to set on my forwarders to maximize the following:

  • Throughput
  • Data distribution to my indexers
  • Resource utilization

What are the things I need to be aware of when adding more pipelines? The default is 1.

1 Solution

davidpaper
Contributor

In discussions with Architecture gurus at Splunk, including @jkerai, there are some general guidelines to answer the question.

Pipeline count

While you can technically run many pipelines (we recently tested 12), we saw diminishing returns beyond 3. The main challenge is keeping all UF (universal forwarder) pipelines balanced and fed with data. In most cases, because of the way UFs get their data, not all pipelines are busy, so aggregate throughput from a UF barely gets past 30-40MB/s.

If you are reading data from disk by monitoring files, you will need a strategy that ensures that there are enough files to be read in parallel by different UF pipelines.

If data is coming in over raw TCP, there should be a good number of connections coming into the UF so that they get spread evenly across the different pipelines.

In the majority of cases, I would guess that data is coming from a few big files that are constantly being written to. This leads to high utilization on a few pipelines and very low utilization on the remaining ones.

Throughput & Data distribution

Each pipeline makes its own connection to one of the entries listed in outputs.conf, so 3 pipelines will usually connect to 3 different entries listed in outputs. Pipelines work independently, roughly equivalent to having multiple UFs installed and running concurrently. Each pipeline picks its next hop at random, so nothing guarantees that they land on different indexers, but statistically they should.
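To put a rough number on "statistically they should be talking to different indexers", here is a minimal sketch. It assumes each pipeline picks one entry uniformly and independently at random, and it looks at a single moment in time; the real selection logic may differ, and the function name is made up for illustration.

```python
from math import perm


def prob_all_distinct(indexers: int, pipelines: int) -> float:
    """Chance that every pipeline lands on a different indexer.

    Assumption (not from the post): each pipeline chooses uniformly and
    independently among the entries in outputs.conf.
    """
    if pipelines > indexers:
        # More pipelines than indexers: at least two must share one.
        return 0.0
    # Ordered distinct choices divided by all possible choices.
    return perm(indexers, pipelines) / indexers ** pipelines


if __name__ == "__main__":
    for n in (3, 6, 12, 24):
        print(f"{n:>2} indexers, 3 pipelines: {prob_all_distinct(n, 3):.2f}")
```

Under that assumption, the more entries there are in outputs.conf, the more likely it is that 3 pipelines end up on 3 different indexers.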

Each pipeline gets its own allocation of the maxKBps setting from the [thruput] stanza of limits.conf. If this is set to 0 (unlimited), every pipeline pushes as much data as it can send and the remote side can accept. Example: if maxKBps=5000, each pipeline gets a maximum of 5 MBytes/sec (yes, bytes, not bits) of throughput. This limit is enforced before the data goes through the outbound compression routines, so the amount of data that appears on the wire should be considerably smaller (roughly 90% compression on average). So, 3 pipelines at 5MB/s = 15MB/s raw * 0.1 (compression ratio) = 1.5MB/s, or 12Mb/s, on the wire.
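The arithmetic above can be sketched as a small helper for trying other values. The function name is made up for illustration, and the 0.1 compression ratio is the rough average quoted above, not a measured value for your data.

```python
def wire_throughput(pipelines: int, max_kbps: int, compression_ratio: float = 0.1):
    """Return (raw MB/s, on-wire MB/s, on-wire Mb/s) for a forwarder.

    max_kbps is the per-pipeline maxKBps value. A value of 0 means
    unlimited, which this sketch does not model.
    """
    if max_kbps <= 0:
        raise ValueError("maxKBps=0 is unlimited; no ceiling to compute")
    raw_mb_per_s = pipelines * max_kbps / 1000        # each pipeline gets its own cap
    wire_mb_per_s = raw_mb_per_s * compression_ratio  # cap applies before compression
    wire_mbit_per_s = wire_mb_per_s * 8               # bytes to bits
    return raw_mb_per_s, wire_mb_per_s, wire_mbit_per_s


if __name__ == "__main__":
    raw, wire_mb, wire_mbit = wire_throughput(pipelines=3, max_kbps=5000)
    print(f"raw {raw:.1f} MB/s, wire {wire_mb:.1f} MB/s ({wire_mbit:.0f} Mb/s)")
```

With 3 pipelines and maxKBps=5000 this reproduces the 15MB/s raw and roughly 12Mb/s on-the-wire figures from the example.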

Adding extra pipelines to your forwarder can help maintain a 2:1 forwarder:indexer pipeline ratio, which helps distribute data more evenly across indexers. The higher the ratio, the more evenly data is distributed across the indexing tier. This matters for search performance (you want all indexers participating in all searches whenever possible) and for balanced disk usage.

Resource utilization

Each enabled pipeline consumes memory, CPU, disk, and network resources. The one that most often becomes a problem is CPU. A UF pipeline can consume 2 full cores, and a HWF (heavy forwarder) pipeline can consume 4. So, 3 UF pipelines can chew up 6 cores on the host running the forwarder, and 3 pipelines on a HWF could use up to 12 cores.
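As a quick sizing aid, the per-pipeline core figures above can be turned into a worst-case estimate. This is only a sketch: the function names and the reserved_cores parameter are made up for illustration, and the 2 and 4 cores per pipeline are the ceilings quoted above, not measurements.

```python
# Worst-case cores per pipeline, as quoted in the post.
CORES_PER_PIPELINE = {"uf": 2, "hwf": 4}


def worst_case_cores(forwarder_type: str, pipelines: int) -> int:
    """Cores the forwarder could consume if every pipeline is fully busy."""
    return CORES_PER_PIPELINE[forwarder_type] * pipelines


def max_pipelines_for_host(forwarder_type: str, host_cores: int,
                           reserved_cores: int = 0) -> int:
    """Largest pipeline count whose worst case fits on the host.

    reserved_cores (hypothetical) is what you want to keep free for
    whatever else runs on the box.
    """
    available = max(host_cores - reserved_cores, 0)
    return available // CORES_PER_PIPELINE[forwarder_type]


if __name__ == "__main__":
    print(worst_case_cores("uf", 3))    # 3 UF pipelines
    print(worst_case_cores("hwf", 3))   # 3 HWF pipelines
    print(max_pipelines_for_host("uf", host_cores=8, reserved_cores=2))
```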

RAM usage will also grow as each pipeline has its own queues and buffers that it maintains. If you have tuned your output buffers or queue sizes, be prepared for RAM usage to grow accordingly. Forcing your forwarders to dig into swap space is never a good idea for a production server!

Disk can be impacted when persistent queues are enabled on the inputs side. Each pipeline gets its own directory for its queue and could fill it up to the maximum size of the queue if the next hop stops accepting data for a period of time. Make sure you have enough disk space to accommodate full persistent queues on all pipelines.
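A rough way to budget for that is sketched below. It assumes the worst case is every pipeline filling every configured persistent queue to its maximum size; the function name and the example sizes are made up for illustration.

```python
def worst_case_pq_disk_mb(pipelines: int, queue_max_sizes_mb: list[float]) -> float:
    """Disk needed if every pipeline fills every persistent queue.

    queue_max_sizes_mb holds the configured maximum size of each
    persistent queue, in MB. Assumption: each pipeline keeps its own
    copy of each queue and can fill it completely.
    """
    return pipelines * sum(queue_max_sizes_mb)


if __name__ == "__main__":
    # Hypothetical example: 3 pipelines, two inputs with 500 MB and 1000 MB queues.
    needed = worst_case_pq_disk_mb(3, [500, 1000])
    print(f"Budget at least {needed:.0f} MB for persistent queues")
```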

Network utilization is discussed above.


woodcock
Esteemed Legend

Same as the number of licks that it takes to get to the center of a Tootsie Pop: 3

