Getting Data In

Is there a maximum number of forwarders per indexer?

Jeremiah
Motivator

Is there a maximum number of forwarders that a single indexer can support, or is the limiting factor on the indexer just the amount of data sent by the forwarders? If there is a maximum # of forwarders per indexer, how does that scale in an auto-lb environment with multiple indexers?

2 Solutions

Dan
Splunk Employee

I don't know of a theoretical limit. The highest I've seen is 6,000 forwarders reporting to a single indexer. It was a subset of WinEventLogs, so in aggregate the volume was low - about 150GB/day. Splunk does a good job of managing connections, and the overhead is low so long as you don't require SSL. You will have to increase the file descriptor limit (ulimit -n) to allow for all those sockets to be referenced.
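As a rough illustration of the file descriptor point, here is a minimal Python sketch that compares an open-file limit against an expected number of forwarder connections. The helper names and the `reserve` of 1024 extra descriptors (for index files, logs, and so on) are assumptions made for the example, not Splunk figures. Note that it reads the limit of the process running the script, so it only reflects splunkd if run under the same user and shell limits.

```python
import resource


def fd_headroom(expected_connections, reserve=1024):
    """Return (soft, hard, needed) for this process's open-file limit.

    `reserve` is a hypothetical allowance for non-socket descriptors.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    needed = expected_connections + reserve
    return soft, hard, needed


def limit_is_sufficient(soft, needed):
    """True if the soft limit covers the descriptors we expect to need."""
    if soft == resource.RLIM_INFINITY:
        return True
    return soft >= needed


if __name__ == "__main__":
    soft, hard, needed = fd_headroom(6000)
    print(f"soft={soft} hard={hard} needed={needed}")
    if not limit_is_sufficient(soft, needed):
        print("Soft limit looks too low; consider raising it with ulimit -n")
```

If the check fails, the limit has to be raised in the environment that actually starts splunkd, since a new limit only applies to processes launched afterwards.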

Assuming there is a limit somewhere, you make a good point that auto-lb would not help. I'd recommend an intermediate forwarding tier. You would explicitly assign each client forwarder to one of multiple intermediate forwarders, and then have the intermediate forwarders LB amongst the indexing tier. Keep in mind that a forwarder will only be able to LB about 200GB/day of data. You'll also have to be careful which forwarders execute which parts of the parsing pipeline, which can get tricky.
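To make the intermediate tier idea concrete, here is a small Python sketch, not anything Splunk ships. It estimates how many intermediate forwarders a given daily volume would need, using the roughly 200GB/day figure above as a rule of thumb, and assigns each client forwarder to one intermediate with a stable hash. The function names and the example host names are hypothetical.

```python
import math
import zlib

# Rule-of-thumb figure quoted above; an assumption, not a documented limit.
PER_FORWARDER_GB_PER_DAY = 200


def intermediates_needed(total_gb_per_day, per_forwarder_gb=PER_FORWARDER_GB_PER_DAY):
    """Smallest number of intermediate forwarders that covers the daily volume."""
    return max(1, math.ceil(total_gb_per_day / per_forwarder_gb))


def assign_intermediate(client_host, intermediates):
    """Pick one intermediate forwarder for a client, stable across runs.

    zlib.crc32 is used instead of hash() because hash() of a str is
    randomized per Python process.
    """
    index = zlib.crc32(client_host.encode("utf-8")) % len(intermediates)
    return intermediates[index]


if __name__ == "__main__":
    tier = [f"intermediate-{n}" for n in range(intermediates_needed(450))]
    for host in ("web01", "web02", "dc-east-17"):
        print(host, "->", assign_intermediate(host, tier))
```

A hash-based assignment like this keeps each client pinned to the same intermediate without maintaining a lookup table, but it does not balance by data volume, so a few very chatty clients could still overload one intermediate.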

gkanapathy
Splunk Employee

There will be a practical limit imposed by the TCP/IP network stack implementation on the indexer. At a very minimum, there will be a limit on the number of available ports that other servers can connect to, something less than 65,535. There will be lower limits because of reserved ranges, and there may be even lower limits simply because of how many connections the operating system is able to track.

Jeremiah
Motivator

I see. I would only split the forwarders if I had to. From what you and gkanapathy have said, it sounds like there isn't a hard limit beyond what the TCP/IP stack and the OS will allow.

Dan
Splunk Employee

You might be sacrificing some performance at search time. The optimum state for distributed search is to have the data evenly dispersed amongst the indexers. If you can split the forwarders in a way that still disperses the data evenly, I would say go for it.

Jeremiah
Motivator

Thanks. It would be helpful to know if anyone has seen more than 6k running. We're potentially going to have over 20k forwarders sending to an auto-lb cluster (4 indexers running Linux). We could split the cluster in half if necessary to reduce the total number of forwarders per auto-lb cluster. That would seem easier than an intermediate forwarding tier, right?
