Getting Data In

When to add indexer/forwarder

lbogle
Contributor

Hello Splunkers,
I came across a page that answered this once but I can't seem to find it again...
For best practices purposes, what is a good rule of thumb you should follow when deciding to add an indexer? How about a forwarder?
I think I heard that performance starts to suffer once an indexer ingests more than 50GB a day, and that a new indexer should be added at that point. This assumes the indexer was built to the standard performance baseline.
How do you troubleshoot the forwarder buffers/queues to see if they're getting backed up?
Thanks for any assistance!

1 Solution

martin_mueller
SplunkTrust

For hardware capacity planning, see http://docs.splunk.com/Documentation/Splunk/6.1.1/Installation/CapacityplanningforalargerSplunkdeployment and http://docs.splunk.com/Documentation/Splunk/6.1.1/Deploy/HardwarecapacityplanningforadistributedSplu... The latter includes a neat table of rule-of-thumb numbers:

Daily Volume         Number of Search Users   Recommended Indexers   Recommended Search Heads
< 2 GB/day           < 2                      1, shared              N/A
2 to 100 GB/day      up to 4                  1, dedicated           N/A
100 to 200 GB/day    up to 8                  2                      1
200 to 300 GB/day    up to 12                 3                      1
300 to 400 GB/day    up to 8                  4                      1
400 to 500 GB/day    up to 16                 5                      2
500 GB to 1 TB/day   up to 24                 10                     2
1 TB to 20 TB/day    up to 100                100                    24
20 TB to 60 TB/day   up to 100                300                    32

There's also info in there on how to adapt to virtualized environments.

For your specific environment, grab the SoS (Splunk on Splunk) app and look at your indexers' queues and processors: http://apps.splunk.com/app/748/
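
If you want to check the queues directly, a search along these lines over the _internal index should show fill percentages over time (a sketch based on metrics.log's group=queue events, which carry current_size_kb and max_size_kb fields):

index=_internal source=*metrics.log* group=queue
| eval fill_pct = round(current_size_kb / max_size_kb * 100, 2)
| timechart avg(fill_pct) by name

A queue that sits near 100% (e.g. parsingqueue or indexqueue) marks the stage that's falling behind.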

Concerning forwarders, look for log messages saying the forwarder has hit its thruput limit; if that happens frequently, consider raising maxKBps in limits.conf.
Adding forwarders usually just means adding hosts that produce input data. Cases where one source is ingested by multiple forwarders are rare.
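
To raise the limit, something like this in limits.conf on the forwarder is the usual approach (512 is only an illustrative value; the universal forwarder default is 256 KB/s, and 0 means unlimited):

[thruput]
maxKBps = 512

You can spot the throttling itself in the forwarder's splunkd.log: look for ThruputProcessor warnings about the current data throughput reaching maxKBps.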



martin_mueller
SplunkTrust

That depends on the type of input. In that case I'm guessing syslog? Nothing wrong with having multiple syslog sources handled by one UF.
For robustness it's good practice to have syslog-ng or a similar daemon receive the data and write it to disk, and let the UF read those log files.
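
For example, if syslog-ng writes each device's data into its own directory (the /var/log/remote/<host>/... layout here is just an assumed destination template; adjust it to yours), the UF's inputs.conf could look like this:

[monitor:///var/log/remote]
sourcetype = syslog
host_segment = 4

host_segment = 4 tells Splunk to take the host value from the fourth path segment (var = 1, log = 2, remote = 3, <host> = 4), so each firewall keeps its own host field.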


lbogle
Contributor

Thanks Martin. That's excellent info!
So is it not good practice to have a single universal forwarder consuming multiple input sources? For example, I have a single universal forwarder forwarding data from multiple firewalls (approximately 25).
