Newbie Question: Getting Data into a Distributed Cluster

zadunn
Engager

Hey all!

I am trying to understand Splunk a little better. I am trying to set up a search head and two indexers. I have all that configured (well, everything is added to the search head). Now I am wondering: aside from the Splunk forwarder handling automatic load balancing between the two indexers, what is the best practice for getting data into the indexes? Put more clearly, say I want to collect rsyslogd data on port 514. Do I need to configure each indexer and then make sure that I alternate which *nix boxes I assign to which indexer? Or do I need to configure the search head as a forwarder, use that as a single point of entry for everything (how well would that scale?), and then let the Splunk forwarder load-balance between the two indexers? Do I need to create the indexes manually on each indexer?

Lots of questions. Like I mentioned, I am new to all this.

Thanks!

Zach

gkanapathy
Splunk Employee

You should send the data to a Splunk forwarder (preferably not your search head), which will then distribute it among your indexers. For UDP syslog packets in particular, you can use a hardware load balancer or some other way to scatter the packets across receivers, but you can't scatter the packets of a TCP stream this way. With only two indexers you don't really need a separate dedicated search head. If it's similar hardware, I'd say you will do better using it as a third indexer and then picking one of the three to act as your search head at the same time.
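
As a rough sketch of that layout (not a definitive configuration), the forwarder could listen for syslog on UDP 514 and load-balance its output across the indexers, with each indexer listening for forwarded data. The hostnames `idx1.example.com` and `idx2.example.com`, the output group name `my_indexers`, and the receiving port 9997 are assumptions for illustration; substitute your own values.

```ini
# --- On the forwarder: inputs.conf ---
# Listen for syslog on UDP 514 (binding to a port below 1024
# normally requires root privileges on *nix).
[udp://514]
sourcetype = syslog
connection_host = ip

# --- On the forwarder: outputs.conf ---
# Hypothetical group name and hostnames. The forwarder switches
# between the listed servers automatically (auto load balancing).
[tcpout]
defaultGroup = my_indexers

[tcpout:my_indexers]
server = idx1.example.com:9997, idx2.example.com:9997

# --- On each indexer: inputs.conf ---
# Accept data from Splunk forwarders on the assumed port 9997.
[splunktcp://9997]
disabled = 0
```

With this arrangement the *nix boxes only need to know the forwarder's address, and adding a third indexer later means adding one more entry to the `server` list.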
