Goal:
Load balance across two indexers writing to the same location on a NetApp Filer (NFS)
Question: (I am new to Splunk, so I may be asking the wrong question to begin with)
How can I configure my Splunk setup with an index on shared storage to handle dynamic load balancing between two indexers?
My understanding:
Thank you very much!
Your understanding is mostly correct. (Although it doesn't really change anything, you can read from and execute searches against an index shard that you're not writing to, at least in theory. There is an index setting in indexes.conf, isReadOnly, that supposedly makes an instance not write to an index, but I've never used it. You are correct that only one instance can write to an index shard, though; I'm not sure whether there is actually any lock on the files, or whether the instance simply assumes that it's the sole owner.)
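For reference, that setting would look something like this in indexes.conf (the index name and paths here are made up for illustration):

    # indexes.conf on an instance that should only search, not write
    [index1]
    homePath   = /mnt/netapp/splunk/shard1/index1/db
    coldPath   = /mnt/netapp/splunk/shard1/index1/colddb
    thawedPath = /mnt/netapp/splunk/shard1/index1/thaweddb
    # Supposedly prevents this instance from writing to the index;
    # as I said, I haven't tested it myself.
    isReadOnly = true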
You will need to set up basically four instances of Splunk, two on each node (one active and one failover) and two "shards" of each index:

- sA-1: active on node A, writes to shard iA (shard1/index1)
- sA-2: standby on node B, pointing at the same shard iA
- sB-1: active on node B, writes to shard iB (shard2/index1)
- sB-2: standby on node A, pointing at the same shard iB
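To make the shard idea concrete, here is a sketch of the indexes.conf paths (mount point and directory layout are assumptions); note that both pairs use the same index name, just rooted at different shard directories on the NFS mount:

    # indexes.conf for the sA pair (sA-1 active, sA-2 standby) - shard 1
    [index1]
    homePath   = /mnt/netapp/splunk/shard1/index1/db
    coldPath   = /mnt/netapp/splunk/shard1/index1/colddb
    thawedPath = /mnt/netapp/splunk/shard1/index1/thaweddb

    # indexes.conf for the sB pair (sB-1 active, sB-2 standby) - shard 2
    [index1]
    homePath   = /mnt/netapp/splunk/shard2/index1/db
    coldPath   = /mnt/netapp/splunk/shard2/index1/colddb
    thawedPath = /mnt/netapp/splunk/shard2/index1/thaweddb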
You will have to adjust the network port numbers so that the sA-* instances don't conflict with the sB-* instances when both are running on the same node.
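For example (the port numbers here are arbitrary), the two instances sharing a node might be separated like this:

    # web.conf on the first instance on the node
    [settings]
    httpport     = 8000
    mgmtHostPort = 127.0.0.1:8089

    # web.conf on the second instance on the same node
    [settings]
    httpport     = 8001
    mgmtHostPort = 127.0.0.1:8090

    # the receiving ports in inputs.conf must differ as well
    # first instance:
    [splunktcp://9997]
    # second instance:
    [splunktcp://9998]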
In case of a failure, you would ensure that the failed node and splunkd process were stopped, then start up the corresponding standby instance on the other node. You would also do whatever is needed to switch the IP/hostname of the instance to point to the standby node. This can be done manually, via clustering software, or with a VIP on a network load balancer.
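A minimal manual failover could look like this (install paths and instance names are assumptions, and this is a sketch, not a tested procedure):

    # On the failed node (if still reachable), make sure the active
    # instance is really down:
    /opt/splunk-sA-1/bin/splunk stop

    # On the surviving node, bring up the standby for the same shard:
    /opt/splunk-sA-2/bin/splunk start

    # Finally, repoint the VIP/DNS entry for sA at the surviving node
    # so forwarders and search heads follow it.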
I will also warn that while indexing over NFS will work, it is harder to guarantee the IOPS you'd want for excellent search performance. If your NFS is up to it, it should work fine. However, since no shard will be used on more than one node at a time, it's also possible to use SAN volumes rather than NFS for each index shard.
I will also add that most of what you get from this setup will be rendered unnecessary by index replication within the Splunk product in an upcoming release. It is quite different from what I've described here, but provides similar functionality.
In which case the load balancing at the forwarders would only need to worry about the indexers, each indexer would write to its own shard of the index, and the search head(s) would basically treat it as if the index were being written locally on each of the separate indexers.
That's a very simple and elegant solution. Thanks.
Well, I would say sA-1 knows only iA (shard1/index1), and sA-2 is standby on the other node but knows the same iA there. sB-1 is on the same node as sA-2, and knows iB (shard2/index1).
Am I understanding you correctly: the indexes are named the same, but have different paths on the NFS? In this case, we would have to make sure each indexer only knows about one of the indexes.

Edit: (since we are not currently concerned about resiliency)
The index name is the same on both sides. The shard does not have a distinct name; it's just a different path that is set within the indexer config. If you mean "indexer" rather than "index", though, you just list both instances (or rather the virtual names/IPs of each primary instance) and let forwarder load balancing deal with it.
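On the forwarder side that would be something like this in outputs.conf (the host names are placeholders):

    [tcpout]
    defaultGroup = primary_indexers

    [tcpout:primary_indexers]
    # The virtual names/IPs of the two primary instances; the
    # forwarder load-balances across whatever is listed here.
    server = sA.example.com:9997, sB.example.com:9997
    autoLB = true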
Thanks for the response.
Ignoring resiliency, and relying on the load-balancing feature of the forwarders, how would I specify the index name in inputs.conf, since we won't know which indexer it's feeding? Unless the intention is to hardcode the load balancing...
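(For what it's worth, since the index name is identical on both shards, the index can be named in inputs.conf without knowing the destination; a sketch, with a made-up monitor path:)

    # inputs.conf on the forwarder
    [monitor:///var/log/app]
    index = index1
    # Which indexer receives this is decided by the outputs.conf
    # load balancing, not here; both shards expose "index1".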