Hello, we have an indexer cluster of two peers with replication and search factors both set to 2. The latest rolling restart has not progressed for four hours; the second peer is stuck in the status "Reassigning primaries".

Four hours ago we initiated a searchable rolling restart from the master server's GUI. The first indexer went down for restart and did not return to operation for 10 minutes. After logging in as root and running "su splunk; /opt/splunk/bin/splunk status", we saw the following: splunkd 26239 was not running. Stopping splunk helpers... Running "/opt/splunk/bin/splunk status" again returned: splunkd is not running.

We then started Splunk by running "/opt/splunk/bin/splunk start". The server came up and the peer joined the cluster. From that moment the second peer changed its status to "Reassigning primaries", and nothing has happened since.

The cluster is in maintenance mode and no fixup tasks are being performed; there are currently 6k+ of them pending. Search and replication factors are not met for almost all production indexes, and 8 of them are not fully searchable.

How can we finish the rolling restart, or at least cancel it? Thank you for your time and assistance!
Dear Splunk experts, Dear community,
I am currently planning a change in our Splunk environment to increase reliability and scalability. We are currently running a single indexer with a number of Search Heads.
The goal is to set up the environment so that operations continue in case of any single host outage. We would like to set up a cluster of two indexers for this.
We store indexes on a mirrored SAN so that it remains operable if the main node is down: the standby will have a full copy of the data.
It is possible to split the volume on the SAN into two equal parts, make partitions for the indexers, and set replication factor = 2. In that case we will have four copies of the data stored (2 peers * 2 SAN nodes) and half the volume available for indexes.
Is there a better way to store data in our case, without an excessive number of copies and with no loss of capacity? Setting RF=1 is not an option because half of the indexed data would be unavailable if an indexer peer were lost.
Can we make two indexer peers work with the same SAN partition for writing and reading data?
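For reference, the storage arithmetic in this post can be sketched in a few lines of Python. This is only a rough illustration: the function name `storage_layout` and the 10 TB figure are hypothetical, and the model assumes every replicated copy takes the same space as the original (it ignores any size difference between searchable and non-searchable bucket copies).

```python
def storage_layout(total_san_tb, peers, replication_factor, san_mirrors):
    """Return (physical_copies, capacity_per_peer_tb, unique_data_capacity_tb).

    Simplified model: the SAN volume is split evenly between peers, each
    bucket is stored replication_factor times across the cluster, and the
    SAN itself mirrors everything san_mirrors times.
    """
    # Each cluster-level copy is duplicated again by the SAN mirror.
    physical_copies = replication_factor * san_mirrors
    # The volume is split into equal partitions, one per indexer peer.
    per_peer_tb = total_san_tb / peers
    # Usable space for unique data shrinks by the replication factor.
    unique_tb = per_peer_tb * peers / replication_factor
    return physical_copies, per_peer_tb, unique_tb


# Hypothetical 10 TB mirrored SAN volume.
single_indexer = storage_layout(10, peers=1, replication_factor=1, san_mirrors=2)
two_peer_cluster = storage_layout(10, peers=2, replication_factor=2, san_mirrors=2)

print(single_indexer)    # copies, TB per peer, TB of unique data
print(two_peer_cluster)
```

Under these assumptions the two-peer, RF=2 layout stores four physical copies and leaves half of the volume for unique indexed data, which matches the figures given in the post.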