We are planning to roll out an active/active solution across two data centers. Each DC's Splunk instance will only handle local logs. We are looking to put a plan in place in case one site fails.
What are the best mechanisms/tools to handle this situation?
What should we plan for? For example, should we use different index names at each site, such as main_site1 and main_site2? Both sites will handle similar logs and will peer with each other for searches.
There's a comment about the isReadOnly setting in indexes.conf which you may want to look into as well. The idea is to prevent accidental writes to a replicated index. Read-only file permissions may also be a good idea in this case. (BTW, I am assuming that you aren't going to try to let both sites index data into each other's directories, because that will not work.)
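As a rough sketch of what that could look like, here is an indexes.conf stanza for site 2's replicated copy of site 1's index. The index name main_site1 is borrowed from the question, and the paths are only assumptions for illustration; adjust them to wherever the replicated buckets actually land.

```ini
# indexes.conf on site 2 (sketch only; stanza name and paths are hypothetical)
[main_site1]
homePath   = $SPLUNK_DB/main_site1/db
coldPath   = $SPLUNK_DB/main_site1/colddb
thawedPath = $SPLUNK_DB/main_site1/thaweddb
# Replicated copy of site 1's data: still searchable, but no new events are accepted
isReadOnly = true
```

If you later fail site 1 over to site 2 and need to write into this index, you would have to remove the setting (or set it to false) and restart, so it is worth deciding up front whether that step belongs in your failover runbook.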
It may be a good idea to work through a plan with Splunk Support for your specific setup.
If you plan to move one site to the other in case of a failure, then I would do as you wrote in question 2: call them main_site1 and main_site2. This way you can move the index back to its original site afterwards and still keep the data that was collected while it was running on the other site.
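A minimal sketch of how local data could be routed into the site-specific index, assuming a file monitor input on a site 1 forwarder or indexer. The monitored path is a made-up example, not something from your setup.

```ini
# inputs.conf on site 1 (sketch only; the monitored path is hypothetical)
[monitor:///var/log/messages]
# Send locally collected events to the site-specific index
index = main_site1
```

Site 2 would use the same stanza with index = main_site2. A search that should cover both sites can then name both indexes, e.g. index=main_site1 OR index=main_site2.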