I recently ran into the same issue: the captain was elected but was missing from the member list and no longer responded to the other members. A reboot helped, but not for long; the cluster became unstable again pretty quickly. I started digging deeper and found the dispatch directory filling up (150k+ directories) while the reaper didn't clean up, so I/O load went through the roof. I identified a real-time scheduled search that caused Splunk (6.5.5) to keep all the rt_scheduler__nobody* directories. Rewriting the search fixed it. After cleaning out the dispatch directory, the cluster was running fine again. I'm afraid I may have spotted a possible bug in this version.
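If you want to check whether you are hitting the same thing, a rough sketch like the one below counts the dispatch subdirectories and how many of them start with rt_scheduler__nobody. It is only an illustration, not something I ran back then: the helper name count_dispatch_dirs is made up, and it assumes the default dispatch location $SPLUNK_HOME/var/run/splunk/dispatch with /opt/splunk as a fallback for SPLUNK_HOME, so adjust the path for your install.

```python
import os


def count_dispatch_dirs(dispatch_path, prefix="rt_scheduler__nobody"):
    """Return (total, matching): the number of subdirectories in
    dispatch_path and how many of them start with prefix.

    Hypothetical helper for illustration; it only reads directory names
    and never deletes anything.
    """
    total = 0
    matching = 0
    # os.scandir avoids building a full list, which matters with 150k+ entries.
    with os.scandir(dispatch_path) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                total += 1
                if entry.name.startswith(prefix):
                    matching += 1
    return total, matching


if __name__ == "__main__":
    # Assumption: default install layout; /opt/splunk is only a fallback guess.
    splunk_home = os.environ.get("SPLUNK_HOME", "/opt/splunk")
    dispatch = os.path.join(splunk_home, "var", "run", "splunk", "dispatch")
    if os.path.isdir(dispatch):
        total, matching = count_dispatch_dirs(dispatch)
        print(f"{total} dispatch directories, {matching} from rt_scheduler__nobody*")
    else:
        print(f"dispatch directory not found: {dispatch}")
```

If the matching count keeps growing between runs, a real-time scheduled search is a likely suspect, as it was in my case.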