I am currently running a 2-system indexer cluster on Windows VMs. One of the systems is experiencing poor performance, causing my search head to run slowly. What I want to do is configure a new VM, do a fresh install of Splunk Enterprise, and move all incoming data to this new system, bypassing the old indexer. With the replication going on between the indexers in this cluster, can I shut down the slow indexer without losing any indexed data?
If this is not feasible, is there a way to migrate the data from the old system to the new one?
The buckets on the slow indexer will have a GUID specific to that indexer in their filenames. That data will not replicate to the new indexer unless you increase the replication factor after adding the indexer. So you could do that, wait for everything to replicate, then shut down the slow indexer and reduce your replication factor back to 2. You will need to be in normal operations mode for that to work, not maintenance mode.
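A rough sketch of that replication factor change, run on the cluster master (CLI syntax as in the Splunk 6.x docs; add `-auth` as your setup requires):

```
# On the cluster master: raise the replication factor so every bucket
# gets a copy on the newly added indexer
splunk edit cluster-config -replication_factor 3

# ...wait until the master reports "Replication Factor is Met"...

# After the slow indexer is shut down and decommissioned, drop it back
splunk edit cluster-config -replication_factor 2
```

The same setting can be made in `server.conf` under the `[clustering]` stanza (`replication_factor = 3`) followed by a restart of the master.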
If for some reason that doesn't work, the other option would be to clone the slow indexer to the new indexer, shut down the slow indexer, and re-IP the new indexer to the old IP of the slow indexer. This is probably the cleanest method IMHO.
Still, I wonder why your slow indexer is slow... Does it have less CPU/memory or a slower disk? Or is it a configuration issue? If it's a configuration issue, the second method above will not work.
It appears to Splunk that the disk is slow. Operations has verified that the VM configuration is the same for the existing systems: same cores, same amount of memory, and both on SSD.
In the past, cloning a system has caused issues. That is why I want to do a clean install of Splunk.
As the SSD array is fully allocated, this new system's disk will be on standard storage, and I might not be able to get the same amount of disk space on the new system. When I increase the replication factor, does Splunk duplicate every bucket across all systems? I am concerned that I might fill up the new system's disk quickly and cause more response problems.
Increasing the replication factor should cause all the unique buckets on the existing indexers to replicate one copy of themselves to the new indexer.
Once that is complete, if you shut down the slow indexer and decrease the replication factor, Splunk should change those buckets replicated from the slow indexer into searchable copies.
Since data is assumed to be flowing in during this entire process, I missed a step: you'll want to change outputs.conf on all your forwarders to point to the new indexer and the old-but-fast indexer. That is, remove the slow one and add the new one in outputs.conf on the forwarders prior to changing the replication factor.
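For example, the forwarders' outputs.conf might end up looking like this (hostnames and group name are placeholders, port 9997 is the conventional receiving port):

```
# outputs.conf on each forwarder: send only to the fast and new indexers
[tcpout]
defaultGroup = indexer_cluster

[tcpout:indexer_cluster]
# slow-indexer removed from the list; new-indexer added
server = fast-indexer.example.com:9997, new-indexer.example.com:9997
```

A restart (or deployment-server reload) of the forwarders is needed for the change to take effect.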
Yes it will fill the disk quickly.
I added the new system and set the replication factor to 3. Looking at the Indexer Clustering Master Node, it shows all data is searchable, the search factor is met, and the replication factor is met. It also shows that there are 3 replicated data copies for the indexes and 2 searchable data copies. Does that mean there are 3 copies of all data and I am good to go on shutting down the problem indexer?
Hi Scott, jkat54,
We are planning a similar activity. Our environment is huge, with multiple indexers. We plan to separate the monitoring tools and the production applications.
I plan to spin up new machines, install Splunk, and add them to the index cluster, updating the outputs.conf file with the new machines beforehand.
We have 6 indexers on RHEL 6.7 running Splunk 6.5.4.
A few questions I have are -
Thanks
I will not be able to answer your questions. I ended up going back to a single instance by installing a fresh 6.6.2 image of Splunk and copying the data to the new system. We only index about 6 GB a day, and with the bugs that were cropping up severely affecting performance, the single instance works for us.
@nmohammed you should create a new post.
Sounds like it to me. Remember we can always turn the slow indexer back on if we need to.
I think you're ready for the next step now though.
Stop the slow one, change the RF back to 2, and see what happens.
If the indexers are clustered, enable maintenance mode before shutting the indexer down.
more on maintenance mode here:
https://docs.splunk.com/Documentation/Splunk/6.5.3/Indexer/Usemaintenancemode
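A quick sketch of the commands, run on the cluster master (per the docs linked above):

```
# On the cluster master, before taking a peer offline:
splunk enable maintenance-mode

# ...shut down or restart the indexer peer...

# When the peer is back (or decommissioned):
splunk disable maintenance-mode
```

Note this conflicts slightly with the earlier advice: the replication factor change itself must happen in normal operations mode; maintenance mode is only for the window when you actually take the peer down.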
Hope it helps.