 
					
				
		
env:
Splunk 6.4.1 environment (all linux OS based):
3x index cluster peers
1x cluster master
1x deployer/license master
3x search head cluster peers
2x heavy forwarders
pre-checks:
https://www.splunk.com/pdfs/technical-briefs/splunk-deploying-vmware-tech-brief.pdf (in particular, it advises against using VMware snapshots; I'm unsure of the date of this document).
Question:
Are there any definitive guidelines on whether using VMware snapshots is supported for Splunk indexer clusters?
This is not likely to be an ongoing requirement but rather on an ad-hoc basis (pre-upgrade for example).
Any pointers v.much appreciated.
Thx
Bry
 
					
				
		
There are two situations in which snapshots could be taken:
A running host
Splunk is constantly writing to hot buckets on the indexers. Taking a snapshot could cause corruption in any hot buckets upon restoration.
A stopped host
This could work, if all hot buckets have been rolled to warm first. See https://wiki.splunk.com/Community:BestPracticesForBackingUp
Something to keep in mind is the extra overhead of having a snapshot on a very busy host such as an indexer. A snapshot causes all changes to be tracked against a prior snapshot(s). This can get very expensive quickly.
I'd recommend following Splunk's advice and not snapshotting the hosts. You'll be better served with a file level approach of backing up configurations and data.
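As a rough illustration of the file-level approach, here is a minimal Python sketch that picks out the non-hot buckets in an index's db directory and builds an rsync command for them. It relies on the usual bucket naming (hot buckets are `hot_v1_<id>`; rolled buckets are `db_<newest>_<oldest>_<id>`, with an optional GUID suffix on clustered indexers, and `rb_` for replicated copies). The function names, the `-a` rsync option and the destination path are my own assumptions for illustration, not anything from Splunk's tooling, so check the layout against your own environment before relying on it.

```python
import re
from pathlib import Path

# Rolled (warm/cold) buckets: db_<newest>_<oldest>_<id>, optionally followed
# by _<guid> on clustered indexers; rb_ marks replicated copies.
# Hot buckets (hot_v1_<id>) are deliberately NOT matched.
BUCKET_RE = re.compile(r"^(db|rb)_\d+_\d+_\d+(_[0-9A-Fa-f-]+)?$")


def warm_buckets(index_db_dir):
    """Return sorted paths of rolled bucket directories under index_db_dir."""
    root = Path(index_db_dir)
    return sorted(
        p for p in root.iterdir()
        if p.is_dir() and BUCKET_RE.match(p.name)
    )


def rsync_command(index_db_dir, dest):
    """Build (but do not run) an rsync argv that copies only rolled buckets.

    'dest' is a hypothetical backup target, e.g. a local path or host:path.
    """
    sources = [str(p) for p in warm_buckets(index_db_dir)]
    return ["rsync", "-a", *sources, dest]
```

The command is only built, not executed, so you can print and inspect it first. Anything still in a hot bucket is skipped, which is why rolling hot buckets to warm beforehand matters if you want a complete copy.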
 
		
		
		
		
		
	
			
		
		
			
					
It's not that it is not "supported"; rather, it's not a good idea for performance reasons. If you are doing it at a time that has low activity (particularly incoming writes), then it shouldn't be a problem, as long as you do what you need to and then remove the snapshot.
As the author of that Tech Brief, I can tell you that everything in there is a recommendation or best practice for scale and performance. Because VMware's snapshots create a new file on disk to which write I/O is routed, the largest performance issue is the consolidation of the snapshot (where it has to merge all the changes from the snapshot). Also, by the nature of backups, you are putting a much higher read I/O load on the disk in addition to Splunk's own read I/O load. This is why I suggest you do it at a point where activity is low (assuming there is one).
The biggest benefit to snapshots is the fact that you capture hot buckets in a consistent manner (at least as it relates to a single Indexer). If you are on a SAN that offers snapshots that are integrated with VMware through the appropriate VAAI primitives, then most of the I/O heavy lifting is offloaded to the storage, and consistency is likely across all Indexers stored on that array.
Of course, the easy answer is to use the Splunk recommended best practices for backing up 🙂
 
					
				
		
hi sdvorak
thank you for responding, especially being the author of the tech brief I referenced. very interesting to know that we in fact could use snapshots.
my query was triggered by a requirement for a version upgrade, so having to take all splunk clusters offline (index/search) - all testing has shown that no underlying splunk index data is affected and I guess this then lends itself to snapshots as they will be offline.
again thanks for your response and clarification of splunk + snapshots, it has been noted.
Thx
Bry
 
					
				
		
hi beautus
thanks - as i say, not looking at this being anything other than an ad-hoc thing but totally agree, not worth deviating away from splunk recommendations here. will just run up rsync for now on warm buckets 🙂
Thx
Bry
