Hi!
I have two VMs with Splunk that I want to make an indexer cluster out of. The VMs are almost identical, but the partitioning is somewhat different on the two machines. On one of them, Splunk runs on a filesystem /dev/folders/centos-splunk mounted on /opt/splunk, while on the other I think Splunk runs on a filesystem /dev/folders/centos-root mounted on /. I'll attach the output from df -H on the two machines. Does anyone know if this will cause any problems when initiating the indexer cluster?
Any help would be much appreciated! Thanks!
Filesystem                  Size  Used  Avail  Use%  Mounted on
/dev/mapper/centos-root     54G   4.0G  50G    8%    /
/dev/folders/centos-splunk  496G  461G  35G    93%   /opt/splunk
/dev/sda1                   521M  151M  371M   29%   /boot

Filesystem                  Size  Used  Avail  Use%  Mounted on
/dev/folders/centos-root    54G   31G   24G    57%   /
/dev/sdc1                   521M  151M  371M   29%   /boot
/dev/folders/centos-home    496G  416G  81G    84%   /home
Best regards,
Martin
In both cases, Splunk will be installed in /opt/splunk, so it shouldn't matter at all. Splunk doesn't look at the underlying device it's being installed onto.
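To make that concrete, here is a minimal sketch of how a path is matched to the filesystem that backs it: the longest mount point that is a prefix of the path wins. The function name backing_mount is made up for this illustration; the mount points are the ones from the two df listings in the question.

```python
import os

def backing_mount(path, mount_points):
    """Return the mount point that is the longest path prefix of `path`."""
    path = os.path.normpath(path)
    best = "/"  # everything ultimately lives under the root filesystem
    for mp in mount_points:
        mp = os.path.normpath(mp)
        prefix = mp if mp.endswith("/") else mp + "/"
        if (path == mp or path.startswith(prefix)) and len(mp) > len(best):
            best = mp
    return best

# Mount points copied from the two df -H listings in the question.
machine_a = ["/", "/opt/splunk", "/boot"]
machine_b = ["/", "/boot", "/home"]

print(backing_mount("/opt/splunk/etc", machine_a))  # /opt/splunk
print(backing_mount("/opt/splunk/etc", machine_b))  # /
```

The path /opt/splunk/etc is the same on both machines even though a different filesystem sits underneath it, and the path is all Splunk works with.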
That said, it's generally best to make /opt/splunk/var/lib/splunk a separate file system, so that when you upgrade, you can just use tar to make a quick backup of $SPLUNK_HOME without grabbing the index files. Having this tar file will make rollback from a problematic upgrade very easy.
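As a rough illustration of that kind of backup, the sketch below uses Python's standard tarfile module instead of the tar command and skips the index directory by path. The function name backup_splunk_home and the archive location are assumptions made for this example, and Splunk should be stopped before archiving for real.

```python
import os
import tarfile

def backup_splunk_home(splunk_home, archive_path, exclude_rel="var/lib/splunk"):
    """Write a gzip-compressed tar of splunk_home, leaving out the index directory."""
    splunk_home = os.path.abspath(splunk_home)
    arc_root = os.path.basename(splunk_home)
    excluded = arc_root + "/" + exclude_rel

    def skip_indexes(member):
        # Returning None drops this entry; tarfile then does not descend into it.
        if member.name == excluded or member.name.startswith(excluded + "/"):
            return None
        return member

    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(splunk_home, arcname=arc_root, filter=skip_indexes)
    return archive_path

# Example call (hypothetical archive path):
# backup_splunk_home("/opt/splunk", "/tmp/splunk_home_backup.tgz")
```

With the indexes on their own file system the same result falls out of a plain tar that stays on one file system; the path-based filter here is just a way to get the same effect without repartitioning first.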
Given how much data is in use in your examples, I would question whether that machine will be performant enough for Splunk. How much data are you indexing? How many users are searching the data? Or are these already in use as indexers?
If they are in use as indexers, I would not advise converting them to a cluster, but building a cluster on new instances. The configuration changes necessary for index clustering mean that all configurations on all indexers need to be consistent. Having non-clustered data and configs on the indexers will make building the cluster far more difficult than the differing mount points will.
Thanks a lot for a detailed and well thought out answer!
I had not thought of putting the indexes in a separate filesystem, smart! That being said, I'm still a bit of a novice in dealing with partitions. I can't quite wrap my head around the concepts, but I'll get there.
Yes, the machines are not ideal. I hope that we'll get some better ones when upgrading. Then again, someone will have to take the cost, us or the customer. You know how it is.
The indexers are already in use, but this shouldn't be a problem, right? The only "problem" is that data already indexed on the indexers isn't clustered, as far as I know. Hm. I'll have to look into this as well. Thanks.
Clustered indexers have a 'cluster master' (CM) that manages their configuration files. So a hybrid set-up is going to complicate things with respect to configuration file precedence and overall management. Ideally, you'd migrate the existing configurations to the CM.
If you're going to convert, I would suggest setting up a set of small test VMs with IDXes having existing data, and convert them to familiarize yourself with the process. Testing is going to be especially important if you want to migrate your existing configurations to the cluster, since I doubt that the two IDXes have consistent configurations right now, given the partitioning scheme.
As for the partitioning aspect, do you have a UNIX system administrator in your organization? If so, you might want to sit down with him/her and discuss your requirements. Since you are in a VM situation, you could get a small-ish partition (20GB or so), and mount that at a temporary mount point, move /opt/splunk excluding /opt/splunk/var/lib to the new, smaller partition (or copy to new and then delete old). Then change /etc/fstab to mount the new partition at /opt; mount the existing partition at /opt/splunk/var/lib.
Essentially, *NIX doesn't care where a partition is mounted; you just need to make sure when you set the fsck order that partitions that are closer to root have lower fsck numbers, since the system can't mount the partition until the fsck is complete.
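The ordering rule can be checked mechanically. The sketch below parses fstab-style text and reports any mount point that is listed before its parent or has a lower non-zero fsck pass number than its parent. The sample entries are assumptions for illustration only: /dev/folders/centos-opt is a made-up name for the new small partition, and xfs and defaults are placeholders, not values taken from the machines in question.

```python
def parse_fstab(text):
    """Return (mount_point, fsck_pass) pairs in file order."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        # fstab fields: device, mount point, type, options, dump, fsck pass.
        passno = int(fields[5]) if len(fields) > 5 else 0
        entries.append((fields[1], passno))
    return entries

def is_ancestor(parent, child):
    """True if `child` is mounted somewhere below `parent`."""
    if parent == child:
        return False
    prefix = parent if parent.endswith("/") else parent + "/"
    return child.startswith(prefix)

def ordering_problems(entries):
    problems = []
    for i, (parent, parent_pass) in enumerate(entries):
        for j, (child, child_pass) in enumerate(entries):
            if not is_ancestor(parent, child):
                continue
            if j < i:
                problems.append(f"{child} is listed before its parent {parent}")
            # A pass number of 0 means "never fsck", so only compare non-zero values.
            if 0 < child_pass < parent_pass:
                problems.append(f"{child} has a lower fsck pass than its parent {parent}")
    return problems

# Hypothetical layout after the move described above.
PROPOSED = """
/dev/folders/centos-root    /                    xfs  defaults  0 1
/dev/folders/centos-opt     /opt                 xfs  defaults  0 2
/dev/folders/centos-splunk  /opt/splunk/var/lib  xfs  defaults  0 2
/dev/sda1                   /boot                xfs  defaults  0 2
"""

print(ordering_problems(parse_fstab(PROPOSED)))  # []
```

The usual convention is pass 1 for the root file system and 2 for everything else, which already satisfies the rule as long as parents are listed before the file systems mounted beneath them.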
Is it possible to build new systems with consistent partitioning and migrate the data over to them? That might be cleaner, overall. It would consume temporary resources but allow you to release the existing resources when the migration is completed.
The right answer, I think, is: don't do that. The practical answer is: it depends. As long as you only use references to $SPLUNK_HOME and $SPLUNK_DB in configurations that get distributed, it might work.
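One way to check that condition before distributing anything is to scan indexes.conf for index paths that are hard-coded instead of starting with $SPLUNK_DB, $SPLUNK_HOME, or a volume: reference. This is only a sketch: the function name hardcoded_paths, the web_logs index, and the /home/splunk_data path are invented for the example, and only the homePath, coldPath, and thawedPath settings are checked.

```python
PATH_KEYS = {"homePath", "coldPath", "thawedPath"}
PORTABLE_PREFIXES = ("$SPLUNK_DB", "$SPLUNK_HOME", "volume:")

def hardcoded_paths(conf_text):
    """Return (stanza, key, value) for index paths that are not portable."""
    findings = []
    stanza = None
    for line in conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("[") and line.endswith("]"):
            stanza = line[1:-1]
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key in PATH_KEYS and not value.startswith(PORTABLE_PREFIXES):
            findings.append((stanza, key, value))
    return findings

# Hypothetical indexes.conf content; the second stanza is tied to one machine's layout.
SAMPLE = """
[main]
homePath = $SPLUNK_DB/defaultdb/db
coldPath = $SPLUNK_DB/defaultdb/colddb
thawedPath = $SPLUNK_DB/defaultdb/thaweddb

[web_logs]
homePath = /home/splunk_data/web_logs/db
coldPath = $SPLUNK_DB/web_logs/colddb
thawedPath = $SPLUNK_DB/web_logs/thaweddb
"""

for stanza, key, value in hardcoded_paths(SAMPLE):
    print(f"[{stanza}] {key} = {value}")
```

Anything this reports would resolve to a different place, or to nothing at all, on the machine with the other partition layout.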
If you can't move things around to normalize it, you might be able to use symbolic links or something similar to make both machines present the same path.