Deployment Architecture

Does Splunk support search head pooling via clustered storage (gfs2)?

msarro
Builder

Hey everyone. In our company we're HEAVILY discouraged from using NFS because it's significantly less robust than our enterprise SAN (the SAN can lose several chassis with no effect on data access or integrity). NFS also has significantly more latency than a direct clustered filesystem. As such, our storage team asked that we implement search head pooling on a clustered filesystem (gfs2). Everything works fine with the clustered filesystem, but Splunk keeps spitting out errors about being unable to acquire a lock:

Error in search head pooling validate-quiet: Failed to lock /splunk/etc/users/testpath with return code -1: No such file or directory
There was an error validating your search head pooling configuration. For more information, run 'splunk pooling validate'
Error fixing dangling data: Failed to lock /splunk/etc/apps/sentinel.txt with return code -1: Success
There was an error preparing your conf files for search head pooling. For more information, run 'splunk btool find-dangling'.

When I run splunk pooling validate I get the following:

[root@bcscer-chi-s1 ~]# /opt/splunk/bin/splunk pooling validate

   Error in search head pooling validate: Failed to lock /splunk/etc/users/testpath with return code -1: No such file or directory

I opened a support ticket but wanted to check here as well. I recall previously seeing search head pooling described as compatible with clustered filesystems, but I can't seem to track that down now. I could just be imagining things 🙂

1 Solution

yannK
Splunk Employee

No, only NFS and CIFS (Samba) are currently supported for search head pooling.

oofaustoo
Explorer

I tackled this by creating a replicated glusterfs volume for all my peer nodes and then mounting it locally on each node via NFS. glusterd does its thing in the background and Splunk just deals with the NFS share. With the NFS share being locally mounted, I was able to use aggressive mount options for rsize/wsize. On top of all that, I'm using mode=6 bonding (balance-alb) for the NICs.

dwaddle
SplunkTrust

Clustered filesystem locking semantics have always been tricky. Although they are not officially supported, you might consider alternative clustered filesystems such as OCFS2, GPFS, or Veritas Cluster Filesystem. Any experience you gain from getting them to work might help Splunk figure out which clustered filesystems to test and certify.

Masa
Splunk Employee
Splunk Employee

You are right about the flock() calls. Splunk currently uses flock() for many purposes, so simply disabling it would cause other issues. To support gfs/gfs2, which does not work with general file-system locking because of clustering, the code would need to be changed. I agree with filing an Enhancement Request.
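
For anyone who wants to check locking on a candidate mount before pointing Splunk at it, here is a minimal sketch in Python. It is not how Splunk itself takes its locks, and the function name can_flock is made up for this example. It only tries an exclusive, non-blocking flock() on a scratch file in the given directory and reports whether the call succeeded.

```python
import fcntl
import os
import tempfile


def can_flock(directory):
    """Return True if an exclusive, non-blocking flock() succeeds
    on a scratch file created inside `directory`.

    Hypothetical helper for illustration; it does not reproduce
    Splunk's own locking logic.
    """
    # Create a throwaway file on the filesystem under test.
    fd, path = tempfile.mkstemp(prefix="flock_probe_", dir=directory)
    try:
        try:
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError as exc:
            # The filesystem refused or could not service the lock.
            print(f"flock() failed in {directory}: {exc}")
            return False
        fcntl.flock(fd, fcntl.LOCK_UN)
        return True
    finally:
        # Always clean up the scratch file.
        os.close(fd)
        os.unlink(path)


# Example call, using the pooled directory from the error messages above:
# print(can_flock("/splunk/etc"))
```

A passing result on one node only shows that the flock() call itself works there. It says nothing about whether locks are honored across nodes, so treat it as a first sanity check rather than proof that pooling will work.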

jacobwilkins
Communicator

Splunk really needs to support some kind of enterprise solution. We can't get the performance we need out of NFS and we were hoping to try Veritas CFS, which is similar in operation to GFS2.

msarro
Builder

This is a shame - it really should be supported. We're going to open a feature request with our support rep and see if they can at least provide a way to disable their file locking since that seems to be what causes the issue.
