Searchhead Pooling: Failed to lock sentinel.txt while saving/deleting search from GUI

bwooden
Splunk Employee

A red banner message in the GUI (below) reports a failed file lock on sentinel.txt, and it is preventing updates from the GUI.

Error fixing dangling data: Failed to lock /mnt/search_head_pool/etc/apps/sentinel.txt with return code 1: Success

I do not presently have access to the logs of the system (Splunk 4.2.2 101277 on RHEL).

  • Can sentinel.txt and sentinel.txt.lock safely be deleted?
  • Are there known causes/resolutions for this error?
  • What is the scope/impact of this state?
1 Solution

ewoo
Splunk Employee

Yes, you can delete these files while Splunk is stopped. They are re-created on demand.

If you see a "stale" sentinel.txt.lock file remaining while Splunk is stopped, that is probably the source of this error.

What is the output of "splunk pooling validate"?
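To check for a stale lock file while Splunk is stopped, something like the following could help. It is only a minimal sketch: the helper name `find_stale_lock` and the 5-minute age threshold are assumptions for illustration, not anything Splunk ships. The lock file name and the pool path are taken from the error message in the question.

```python
import os
import time


def find_stale_lock(pool_apps_dir, max_age_seconds=300.0, now=None):
    """Return the path of sentinel.txt.lock if it exists and is older than
    max_age_seconds; otherwise return None.

    Hypothetical helper: the age threshold is an arbitrary assumption.
    Only treat the result as "stale" if Splunk is stopped on every pool member.
    """
    lock_path = os.path.join(pool_apps_dir, "sentinel.txt.lock")
    try:
        mtime = os.stat(lock_path).st_mtime
    except FileNotFoundError:
        # No lock file at all, so nothing can be stale.
        return None
    current = time.time() if now is None else now
    return lock_path if current - mtime > max_age_seconds else None


if __name__ == "__main__":
    # Path taken from the error message in the original question.
    stale = find_stale_lock("/mnt/search_head_pool/etc/apps")
    print(stale if stale else "no stale lock file found")
```

The script only reports the file; deleting it is left as a manual step, to be done with Splunk stopped.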

ewoo
Splunk Employee

The most common reasons are mentioned in the comment immediately preceding yours: 1) stale lock file (caused by a crash, for example), or 2) poor performance of shared storage, leading to slow I/O and contention on the lock file.

Some improvements were made to splunkd to reduce the amount of I/O it performs against sentinel.txt; these improvements landed in 5.0.6 and 6.0 (SPL-66563).

splunkIT
Splunk Employee

@ewoo, under what circumstances will this "failed to lock sentinel.txt" error occur?

ewoo
Splunk Employee

The error is displayed if:

1) the user triggers an action that requires writing to a conf file, and
2) the write fails because the file-based mutex cannot be acquired

You can't suppress the error. The underlying failure must be addressed -- remove stale lock files, investigate contention on the lock file and/or performance of shared storage, etc.

the_wolverine
Champion

Why would this error be displayed to the user? Can it be suppressed?

ewoo
Splunk Employee

This file is only created/used when pooling is enabled.

The file itself acts as the synchronization mechanism for conf writes. In other words, a member of the search head pool (SHP) must "own" this lockfile in order to make conf changes. If a pool member X finds the lockfile already owned by another member Y, X will wait for Y to relinquish ownership of the lockfile.
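To make the idea concrete, here is a rough sketch of a generic file-based mutex guarding a conf write. This is not splunkd's actual implementation; the function names, the timeout, the polling interval, and the use of an exclusively created lock file are all assumptions for illustration.

```python
import os
import time


class LockTimeout(Exception):
    """Raised when the lock file could not be acquired before the timeout."""


def acquire_lock(lock_path, timeout=5.0, poll_interval=0.1):
    """Create lock_path exclusively, waiting while another owner holds it."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_CREAT | O_EXCL fails if the file already exists, so only one
            # caller at a time can succeed in creating it.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise LockTimeout("failed to lock " + lock_path)
            time.sleep(poll_interval)
        else:
            # Record the owner's PID in the lock file for debugging.
            os.write(fd, str(os.getpid()).encode())
            os.close(fd)
            return


def release_lock(lock_path):
    """Relinquish ownership by removing the lock file."""
    os.remove(lock_path)


def write_conf_with_lock(conf_path, text, lock_path, timeout=5.0):
    """Append text to conf_path only while holding the lock."""
    acquire_lock(lock_path, timeout)
    try:
        with open(conf_path, "a") as conf_file:
            conf_file.write(text)
    finally:
        # If the process crashes before this runs, the lock file is left
        # behind and every later writer times out: the "stale lock" case.
        release_lock(lock_path)
```

In this sketch, a lock file left behind by a crashed owner makes every subsequent `write_conf_with_lock` call raise `LockTimeout`, which is analogous to the stale lock scenario described earlier in the thread. Slow shared storage can have a similar effect if the current owner takes longer than the timeout to finish.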

rmorlen
Splunk Employee

How does Splunk handle this file in Pooling mode? Which server gets the "lock"? What happens when multiple servers need to lock the file?
