After a RAID failure, why won't our search head cluster member start with error "could not parse raft entry file"?

nwales
Path Finder

After a RAID failure, the disks were reportedly not affected and an fsck has completed, but since then I see the following in splunkd_stderr.log each time I try to start Splunk. It appears to be fatal, as no Splunk threads are running afterwards.

2015-04-07 16:44:41.288 -0500 splunkd started (build 245427)
terminate called after throwing an instance of 'std::runtime_error'
  what():  could not parse raft entry file

Any idea how I can fix this?

splunkapprentic
Explorer

If you have this problem on a search head (SH) cluster member, you can follow the official "Fix Raft issues on a member" procedure:

To fix a Raft issue, clean the member's _raft folder by running the splunk clean raft command on the member:

Stop the member:

splunk stop

Clean the member's raft folder:

splunk clean raft

Start the member:

splunk start

The _raft folder will be repopulated from the captain.

This is described in the documentation:

https://docs.splunk.com/Documentation/Splunk/7.1.2/DistSearch/Handleraftissues
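
The three steps above can be wrapped in a small script so they always run in the same order. This is only a sketch: the /opt/splunk default for SPLUNK_HOME, the reset_raft and run function names, and the DRY_RUN variable are my own assumptions for illustration. Only the three splunk commands come from the documentation.

```shell
# Sketch: reset the raft state on one search head cluster member.
# The /opt/splunk default is an assumption; adjust it to your install.
SPLUNK_HOME="${SPLUNK_HOME:-/opt/splunk}"

# run CMD...: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# reset_raft: stop the member, clean its _raft folder, start it again.
# The && chain stops at the first failing step, so a failed clean
# is never followed by a start.
reset_raft() {
    run "$SPLUNK_HOME/bin/splunk" stop &&
    run "$SPLUNK_HOME/bin/splunk" clean raft &&
    run "$SPLUNK_HOME/bin/splunk" start
}
```

Calling reset_raft with DRY_RUN=1 set only prints the three commands, which is a cheap way to check the paths before running it for real on the affected member.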

maraman_splunk
Splunk Employee

Hi,

I had the same problem after the filesystem holding splunk/var filled up.
After freeing space, Splunk would crash at start.
In $SPLUNK_HOME/var/lib/splunk, the latest files were a crash file and splunk_stderr.
splunk_stderr contained "could not parse raft entry file".
Running mv $SPLUNK_HOME/var/run/splunk/_raft $SPLUNK_HOME/var/run/splunk/_raft_KO followed by a restart fixed it.
You need to restart one more time to get rid of the warning from the init script.
splunk show shcluster-status also reports a healthy cluster again.
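
Moving the directory aside rather than deleting it keeps the damaged files around for later inspection. A minimal sketch of that step is below. The set_aside_raft function name and the timestamp suffix are my own additions, not part of what I actually ran, and it assumes splunkd is already stopped on the member.

```shell
# Sketch: move a member's _raft directory aside instead of deleting it.
# Run only while splunkd is stopped on this member.

# set_aside_raft SPLUNK_HOME
# Renames var/run/splunk/_raft to _raft_KO.<timestamp> and prints the
# new path. Returns non-zero if there is no _raft directory to move.
set_aside_raft() {
    raft_dir="$1/var/run/splunk/_raft"
    backup_dir="${raft_dir}_KO.$(date +%Y%m%d%H%M%S)"
    if [ ! -d "$raft_dir" ]; then
        echo "no _raft directory at $raft_dir" >&2
        return 1
    fi
    mv "$raft_dir" "$backup_dir" && echo "$backup_dir"
}
```

For example, set_aside_raft "$SPLUNK_HOME" followed by a normal start. The timestamp keeps repeated attempts from colliding with an earlier _raft_KO copy.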

esix_splunk
Splunk Employee

You can remove the whole $SPLUNK_HOME/var/run/splunk/_raft/* structure on that instance and restart it. As long as its SHC configuration is still valid, it will rejoin the SHC when it restarts. The cluster should rebalance automatically; if not, a rolling restart should fix it.

ppohar
Explorer

Compare $SPLUNK_HOME/var/run/splunk/_raft/server*.local8089/log with the same directory on the other cluster members. Replacing the complete log directory with a copy from a working cluster member should fix this problem.
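
The comparison step can be sketched with a recursive diff. This assumes the log directory from a working member has already been copied somewhere on the local host. The compare_raft_logs name and both directory arguments are hypothetical, and the sketch only reports differences. It does not replace anything.

```shell
# Sketch: compare a local raft log directory against a copy taken
# from a working cluster member.

# compare_raft_logs LOCAL_DIR REFERENCE_DIR
# Lists files that differ or exist on only one side.
# Exit status 0 means the two trees are identical.
compare_raft_logs() {
    if [ ! -d "$1" ] || [ ! -d "$2" ]; then
        echo "usage: compare_raft_logs LOCAL_DIR REFERENCE_DIR" >&2
        return 2
    fi
    diff -rq "$1" "$2"
}
```

A non-zero exit status with a list of differing files points at the entries worth looking at before replacing the directory.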
