Re: is it safe to delete the files and directories...

lmvmandadi · ‎07-11-2019

I am having issues with search head members not pushing changes to the captain. I read in one of the posts that I should delete the files and directories under splunk/var/run/ and then do a restart.

jnudell_2 · ‎07-13-2019

Hi @lmvmandadi ,

The answer to your question of whether it's safe to delete is, "it depends". If you're in an SHC environment, you might be able to remove items from that folder without major impact. However, given the questions and the responses in the thread you've posted, I would probably look at rebuilding that particular search head.
1. Stop the problem search head
2. Back up the /opt/splunk folder (or wherever $SPLUNK_HOME points)
3. Move or delete the /opt/splunk folder
4. Install a clean copy of Splunk at the same version as your other members
5. Add the clean install member to the SHC and let the SHC captain re-sync all the files
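
For step 2, here is a minimal Python sketch of one way to take the backup, assuming the search head is already stopped and the archive is written outside the folder being backed up. The function name `backup_splunk_home` and both paths are hypothetical, not part of Splunk; a plain `tar` or file copy would do the same job.

```python
import tarfile
from pathlib import Path


def backup_splunk_home(splunk_home, archive_path):
    """Write a gzip tar archive of splunk_home and return the archived member names.

    Hypothetical helper: archive_path must live outside splunk_home,
    otherwise the archive would try to include itself.
    """
    home = Path(splunk_home)
    if not home.is_dir():
        raise FileNotFoundError(f"{home} is not a directory")
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname keeps member paths relative to the folder name, e.g. "splunk/etc/..."
        tar.add(str(home), arcname=home.name)
    # Re-open the archive to confirm it is readable and list what was captured.
    with tarfile.open(archive_path, "r:gz") as tar:
        return tar.getnames()
```

Calling `backup_splunk_home("/opt/splunk", "/backup/splunk-home.tar.gz")` (example paths only) returns the list of archived entries, which is a quick check that the config files you care about made it into the backup before you move or delete the folder.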

The downside to the above is that you will lose any changes that you made on that particular server (not permanently, because we backed up all the config files, but the changes will need to be redone on the clean copy).

I think this might be a better solution, given that you would otherwise probably spend more time troubleshooting the SHC issues and trying to resolve them, and could introduce other problems into the SHC along the way.

Hope this helps.

ddrillic · ‎07-11-2019

Most of the data should be under splunk/var/run/splunk/dispatch.

-- In the dispatch directory, a search-specific directory is created for each search or alert. Each search-specific directory contains several files including a CSV file of the search results, a search.log file with details about the search execution, and more. These are 0-byte files.

You can read about it at Dispatch directory and search artifacts

The bottom of the page speaks about -

-- Clean up the dispatch directory based on the age of directories
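
To get a feel for what an age-based cleanup would touch before running anything, a read-only Python sketch like the one below can list the search-specific directories older than a threshold. This is not Splunk's own cleanup procedure (follow the documentation page for that); `stale_dispatch_dirs` is a hypothetical helper, and it uses each directory's modification time as the age, which is an assumption.

```python
import time
from pathlib import Path


def stale_dispatch_dirs(dispatch_dir, max_age_seconds, now=None):
    """Return the names of subdirectories whose mtime is older than max_age_seconds.

    Read-only: nothing is moved or deleted.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_seconds
    stale = []
    for entry in Path(dispatch_dir).iterdir():
        # Each search or alert gets its own directory; plain files are skipped.
        if entry.is_dir() and entry.stat().st_mtime < cutoff:
            stale.append(entry.name)
    return sorted(stale)
```

For example, `stale_dispatch_dirs("/opt/splunk/var/run/splunk/dispatch", 7 * 86400)` would report directories untouched for a week, assuming the default /opt/splunk install path mentioned earlier in the thread.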

tiagofbmm · ‎07-11-2019

What issues are you facing? Have you checked the internal logs for SHC deployer push errors first? Deleting the search artifacts residing in /var/run/ won't necessarily help.

lmvmandadi · ‎07-12-2019

For the past two days I have been having search head clustering issues: "Search head cluster member (https://hesplsrhc003:8089) is having problems pushing configurations to the search head cluster captain. Changes on this member are not replicating to other members."

I tried changing the captain, did a rolling restart, and ran the resync command, but I still have issues.

tiagofbmm · ‎07-12-2019

You can enable more aggressive logging for the SHC components and see what it says. Is the network connection between members OK? Can you check it?

lmvmandadi · ‎07-12-2019

Yes, I checked the logging. I see the error below, and the connection between the members is fine.

07-12-2019 14:50:57.458 -0400 ERROR ConfReplicationThread - Error pushing configurations to captain=https://hesplsrhc004:8089, consecutiveErrors=2333 msg="Error in acceptPush: Non-200 status_code=400: ConfReplicationException: Cannot accept push with outdated_baseline_op_id=3dfc93bbf15bcbb2d0c2c8b69d542d7d05181bb2; current_baseline_op_id=5d0509452c20f0c738813010a053ae57e4aefb64": Search head clustering: Search head cluster member (https://hesplsrhc002:8089) is having problems pushing configurations to the search head cluster captain (https://hesplsrhc004:8089). Changes on this member are not replicating to other members.
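
The useful parts of that line are the error count and the two baseline IDs. As a rough sketch, the Python below pulls them out with a regular expression built only from the fields visible in the message above. The function name `parse_push_error` is hypothetical, and the pattern is an assumption based on this single sample, not a documented log format.

```python
import re

# Pattern derived from the one ConfReplicationThread line quoted in this thread.
PUSH_ERROR = re.compile(
    r"consecutiveErrors=(?P<errors>\d+).*?"
    r"outdated_baseline_op_id=(?P<outdated>[0-9a-f]+);\s*"
    r"current_baseline_op_id=(?P<current>[0-9a-f]+)"
)


def parse_push_error(line):
    """Return the error count and baseline IDs from a push-error line, or None."""
    match = PUSH_ERROR.search(line)
    if match is None:
        return None
    return {
        "consecutive_errors": int(match.group("errors")),
        "member_baseline": match.group("outdated"),
        "captain_baseline": match.group("current"),
        # Differing IDs mean the member and the captain no longer share a baseline.
        "baseline_mismatch": match.group("outdated") != match.group("current"),
    }
```

Run against the line above, this reports 2333 consecutive errors and two different baseline IDs, so the member has been failing for a long time and is pushing from a baseline the captain no longer accepts.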

tiagofbmm · ‎07-12-2019

Got ya. That message is much more informative. SHC members need to inform the captain of the changes they make so that it can replicate them to the remaining members. The problem is that the baseline your member is pushing from is too far behind what the captain has. So you need to ensure all of them share a common baseline, meaning you need to resync them.

I'd start by running splunk show shcluster-status, then check last_conf_replication on a member that is not the captain and compare it to the captain's. A manual resync of the members should then be done so they share a common commit.

Follow the doc here: https://docs.splunk.com/Documentation/Splunk/7.3.0/DistSearch/HowconfrepoworksinSHC#Perform_a_manual...

lmvmandadi · ‎07-12-2019

Thank you for your reply. I have done the manual resync by running "splunk resync shcluster-replicated-config", but nothing has changed. I have ran this command from two days but no use.

The last replication for all the members is last_conf_replication : Fri Jul 12 17:11:48 2019, which I think is not an issue.

lmvmandadi · ‎07-12-2019

I got the following error for one of the members:

Downloaded an old snapshot created 91696 seconds ago; Check for clock skew on this member or the captain; If no clock skew is found, check the captain for possible snapshot creation failures

tiagofbmm · ‎07-12-2019

There's a parameter controlling when the changes are erased in server.conf: conf_replication_purge.eligibile_age. Its default is one day (86400 secs).
What do you mean by "I have ran this command from two days"?
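
To put the two numbers side by side: the snapshot in the error was 91696 seconds old, and the default purge age quoted above is 86400 seconds. A small Python sketch of that arithmetic follows. `snapshot_age_report` is a hypothetical helper, and whether the purge age is the actual cause of the error is not established in this thread.

```python
from datetime import timedelta

DEFAULT_PURGE_AGE_SECONDS = 86400  # one day, the default quoted above


def snapshot_age_report(age_seconds, purge_age_seconds=DEFAULT_PURGE_AGE_SECONDS):
    """Compare a snapshot age (in seconds) with the purge age and summarize the result."""
    over = age_seconds - purge_age_seconds
    return {
        "age": str(timedelta(seconds=age_seconds)),
        "older_than_purge_age": over > 0,
        "seconds_over": max(over, 0),
    }
```

For the value in the error, `snapshot_age_report(91696)` gives an age of "1 day, 1:28:16", which is 5296 seconds past the one-day default.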

lmvmandadi · ‎07-12-2019

I mean that I ran the manual resync command two days ago and again yesterday, but I still see the error.

As for the old snapshot: what can I change so that the resync picks up the latest one?

tiagofbmm · ‎07-12-2019

You need to run that command on the member that had the issue, and you should get "The member has been synced to the latest replicated configurations on the captain."
Is that what you've done? Did you run the resync command on the member with the issue?

lmvmandadi · ‎07-13-2019

Yes, I ran it on the search head that had the issue. Once it showed that it had synced with the latest replication, and sometimes it shows the clock skew error. I have raised a ticket with Splunk support.

is it safe to delete the files and directories under splunk/var/run/ for search heads
