Anyone using an SHC (Search Head Cluster) deploys apps from the Deployer. The deployer collapses each app's local and default config directories into default and pushes the config to the SHC members.
After normal usage, some of the knowledge objects in the app have evolved (for example, a saved search or a macro has been modified).
Eventually a new version of the app comes out and it has a butt kickin' nice new version of that very knowledge object. So, I stage the new version of the app on the deployer and push it out.
Unfortunately, the local folder edit of the knowledge object still takes precedence and the sweet new version (sitting in the default directory on the SHC members) is ignored.
How do we eliminate our version of the config and revert to the one in the default directory?
Since we can't delete the edited version of the knowledge object from the UI, and we can't manually edit the conf file, what is the recommended way to address this?
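To make the precedence problem concrete, here is a toy illustration (a scratch `demo/` directory standing in for the app, not Splunk itself) of the layering rule at work: a key set in local/ shadows the same key in default/, and only removing the local copy lets the default show through again.

```shell
# Toy illustration of Splunk's config layering: local/ shadows default/.
mkdir -p demo/default demo/local
cat > demo/default/macros.conf <<'EOF'
[aws-accesslog-sourcetype(1)]
definition = sourcetype=aws:accesslogs
EOF
cat > demo/local/macros.conf <<'EOF'
[aws-accesslog-sourcetype(1)]
definition = sourcetype=my:custom:accesslogs
EOF
# Splunk resolves local before default, so the local value wins:
awk -F' = ' '$1 == "definition" {print $2}' demo/local/macros.conf
# -> sourcetype=my:custom:accesslogs
# Deleting the local file is what makes the default value visible again:
rm demo/local/macros.conf
awk -F' = ' '$1 == "definition" {print $2}' demo/default/macros.conf
# -> sourcetype=aws:accesslogs
```

On a real instance you can see which file supplies each value with `splunk btool macros list --debug`.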
If you use the Deployer to send splunk_app_aws to your Search Head Cluster, you'll then have a bunch of cool knowledge objects that you can edit. Let's pretend I want to edit
aws-accesslog-sourcetype(1) to adjust it for my environment. Before the edit, this config lives ONLY in
$SPLUNK_HOME/etc/apps/splunk_app_aws/default/macros.conf on the Search Head Cluster members. I make my change in the UI and the result is this:
Notice my new definition of
blah, but there is no way to delete it or revert. There is now a corresponding version of this macro on the Search Head Cluster members in
$SPLUNK_HOME/etc/apps/splunk_app_aws/local/macros.conf defined as:
Now let's pretend that after some time, I want to remove my change and go back to the version provided in
$SPLUNK_HOME/etc/apps/splunk_app_aws/default/macros.conf. With a single search head OR a search head pool, I can simply remove the corresponding stanza in
$SPLUNK_HOME/etc/apps/splunk_app_aws/local/macros.conf with a text editor and restart the instance, thereby allowing the version in
$SPLUNK_HOME/etc/apps/splunk_app_aws/default/macros.conf to take effect.
Unfortunately, you cannot make manual edits to configuration in a Search Head Cluster. So is there a parallel way to remove your
$SPLUNK_HOME/etc/apps/splunk_app_aws/local/macros.conf version in a Search Head Cluster?
You can use deployer_push_mode = full | merge_to_default | local_only | default_only
You have to add it under the [shclustering] stanza in app.conf of the app you are deploying. I think you need Splunk 7.3.0 or higher.
Here's the doc.
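As a sketch (the app name is just this thread's example; check the app.conf spec for your version before relying on it), the staged app on the deployer would carry something like:

```ini
# On the deployer, in the staged app's app.conf
# (e.g. $SPLUNK_HOME/etc/shcluster/apps/splunk_app_aws/default/app.conf):
[shclustering]
# default_only pushes only the app's default directory, leaving member
# local files alone; merge_to_default restores the old collapse behavior.
deployer_push_mode = default_only
```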
What we do is disallow (mostly by policy) editing the production version of the apps. Users can work in their personal space but cannot promote anything into the production apps on the production Search Head. All development is done on a dev search head, and we have a packaging script that does things like set the version number to
YYYY.MM.DD, enable scheduled searches that are disabled but have a "ready for production" string in the
description section, remove backups from the
Lookup File Editor, etc. So we dev in dev, extract from dev, productionize with a script, then push out from the deployer.
Since the KO sync happens after you press save in the UI, you can delete the KO manually at any time from each search head and hit the debug/refresh endpoint on each SH after the change.
I’ve done this many times without issue.
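A walk-through of that cleanup, using a scratch directory standing in for $SPLUNK_HOME (a real member would use /opt/splunk, and the reload step is left as a comment because it needs a live splunkd):

```shell
# Scratch directory stands in for a real member's $SPLUNK_HOME.
SPLUNK_HOME=$(mktemp -d)
mkdir -p "$SPLUNK_HOME/etc/apps/splunk_app_aws/local"
printf '[aws-accesslog-sourcetype(1)]\ndefinition = my_override\n' \
  > "$SPLUNK_HOME/etc/apps/splunk_app_aws/local/macros.conf"

# 1. On EACH member, remove the override (the stanza, or the whole file
#    if it contains nothing else):
rm "$SPLUNK_HOME/etc/apps/splunk_app_aws/local/macros.conf"

# 2. Then reload config on each member, e.g.:
#    curl -k -u admin https://<member>:8089/debug/refresh
test ! -e "$SPLUNK_HOME/etc/apps/splunk_app_aws/local/macros.conf" \
  && echo "local override removed"
```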
I assume speed is important here. If you are slow to modify each SHC member, I would expect sync issues to percolate. This is only a concern here because the SHs are running and talking to the captain and each other.
If the instances are offline that won't be of concern.
But yes, this approach sounds workable, so thanks for sharing!
No, timing doesn't really matter.
You're deleting the local copy, which will still be referenced in memory; this doesn't trigger a replication, because the file is missing on the filesystem, not in memory.
So if you delete the local copy on all your search heads and then hit debug/refresh, no replication occurs. You haven't made any edits or created new config files that would trigger a replication event.
If you need to see for yourself, read it here:
"The cluster does not replicate any configuration changes that you make manually, such as direct edits to configuration files."
This is a perennial problem with Splunk, whereby the Deployer pushes to "default" on the SH members. What we do is
- Strict control for end users by process and roles: they cannot create dashboards/config elements in apps, only in their personal space. If anything is to be deployed to an app, it should come to the Splunk platform team and be deployed via the deployer.
- If you do have a local entry, the Splunk engineer should merge it after carefully reviewing source control + the deployer. Then sh-deploy and wait for it to finish. Then STOP all SH members, delete it from local on all of them at the same time, clear raft, restart, and redeploy from the deployer to be 100% sure.
But if you have this problem already, the only reliable workaround I've found is
- STOP all the search heads at approximately the same time.
- Take a backup of the files in the "local" directory, i.e. /opt/splunk/etc/apps/*/local/*.conf
- Ensure all search heads are stopped. (Yes, I know this is impacting... but hey, it's SHC pain.)
- Remove the local file itself from all search heads. Don't edit it.
- Start search heads and do a redeploy from deployer
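A dry-run sketch of the steps above, assuming three members named sh1..sh3 (hostnames and paths are placeholders); `run` just echoes each command so nothing here touches a real cluster:

```shell
#!/bin/sh
MEMBERS="sh1 sh2 sh3"                  # placeholder hostnames
run() { echo "+ $*"; }                 # dry-run; swap for real execution

for m in $MEMBERS; do run ssh "$m" /opt/splunk/bin/splunk stop; done
for m in $MEMBERS; do
  run ssh "$m" "cp -a /opt/splunk/etc/apps/splunk_app_aws/local /tmp/local.bak"
  run ssh "$m" "rm -rf /opt/splunk/etc/apps/splunk_app_aws/local"
done
for m in $MEMBERS; do run ssh "$m" /opt/splunk/bin/splunk start; done
# Finally, from the deployer:
run /opt/splunk/bin/splunk apply shcluster-bundle -target "https://sh1:8089"
```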
Please note, this was written in 2017 when it was Splunk 6.2.x. At that time, if you fixed one server at a time, by the time you restarted the next, the replicated settings would come back and reset the work you had done. Hence we had to stop them all.
I'm not sure how it is with Splunk 7.x or 8.x
You don't have to stop them all in any version of Splunk with SHC.
I've used my method since SHC was GA.
I didn't downvote, but your answer causes undue harm if anyone follows it. That's about the biggest criterion for a downvote, but you're koshyk and I'd rather give you the opportunity to remove or modify it.
The ways I can think of would be to either:
1. Stop all SHC members, remove the local folder for the app, start them up again.
2. Move the app off the SHC deployer, let the SHC members delete the app from themselves, then redeploy.
3. I've never tried this, but maybe it could work: instead of using the SHC deployer, deploy everything to the SHC captain from a regular deployment server or manually and see if it'll replicate everything across the rest of the members anyways.
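For what it's worth, option 2 on the deployer side might look roughly like this (dry-run with a placeholder member URI; `apply shcluster-bundle` is the standard push command):

```shell
#!/bin/sh
DEPLOYER_APPS=/opt/splunk/etc/shcluster/apps   # deployer staging area
TARGET=https://sh1.example.com:8089            # any member URI (placeholder)
run() { echo "+ $*"; }                         # dry-run; swap for real execution

# Park the app so the next push removes it from the members:
run mv "$DEPLOYER_APPS/splunk_app_aws" /tmp/splunk_app_aws.parked
run /opt/splunk/bin/splunk apply shcluster-bundle -target "$TARGET"
# Once the members have dropped the app (local overrides included),
# restore it and push a clean copy:
run mv /tmp/splunk_app_aws.parked "$DEPLOYER_APPS/splunk_app_aws"
run /opt/splunk/bin/splunk apply shcluster-bundle -target "$TARGET"
```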
Yea, but I don't think any of us are really satisfied with those, right?
Option 2 is the best, but I've also submitted a feature request for this. Essentially, non-SHC usage of Splunk allowed this scenario to be addressed trivially, and as such no deliberate feature was ever created to support it.
@SloshBurch, which option did you choose? I'm having the same issue and am not sure how to proceed. I am tempted to stop all SHC members and delete the KOs. However, I worry about the sync issues you mentioned above. Thanks.