Hello @gjanders
Firstly, kudos for contributing such a well-written app with such good documentation; the code is clean and easy to understand.
We are using an SH cluster (with 6 search head nodes).
Our current way of backing up is to do a nightly commit from each SH (writing to its own git repo).
The implementation/feature of restoring Knowledge objects selectively from a backup seemed very promising, which is why we chose to evaluate your app.
I noticed the following FAQ question that the documentation addresses. Considering SH clustering is very common, how would you suggest going about the configuration so as to be able to back up knowledge objects? (The documentation clearly warns against running it on a cluster.)
Will this work on a search head cluster?
No, modular inputs run on each member at the same time which would not work well...however you can use a standalone server to backup/restore to a search head cluster.
You could also run the input on a single search head cluster member but this is not a recommended solution.
Additionally, our Splunk URL internally is behind an F5 load balancer; that is to say, any user logging in gets redirected to one of the 6 search head nodes.
Regards,
Mukund M
Thanks for the question. While a modular input cannot run on a search head cluster, it can run on an independent search head or heavy forwarder.
In regard to your question:
"Additionally, our Splunk URL internally is behind an F5 load balancer; that is to say, any user logging in gets redirected to one of the 6 search head nodes."
All 6 members of the search head cluster should be in sync configuration-wise, so it should not matter which search head is reached via the REST API / load balancer; they should all return the same configuration information.
Therefore, as long as you have a non-clustered Splunk instance to use, for example the deployer server, you can use that server to run the backup/restore job against the search head cluster. In my environment I use the monitoring console server, as it is standalone, to back up/restore the main search head cluster.
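To make that concrete, here is a minimal sketch (the hostname, service account and password are made-up placeholders) of listing saved searches from whichever cluster member answers behind the load balancer on the management port; because the members are in sync, the result should be the same no matter which of the 6 nodes the F5 routes the request to:

```python
import requests

# Hypothetical values - replace with your own load balancer / VIP and service account.
SHC_URL = "https://shc-loadbalancer.example.com:8089"
AUTH = ("splunkVersionControl", "changeme")

# List saved searches (reports/alerts) via the standard Splunk REST endpoint.
resp = requests.get(
    f"{SHC_URL}/servicesNS/-/-/saved/searches",
    params={"output_mode": "json", "count": 0},
    auth=AUTH,
    verify=False,  # only if the management port uses a self-signed certificate
)
resp.raise_for_status()

# Any member should return the same list because the cluster replicates knowledge objects.
for entry in resp.json()["entry"]:
    print(entry["acl"]["app"], entry["acl"]["owner"], entry["name"])
```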
"In fact you can run multiple inputs backing up and restoring multiple search head clusters from the 1 deployer if you wish, they would just need unique names and unique paths on the filesystem to store temporary files." yes, we set up modular inputs on the standalone instance, it works.
So does the app need to be installed on the remote SH?
Yes! The Python code exists within the app; while you would not use the dashboard remotely, you would configure the modular inputs...
So the inputs are configured on both the main install (in our case the deployer) and on a search head, is that correct?
Configure the input on the main install (deployer in your case), and not on the search head members...
Everything can be done remotely by design...
Thanks @gjanders for such a quick response.
We also have a standalone Deployer Server (a cluster master node), where the Monitoring Console is running.
We manage the configs to be pushed to each search head member in a git repository; once PRs are merged, the changes get pushed onto the deployer server via Ansible scripts.
To understand better, how would the following scenario be handled?
Back up all knowledge objects (dashboards, alerts and reports) created in the Search & Reporting app.
As you correctly mentioned, all 6 members of the search head cluster should be in sync, which means any knowledge objects created should be available across all 6 nodes.
As per the following comment:
"In my environment I use the monitoring console server, as it is standalone, to back up/restore the main search head cluster."
How would the deployer know about the knowledge objects that need to be backed up from the search head?
Regards
Mukund M
Hi @mnm1987
Were you able to test out @gjanders' solution? Did it work? If yes, please don't forget to resolve this post by clicking on "Accept". If you still need more help, please provide a comment with some feedback.
Thanks!
Hey @asiddique_splunk ,
I am in the process of documenting the list of items/issues that were preventing us from moving forward.
Will update soon.
Hey @gjanders
The fallback of using a standalone Splunk instance was good, but it brought the following limitations/additional work:
Configuring the modular input for backup
Our Splunk search heads use SSO with Okta (with users being mapped from Active Directory). To be able to log in, we resorted to creating a user with the default authentication that comes with Splunk, all because we can't "useLocalAuth" and we didn't want the plain-text password of a logged-in user to be shared anywhere.
srcURL : https://remote-splunksrch-instance/en-US/account/login?loginType=splunk
srcUsername : splunkVersionControl
srcPassword : "shows up in plain-text"
gitRepoURL :
Our git repo user permissions are driven by Active Directory as well; although not a blocker, in order to proceed we need to tie in the same logged-in user so that it has permissions on the repo.
Similar issues would hold true for restoring.
Finally, the SplunkVersionControl Restore dashboard by default shows the knowledge objects from the localhost Splunk instance. This leads to some ambiguity, since our aim is to restore knowledge objects that were backed up from the remote Splunk instance.
We definitely plan to revisit using the app at a later time.
Again thanks for your inputs.
Regards
Mukund M
srcPassword : "shows up in plain-text"
Unfortunately the modular input framework by default uses plain-text passwords. I could use the Splunk passwords.conf file, but that would make things much more complicated, as you would have to add passwords on one screen and create the modular input on another.
Or I'd have to write a bit of code to encrypt/decrypt the password in some way; either way, it didn't fit my scenario.
EDIT: Ideally Splunk would add the password encryption here, but they did not, so I have not attempted to work around it.
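For reference, a rough sketch of what reading a credential back from Splunk's encrypted credential store (storage/passwords, i.e. passwords.conf) could look like; the app name, admin credentials and username matching below are illustrative assumptions, and this is not how the app currently behaves:

```python
import requests

# Hypothetical values for illustration only.
SPLUNK_MGMT = "https://localhost:8089"
APP = "SplunkVersionControl"
ADMIN_AUTH = ("admin", "changeme")

# storage/passwords keeps the value encrypted on disk (passwords.conf), but an
# authorised caller can read clear_password back over REST at runtime.
resp = requests.get(
    f"{SPLUNK_MGMT}/servicesNS/nobody/{APP}/storage/passwords",
    params={"output_mode": "json", "count": 0},
    auth=ADMIN_AUTH,
    verify=False,
)
resp.raise_for_status()

src_password = None
for entry in resp.json()["entry"]:
    if entry["content"].get("username") == "splunkVersionControl":
        src_password = entry["content"]["clear_password"]
        break
```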
"gitRepoURL : our git repo user permissions are driven by Active Directory as well; although not a blocker, in order to proceed we need to tie in the same logged-in user so that it has permissions on the repo."
OK, this is also not an issue in my environment. The code is open source and you are welcome to fork a copy and/or contribute a pull request if you choose to deal with this scenario.
"Finally, the SplunkVersionControl Restore dashboard by default shows the knowledge objects from the localhost Splunk instance. This leads to some ambiguity, since our aim is to restore knowledge objects that were backed up from the remote Splunk instance."
You use the dashboard on the local instance where you want the object restored; that updates a lookup file which the remote instance reads and executes against.
The important point here is that the dashboard is run on the instance where the end user wants something restored; they never need to know where the restore actually happens from...
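As a rough sketch of that flow (the hostname, credentials and lookup name below are illustrative placeholders, not the app's actual names), the standalone instance could periodically run an inputlookup search against the remote search head cluster to pick up pending restore requests:

```python
import requests

# Hypothetical values for illustration only.
SHC_URL = "https://shc-loadbalancer.example.com:8089"
AUTH = ("splunkVersionControl", "changeme")
RESTORE_LOOKUP = "splunkversioncontrol_restorelist"  # assumed lookup name

# Run "| inputlookup <lookup>" on the remote cluster as a oneshot search and
# read back any rows the dashboard has written (i.e. pending restore requests).
resp = requests.post(
    f"{SHC_URL}/services/search/jobs",
    data={
        "search": f"| inputlookup {RESTORE_LOOKUP}",
        "exec_mode": "oneshot",
        "output_mode": "json",
        "count": 0,
    },
    auth=AUTH,
    verify=False,
)
resp.raise_for_status()

for row in resp.json().get("results", []):
    print("restore requested:", row)
```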
Hey @gjanders , Thanks again for the clarifications, appreciate it.
The application requests a URL that is used to choose where to back up from or restore to; this URL can be set to localhost, or it can be set to another Splunk instance.
In your example the URL would be set to the search head cluster URL, so the modular input is running on the deployer server but backing up and restoring to the remote search head cluster.
In fact you can run multiple inputs backing up and restoring multiple search head clusters from the 1 deployer if you wish, they would just need unique names and unique paths on the filesystem to store temporary files.
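As a sketch of running multiple inputs from the one standalone instance, the snippet below creates two hypothetical backup inputs via the REST API, one per search head cluster, each with a unique name and its own git repository. The modular input type name here is an assumption, and the parameter for the unique temporary-file path is omitted because its exact name is not shown in this thread; check the app's inputs.conf.spec for the real names.

```python
import requests

# Management port of the standalone instance (deployer/monitoring console) running the inputs.
STANDALONE_MGMT = "https://localhost:8089"
ADMIN_AUTH = ("admin", "changeme")

# Assumed modular input type name - verify against the app's inputs.conf.spec.
INPUT_TYPE = "splunkversioncontrol_backup"

# One configuration per search head cluster; each needs a unique input name,
# its own git repository, and (per the post above) its own filesystem path for
# temporary files on this host.
inputs = [
    {
        "name": "shc_prod",
        "srcURL": "https://shc-prod-lb.example.com:8089",
        "srcUsername": "splunkVersionControl",
        "srcPassword": "changeme",
        "gitRepoURL": "git@git.example.com:splunk/shc_prod_backup.git",
    },
    {
        "name": "shc_dev",
        "srcURL": "https://shc-dev-lb.example.com:8089",
        "srcUsername": "splunkVersionControl",
        "srcPassword": "changeme",
        "gitRepoURL": "git@git.example.com:splunk/shc_dev_backup.git",
    },
]

for params in inputs:
    resp = requests.post(
        f"{STANDALONE_MGMT}/servicesNS/nobody/SplunkVersionControl/data/inputs/{INPUT_TYPE}",
        data=params,
        auth=ADMIN_AUTH,
        verify=False,
    )
    resp.raise_for_status()
```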