I have about two months of experience with Splunk so far, so apologies if this is a dumb question:
Can a custom deployment app remove $SPLUNK_HOME/etc/instance.cfg?
Additional background:
We have >3,000 deployment clients, and ~600 of them do not have unique Client IDs. To fix this, I found that we simply need to remove $SPLUNK_HOME/etc/instance.cfg and then restart Splunk, so that a new GUID is generated at startup. Instead of hunting down the many server admins responsible for those hosts, I was hoping we could accomplish this via a deployment app. To prevent the app from repeating this on the same host, I might have to implement logic similar to this:
if [[ ! -e /opt/splunkforwarder/etc/instance.cfg.dup_guid ]]; then
    mv /opt/splunkforwarder/etc/instance.cfg /opt/splunkforwarder/etc/instance.cfg.dup_guid
    /opt/splunkforwarder/bin/splunk restart
fi
Manually adding the ~600 affected hosts to my custom app's server class might be tedious, but I still think this will be easier and quicker than hunting down the server admins.
Appreciate your thoughts.
The config files in an app cannot overwrite instance.cfg. You can, however, deploy a scripted input in an app and have that scripted input delete instance.cfg and restart Splunk. Your existing shell script should work fine as a scripted input. Be sure to configure the input with interval = -1 so that it runs only once, when Splunk starts.
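The guard logic described above could also be written without the hard-coded /opt/splunkforwarder path. The following is only a sketch: move_instance_cfg_once is a hypothetical helper name, and it assumes SPLUNK_HOME is set in the environment when Splunk launches the script. If that assumption does not hold on your forwarders, the final block simply does nothing.

```shell
#!/bin/bash
# Sketch: move instance.cfg aside once, using the .dup_guid file as a marker.
# move_instance_cfg_once is a hypothetical helper name, not part of Splunk.
move_instance_cfg_once() {
    local splunk_home="$1"
    local cfg="$splunk_home/etc/instance.cfg"
    local marker="$cfg.dup_guid"

    # Marker already present: the cleanup has run on this host before.
    [[ -e "$marker" ]] && return 1
    # No instance.cfg to move: nothing to do.
    [[ -f "$cfg" ]] || return 1

    # Keep the old file as the marker instead of deleting it.
    mv "$cfg" "$marker"
}

# Assumption: SPLUNK_HOME is exported to scripted inputs.
# Restart only when the file was actually moved by this run.
if [[ -n "${SPLUNK_HOME:-}" ]]; then
    if move_instance_cfg_once "$SPLUNK_HOME"; then
        "$SPLUNK_HOME/bin/splunk" restart
    fi
fi
```

Because the old instance.cfg is renamed rather than deleted, the original GUID stays on disk if it ever needs to be looked up, and the marker file prevents a second restart on later runs.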
Thanks! As described above, I went ahead and created guidcleanup.sh like so:
#!/bin/bash
if [[ ! -e /opt/splunkforwarder/etc/instance.cfg.dup_guid ]]; then
    mv /opt/splunkforwarder/etc/instance.cfg /opt/splunkforwarder/etc/instance.cfg.dup_guid
    /opt/splunkforwarder/bin/splunk restart
fi
To test this script on a forwarder, I tried using the following commands:
cd /opt/splunkforwarder/bin; ./splunk cmd ../etc/apps/guid_cleanup/bin/guidcleanup.sh
...but it failed. The failure revealed that our previous Splunk admins had been rolling out Splunk forwarders with splunkd running as root!
To fix this particular forwarder, I stopped splunkd, recursively chowned /opt/splunkforwarder to splunk:splunk, then restarted splunkd. After doing this, the above commands ran successfully.
At this point, I'm having difficulty figuring out how to get Splunk to execute guidcleanup.sh automatically (even after I remove instance.cfg.dup_guid).
Here's some info from the forwarder:
-bash-4.2$ ls -l ~/etc/apps/guid_cleanup/bin/*sh
-rwx------ 1 splunk splunk 276 Oct 18 14:40 /opt/splunkforwarder/etc/apps/guid_cleanup/bin/guidcleanup.sh
-bash-4.2$ grep guidcleanup.sh ~/var/log/splunk/splunkd.log*
-bash-4.2$ id
uid=3003(splunk) gid=44399(splunk) groups=44399(splunk)
Here is the forwarder's ~/etc/apps/guid_cleanup/local/inputs.conf:
[script://./bin/guidcleanup.sh]
disabled = false
index = main
interval = -1
Unfortunately, using the full path in the stanza didn't improve things either:
[script://$SPLUNK_HOME/etc/apps/guid_cleanup/bin/guidcleanup.sh]
disabled = false
index = main
interval = -1
How do I get Splunk to execute guidcleanup.sh for me?
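One way to narrow this down is to rule out the on-disk prerequisites first. The sketch below only inspects files: check_scripted_input is a hypothetical helper, and the local/ and default/ locations are the two places it assumes the stanza might live. It does not tell you whether splunkd has actually loaded the input. For that, the merged configuration from "splunk btool inputs list --debug" and any ExecProcessor messages in splunkd.log are probably the next things to look at.

```shell
#!/bin/bash
# Sketch: file-level sanity checks for a scripted input inside an app.
# check_scripted_input is a hypothetical helper; it does not query splunkd.
check_scripted_input() {
    local app_dir="$1" script_name="$2"
    local script="$app_dir/bin/$script_name"
    local rc=0 conf found=no

    # The script must exist and be executable by the current user.
    if [[ ! -f "$script" ]]; then
        echo "missing: $script"
        rc=1
    elif [[ ! -x "$script" ]]; then
        echo "not executable by uid $(id -u): $script"
        rc=1
    fi

    # Look for a stanza header that ends with the script name.
    for conf in "$app_dir/local/inputs.conf" "$app_dir/default/inputs.conf"; do
        if [[ -f "$conf" ]] && grep -Fq "$script_name]" "$conf"; then
            echo "stanza found in: $conf"
            found=yes
        fi
    done
    if [[ "$found" == no ]]; then
        echo "no stanza for $script_name in local/ or default/ inputs.conf"
        rc=1
    fi

    return "$rc"
}
```

Run as the same user that splunkd runs as, for example: check_scripted_input /opt/splunkforwarder/etc/apps/guid_cleanup guidcleanup.sh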
Lastly, I'm guessing the vast majority of the 600 forwarders have splunkd running as root. If this is the case, then would I be able to run "chown -R splunk:splunk /opt/splunkforwarder" as a scripted input? If not, then I guess there really is no other option than having the Server Admins address these issues themselves.
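Before attempting any chown from a scripted input, it may be worth confirming the guess about root. The sketch below changes nothing. It prints the numeric UID the script runs as and the numeric owner of the Splunk directory. report_ownership is a hypothetical helper, and the sketch assumes SPLUNK_HOME is set in the script's environment. Since a scripted input's standard output is indexed as events, the result should be searchable afterwards.

```shell
#!/bin/bash
# Sketch: report-only check of who runs the script and who owns the directory.
# report_ownership is a hypothetical helper; it modifies nothing.
report_ownership() {
    local dir="$1"
    local run_uid owner_uid

    # Numeric UID of the current process (0 means root).
    run_uid="$(id -u)"
    # Numeric owner of the directory, third field of "ls -ldn".
    owner_uid="$(ls -ldn "$dir" | awk '{print $3}')"

    echo "run_as_uid=$run_uid owner_uid=$owner_uid dir=$dir"
}

# Assumption: SPLUNK_HOME is exported to scripted inputs.
if [[ -n "${SPLUNK_HOME:-}" ]]; then
    report_ownership "$SPLUNK_HOME"
fi
```

Hosts that report run_as_uid=0 are the ones where splunkd, and therefore the scripted input, runs as root.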
Thanks again for your input.