Since it's a best practice to install Splunk and run it as a non-root UNIX user, how can I make sure Splunk has the necessary read permissions for the files it needs to monitor?
This UNIX challenge makes it all too tempting to install as root...
Yes, running as a non-root user is a best practice. In fact, there's an entire documentation page on it: Run Splunk Enterprise as a different or non-root user.
On the other hand, when users stop seeing their data in Splunk, they are quick to blame Splunk...even when the real cause is that they removed the read permissions Splunk needs on the relevant data.
If you don't have a lot of experience with UNIX permissions, do some independent learning first, because the following information depends on you knowing the basics.
Imagine we have a densely packed web server. By this, I mean there are tens of instances of Apache web server running simultaneously. While this design might seem strange given the proliferation of containers and advanced networking, the fact is that many enterprises still have such designs lying around from the Web 2.0 era, before virtualization or cloud became widespread.
Each web server instance is running as a different user id with its configuration and binary files owned by that same id. This design is intentional to constrain the potential impact as various teams log in to the shared server and need to manage their instance but not impact any other teams'.
For our scenario, we will pretend that webteam1 is running their web server as user webteam, and that webteam owns the installed files and directory.
As the Splunk admin, you need to set up a monitor of the log files for all web servers running on that server. Each UNIX user account creates and owns separate log files, and Splunk is running as the 'splunk' user. Permission challenges ensue, and this is where many engineers simply run Splunk as root to circumvent the permissions issues...which is a no-no.
Recall that UNIX has three permission categories, or modes: user, group, and other. You can give each category read, write, and/or execute permissions.
To make our scenario more challenging, all web team user accounts share the same group, but your Splunk ID is not part of that group. Making things worse, security compliance prohibits any read access for 'other'. The result is that Splunk cannot read the data: its account is not the owning user, it is not a member of the group, and 'other' has no read permission.
As you might have figured out, a simple solution is to add 'splunk' as a member of that group. If you're a new UNIX user, this may not be obvious, because a secondary group is not the group assigned when the splunk account creates new files or directories; that one is the primary group. Nonetheless, as a member of the group, the 'splunk' account can now read the needed data.
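To make the category logic above concrete, here is a minimal Python sketch of how classic UNIX permissions decide read access for a non-root process. The function name and the numeric IDs (1001 for webteam, 2000 for the shared web group, 1500 for splunk) are made up for illustration, and the sketch ignores ACLs and root.

```python
import stat

def can_read(mode, file_uid, file_gid, uid, gids):
    """Return True if a non-root process (uid, gids) may read a file
    with the given mode bits and ownership.

    Only the first matching category applies: owner, then group, then other.
    """
    if uid == file_uid:
        return bool(mode & stat.S_IRUSR)
    if file_gid in gids:
        return bool(mode & stat.S_IRGRP)
    return bool(mode & stat.S_IROTH)

# Hypothetical IDs: the log file is owned by webteam (1001), group 2000, mode rw-r-----
LOG_MODE = 0o640
before = can_read(LOG_MODE, 1001, 2000, uid=1500, gids={1500})        # splunk not in group
after = can_read(LOG_MODE, 1001, 2000, uid=1500, gids={1500, 2000})   # splunk added to group
```

Here `gids` holds the primary group plus any secondary groups, which is why adding 'splunk' to the web group as a secondary member is enough.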
Before you run off, pay attention to the directories. Folks often forget that for a user to access the files within a directory, the directory itself must grant execute permission. If Splunk cannot read a file, yet the file's permissions appear correct, review the permissions on the directory (and its parent directories) to make sure the execute permission is set for the group as well.
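One way to run that directory check is a short Python sketch like the following. The function name is hypothetical, and it simplifies things by looking only at the group execute bit on every ancestor directory, even where the 'other' category would actually apply (for example, on top-level system directories).

```python
import os
import stat

def dirs_missing_group_execute(path):
    """Return the ancestor directories of path that lack the group
    execute (search) bit, starting with the closest one."""
    missing = []
    current = os.path.dirname(os.path.abspath(path))
    while True:
        mode = os.stat(current).st_mode
        if not mode & stat.S_IXGRP:
            missing.append(current)
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        current = parent
    return missing
```

Run it against the full path of a log file Splunk cannot read. Any directory it returns is a candidate for the missing execute bit, though you still need to confirm which group owns that directory.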
I understand there might be more advanced approaches possible: things like group access control lists (ACLs), cron with scripts, umasks, or maybe things I haven't considered or never learned about. I've kept the discussion to the basics because the more advanced the solution gets, the less likely someone in a rush is to implement it...and the more likely they are to revert to the devil we know, running as root.
With that in mind, all are encouraged to share ways to strengthen this approach, knowing our collective challenge: keep the solutions simple, common to effectively all UNIX flavors, and durable.
We created a Splunk group and follow this same basic line of thought.
Splunk is the user name, and Splunk is the primary group that user belongs to. Splunk runs under user Splunk. Files and anything else that need to be monitored are either placed in a folder where the Splunk group has read permissions, or we chmod the folder the files are in to give the Splunk group those permissions.
You can get much more granular than that even.
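As a rough illustration of the chmod step, here is a Python sketch that adds group read and execute to a folder and group read to the regular files directly inside it. The function name is hypothetical, and it assumes the folder and files already belong to the group that the Splunk user is a member of. It changes mode bits only, not ownership.

```python
import os
import stat

def grant_group_read(directory):
    """Add group read+execute to a directory and group read to the
    regular files directly inside it, leaving other bits unchanged."""
    dir_mode = stat.S_IMODE(os.stat(directory).st_mode)
    os.chmod(directory, dir_mode | stat.S_IRGRP | stat.S_IXGRP)
    for entry in os.scandir(directory):
        if entry.is_file(follow_symlinks=False):
            file_mode = stat.S_IMODE(entry.stat(follow_symlinks=False).st_mode)
            os.chmod(entry.path, file_mode | stat.S_IRGRP)
```

It does not recurse into subdirectories, and files created later still get whatever the writing process's umask dictates.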
I love it!
So that means you chown to add the splunk group, right? Because if a different group owns the files, then the splunk group would fall into the 'other' perms, right?
In an enterprise situation, installing Splunk as root is not even a remote option.
-- Since it's a best practice to install Splunk and run it as a non-root UNIX user, how can I make sure Splunk has the necessary read permissions for the files it needs to monitor?
The Splunk install and read access to the logs are, in my mind, two different subjects. For read access, we can either allow other to have read access to the files or, better yet, make the Splunk user part of the group which owns the file and ensure that this group has read access to the file.
File Permissions in Linux/Unix with Example explains these three entities -
Agreed. The challenge I've often run into is that other can't have read access because it's too permissive. So given that constraint combo of a non-root install plus no read for other, I tried to focus on how we've solved this with the group permissions. Given those constraints, would you agree with what's outlined in the other answer? I ask to make sure we don't overlook or trivialize the essential details you are sharing with us.
@SloshBurch - no doubt - in an enterprise setting the group way is the standard way. In these cases, various application teams own these groups and we, as the splunk team, keep requesting to add the splunk id to these various groups. Then we run into another problem, which is that the splunk id belongs to too many groups, and in Solaris it was so tricky that we ended up resorting to the