Which UNIX permissions are best for monitoring files?

sloshburch
Splunk Employee

Since it's a best practice to install Splunk and run it as a non-root UNIX user, how can I make sure Splunk has the necessary read permissions for the files it needs to monitor?

This UNIX challenge makes it all too tempting to install as root...

1 Solution

sloshburch
Splunk Employee

The Splunk Product Best Practices team provided this response. Read more about How Crowdsourcing is Shaping the Future of Splunk Best Practices.

Yes, running as a non-root user is a best practice. In fact, there's an entire documentation page on it: Run Splunk Enterprise as a different or non-root user.

On the other hand, when users stop seeing their data in Splunk, they are quick to blame Splunk...even when the real cause is that they removed the read permissions Splunk needs on the relevant data.

If you don’t have a lot of experience with UNIX permissions, do some independent learning first, because the following information depends on you knowing the basics.

Scenario

Imagine we have a densely packed web server. By this, I mean there are tens of instances of Apache web server running simultaneously. While this design might seem strange given the proliferation of containers and advanced networking, the fact is that many enterprises still have such designs lying around from the Web 2.0 era, before virtualization or even cloud took off.

Each web server instance is running as a different user id with its configuration and binary files owned by that same id. This design is intentional: it constrains the potential impact as various teams log in to the shared server to manage their own instance without affecting any other team's.

For our scenario, we will pretend that webteam1 is running their web server as user webteam, installed to /opt/webserver/webteam1 where webteam owns the files and directory.

As the Splunk admin, you need to set up a monitor of the log files for all web servers running on that server. Each UNIX user account creates and owns separate log files, and Splunk is running as the 'splunk' user. Permission challenges ensue, and this is where many engineers simply run Splunk as root to circumvent the permissions issues...which is a no-no.

Permission Constraints

Recall that UNIX has three permission categories, or modes: user, group, and other. You can give each category read, write, and/or execute permissions.
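
As a quick illustration of those three categories, here is a minimal Python sketch that splits a numeric mode into its user, group, and other parts. The function name describe_mode is made up for this example; stat.filemode comes from the Python standard library.

```python
import stat

def describe_mode(mode):
    """Split a mode's permission bits into user/group/other rwx strings."""
    # stat.filemode renders something like '-rw-r-----'; the first
    # character is the file type, followed by three rwx triplets.
    text = stat.filemode(mode)
    return {"user": text[1:4], "group": text[4:7], "other": text[7:10]}

# A mode of 640: owner may read and write, group may read, other gets nothing.
example = describe_mode(0o640)
```

A mode of 640 therefore gives the owner read and write, the group read only, and other no access at all.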

To make our scenario more challenging, all web team user accounts share the same group, but your Splunk ID is not part of that group. Making things worse, security compliance prohibits any read access for 'other'. The result is that Splunk cannot read the data: its account is not the owner, it is not a member of the group, and 'other' has no read permission.
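
To make the failure concrete, the sketch below mimics how the classic permission check picks exactly one category for a given account. It is a simplified model that assumes no ACLs and a non-root account; the function name and all numeric IDs are hypothetical.

```python
import stat

def can_read(mode, owner_uid, owner_gid, uid, gids):
    """Simplified model of the classic UNIX read check (no ACLs, not root)."""
    # Only the first matching category is consulted: user, then group, then other.
    if uid == owner_uid:
        return bool(mode & stat.S_IRUSR)
    if owner_gid in gids:
        return bool(mode & stat.S_IRGRP)
    return bool(mode & stat.S_IROTH)

# Hypothetical IDs: the log is owned by uid 2001, group 3000, with mode 640.
LOG_MODE, LOG_UID, LOG_GID = 0o640, 2001, 3000

# 'splunk' (uid 1001) only belongs to its own group (1001), so it falls
# into 'other', which has no read bit under the compliance rule.
splunk_denied = can_read(LOG_MODE, LOG_UID, LOG_GID, uid=1001, gids={1001})
```

Note that the categories are not cumulative: an account that matches the group category is judged by the group bits only, even if 'other' would have been more generous.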

Secondary Groups FTW

As you might have figured out, a simple solution is to add 'splunk' as a member of that group. If you're a new UNIX user, this may not be obvious, because a secondary group is not the group assigned when the splunk account creates new files or directories; that one is the primary group. Nonetheless, as a member of the group, the 'splunk' account can now read the needed data.
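
If you want to see the primary versus secondary distinction for yourself, here is a small Python sketch that reports the groups of the process it runs in. The function names are invented for this example; os.getegid and os.getgroups are standard library calls on UNIX.

```python
import os

def process_groups():
    """Return (primary_gid, secondary_gids) for the current process."""
    primary = os.getegid()
    # getgroups() may or may not include the primary gid, so filter it out.
    secondary = sorted(g for g in set(os.getgroups()) if g != primary)
    return primary, secondary

def in_group(gid):
    """True if this process would match the 'group' category for gid."""
    return gid == os.getegid() or gid in os.getgroups()

primary, secondary = process_groups()
print("primary gid:", primary)
print("secondary gids:", secondary)
```

Run it as the 'splunk' account to see what that account's processes actually carry. Keep in mind that a running process keeps the group list it started with, so a newly added group generally only takes effect for new logins and restarted processes.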

Files vs Directories

Before you run off, pay attention to the directories. Folks often forget that for users to access files within a directory, the directory must grant execute permission, and so must every parent directory on the path. If Splunk cannot read a file, yet the file's permissions appear correct, review the directories' permissions to make sure the execute permission is set for the group as well.
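
Checking every directory by hand gets tedious, so here is a rough Python sketch that walks from a starting directory down to the file and reports anything that would stop a given account. It uses a simplified model of the classic permission check (no ACLs, not root), and the function names, the start parameter, and the IDs you pass in are all assumptions for illustration.

```python
import os

def _class_bits(st, uid, gids):
    """Return the rwx bits (0-7) of the single category that applies."""
    if uid == st.st_uid:
        return (st.st_mode >> 6) & 7
    if st.st_gid in gids:
        return (st.st_mode >> 3) & 7
    return st.st_mode & 7

def find_blockers(path, uid, gids, start="/"):
    """List what stops uid/gids from reading path, checking from start down.

    Assumes path lies underneath start.
    """
    path = os.path.abspath(path)
    start = os.path.abspath(start)
    # Build the chain of directories: start, then each level down to the parent.
    dirs = [start]
    rel = os.path.relpath(os.path.dirname(path), start)
    for part in rel.split(os.sep):
        if part != ".":
            dirs.append(os.path.join(dirs[-1], part))
    # Directories need execute (bit value 1) to be traversed.
    blockers = [d for d in dirs if not _class_bits(os.stat(d), uid, gids) & 1]
    # The file itself needs read (bit value 4).
    if not _class_bits(os.stat(path), uid, gids) & 4:
        blockers.append(path)
    return blockers
```

For example, calling find_blockers("/var/log/vpn.log", uid, gids) with the numbers that the id command reports for 'splunk' returns an empty list when nothing on the path is in the way, and otherwise names the directories or file to look at. The script must itself be allowed to stat each path, so run it as an account that can see them.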

Advanced

I understand there might be more advanced approaches. Things like group access control lists (ACLs), cron with scripts, umasks, or maybe things I haven't considered or never learned about. I've kept the discussion to the basics because the more advanced the solution gets, the less likely someone in a rush is to work through it...and the more likely they are to simply revert to the devil we know, running as root.

With that in mind, all are encouraged to share ways to strengthen this approach, knowing our collective challenge: keep the solutions simple, common to effectively all UNIX flavors, and durable.


mbw
Observer


I have been a unix admin for many years and understand unix/linux permissions well.

What I am seeing on Splunk Server version 8.2.6 is that the splunk local file "tailing processor" was not able to read a file that was set as sysadmin:adm and 640.

We have a file called "/var/log/vpn.log" that is produced and populated by rsyslog

The file has permissions like this:

uid=1001(splunk) gid=1001(splunk) groups=1001(splunk),4(adm)
root@ika:/all/scripts# ls -la /var/log/vpn*
-rw-r----- 1 syslog adm 2081292 Nov 30 12:37 /var/log/vpn.log


If I do a "su - splunk" to become the splunk user, I can for sure read that vpn.log file.
But the tailing process still can't read it unless I set it to world (other) readable.

In order to get the tailing process to read the vpn.log file, I had to also add the user "splunk" to the group "sysadmin" which has read permissions on the containing directory /var/log

You'd think that blocked permissions on the containing directory (/var/log) wouldn't let a world-readable file be accessed either, so something weird is going on.


Once I added splunk to the syslog group (it was already in the adm group) so the directory permissions worked, the splunk tailing process could read the vpn.log file that wasn't world readable.

Maybe I don't understand unix permissions as well as I thought - does anyone else find this strange?


ddrillic
Ultra Champion

In an enterprise situation, installing Splunk as root is not even a remote option.

-- Since it's a best practice to install Splunk and run it as a non-root UNIX user, how can I make sure Splunk has the necessary read permissions for the files it needs to monitor?

The Splunk install and read access to the logs are, in my mind, two different subjects. As for read access, we can either allow 'other' to have read access to the files or, better yet, make the Splunk user part of the group that owns the file and ensure that this group has read access to the file.

File Permissions in Linux/Unix with Example explains these three entities -


sloshburch
Splunk Employee

Agreed. The challenge I've often run into is that 'other' can't have read access because that's too permissive. So given that constraint combo of non-root install plus no read for 'other', I tried to focus on how we've solved this with group permissions. Given those constraints, would you agree with what's outlined in the other answer? I ask to make sure we don't overlook or trivialize the essential details you are sharing with us.


ddrillic
Ultra Champion

@SloshBurch - no doubt - in an enterprise setting the group way is the standard way. In these cases, various application teams own these groups and we, as the splunk team, keep requesting to add the splunk id to these various groups. Then we run into another problem, which is that the splunk id belongs to too many groups, and in Solaris it was so tricky that we ended up resorting to the other solution.


swagner1965
Path Finder

We created a Splunk group and follow this same basic line of thought.

Splunk is the user name, and Splunk is the primary group that user belongs to. Splunk runs under user Splunk. Files, etc. that need to be monitored are either placed in a folder where the Splunk group has read permissions, or we chmod the folder the files are in to give the Splunk group those permissions.

You can get much more granular than that even.


sloshburch
Splunk Employee

I love it!
So that means you chmod AND chown to add the splunk group, right? Cause if a different group owns then splunk group would fall into the 'other' perms, right?


swagner1965
Path Finder

yeah, I am not a unix admin by a long shot but yes, that is it. Sorry I was not more clear.


sloshburch
Splunk Employee

Not the same, but a related post about how Splunk behaves when permissions get messed up: Permissions on monitored files


sheenarustomji
New Member

Agreed. Running as a non-root user is a best practice.
