I recently installed the Splunk App and Add-on for Unix and Linux in our Splunk environment. We have a distributed search environment that includes clustered indexers, multiple search heads, and heavy forwarders. Some of our servers are running syslog-ng to collect logs from other systems, and all of those syslog-ng logs are stored on our indexers under /var/log/syslog-ng. The other servers are running rsyslogd.
My question is this: I added syslog-ng to the blacklist in the /var/log section of inputs.conf:
whitelist = (\.log|log$|messages|secure|auth|mesg$|cron$|acpid$|\.out)
blacklist = (syslog-ng|lastlog|anaconda\.syslog)
index = os
disabled = false
I did this to prevent the Unix add-on from scraping syslogs that were already being indexed by another app. (The indexers write their own internal logs to /var/log/messages.)
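One way to sanity-check a whitelist/blacklist pair like this is to run the two patterns against a few paths outside of Splunk. The sketch below is only an approximation: it assumes Splunk applies each regex unanchored to the full file path and that a blacklist match wins over a whitelist match, and it uses Python's re module rather than Splunk's PCRE engine (these particular patterns should behave the same in both). The sample paths, including the web01 host directory, are made up for illustration.

```python
import re

# Patterns copied verbatim from the inputs.conf stanza above.
WHITELIST = re.compile(r"(\.log|log$|messages|secure|auth|mesg$|cron$|acpid$|\.out)")
BLACKLIST = re.compile(r"(syslog-ng|lastlog|anaconda\.syslog)")


def would_monitor(path):
    """Approximate the filter: blacklist wins, then the whitelist must match.

    Assumption: both regexes are searched (unanchored) against the full path.
    """
    if BLACKLIST.search(path):
        return False
    return WHITELIST.search(path) is not None


if __name__ == "__main__":
    # Hypothetical sample paths, not taken from a real system.
    samples = [
        "/var/log/messages",
        "/var/log/cron",
        "/var/log/lastlog",
        "/var/log/syslog-ng/web01/messages.log",
    ]
    for p in samples:
        print(p, "->", "monitored" if would_monitor(p) else "skipped")
```

If /var/log/messages comes out as monitored here, the filters themselves are probably not what is keeping the file out of the index, and it is worth looking at something outside inputs.conf.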
The problem is that none of my servers are getting /var/log/messages indexed, except for a single Linux server that is not actually part of the Splunk environment. That server runs the same OS in the same environment, but is used as a monitoring and Puppet server. All servers have the same version of inputs.conf. The monitoring/Puppet server is running rsyslogd and writing to /var/log/messages, but, as mentioned, the other servers in the environment also run rsyslogd and write their local logs to /var/log/messages. What am I doing wrong that prevents these other servers from indexing /var/log/messages? At this point, the only file that seems to be indexed from these other servers is /var/log/cron.
Ok, I just realized why it isn't scraping messages: the splunk user doesn't have permission to read /var/log/messages! Which is how it is supposed to be! Now, do I make my server insecure by making messages readable by splunk?
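A quick way to confirm a permissions problem like this on each host is to check readability as the account that runs Splunk. This is a minimal sketch, not part of the add-on: the unreadable helper and the script name check_read.py are hypothetical, and it assumes the forwarder runs as a user named splunk, so you would invoke it with something like "sudo -u splunk python3 check_read.py".

```python
import os


def unreadable(paths):
    """Return the paths the current process cannot read.

    os.access() checks against the real uid/gid of the process, so run this
    as the account Splunk runs under. Missing files are reported too.
    """
    return [p for p in paths if not os.access(p, os.R_OK)]


if __name__ == "__main__":
    # The two files discussed in this thread.
    candidates = ["/var/log/messages", "/var/log/cron"]
    blocked = unreadable(candidates)
    for p in candidates:
        print(p, "->", "NOT readable" if p in blocked else "readable")
```

Any path reported as not readable here will be skipped by a monitor input running under the same account, no matter what the whitelist says.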
To get the logs into Splunk (or any other big data or log management solution), something has to be able to read them, and that means you're eventually going to have to trust some process to read them and transmit them. That doesn't mean Splunk has to be the one doing it; it's not uncommon to see a whole lot of syslog routing being done before the first forwarder.