Hello All,
We have a Splunk server setup for monitoring our Cisco WSA server using "Cisco Web Security Advanced Reporting" add-on, which is currently the only source sending files to this Splunk server.
The Splunk server has been filled to capacity and the partition where we store its logs is at 100%. So it seems like log rotation was never set up.
I read the info at this link below, but I now have a few questions regarding it.
----> http://docs.splunk.com/Documentation/Splunk/4.1.7/Admin/Howlogfilerotationishandled
Since Splunk does not have a built-in log rotation method, I assume we use the native Linux log rotation method on the server (syslog-ng, I believe?). Is that correct?
# splunk --version
Splunk 6.2.2 (build 255606)
#
# syslog-ng --version
syslog-ng 2.0.9
# cat /etc/*release
SUSE Linux Enterprise Server 11 (x86_64)
VERSION = 11
PATCHLEVEL = 3
I also read that you can either blacklist the compressed files produced by the log rotation, or move the rotated files to a different directory, to prevent duplicate data from being indexed.
blacklist = \.(gz|bz2|z|zip)$
What config file do I add the Blacklist configuration option to?
Also, what should I configure the syslog-ng to for the log rotation, is there a recommended configuration for this?
Thanks in Advance,
Matt
Syslog-ng is a program to read syslog from network and write it to disk. This just grabs what's sent to it on (default) udp/tcp 514 and creates a file on the local system with those contents.
Logrotate is a system utility that, as per its setup, will do the rotation for you.
There is no configuration recommendation for syslog-ng or logrotate, because there's no one-size-fits-all strategy. Your retention of raw logs is generally driven by how far back you may need to go if something broke and you hadn't noticed, and it will be configured mostly with logrotate.
That being said, if you have no other requirements I'd set it to 7 days and have them rotated nightly. Here's what appears to be some nice samples or examples.
My syslog-ng is set up to write files at /var/log/remote/"hostname"/log.txt
My Splunk (A universal forwarder, in this case) is set to only read log.txt in that file monitor stanza, and use the 4th segment as the hostname.
Logrotate is set to rotate log.txt daily, creating log1.txt, then gzip the remaining older ones and delete them when 7 days old. So I have log.txt (monitored by Splunk), log1.txt, log2.txt.gz, log3.txt.gz and so on. So, if I happen to have a broken (or non-started) UF or input, I have 7 days of stuff on disk I can puzzle around to backfill the missing data in Splunk.
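To make the "4th segment" idea concrete, here is a minimal Python sketch of how a host name can be pulled out of a path like the one above. It only illustrates the counting; it is not Splunk's code, and the host name wsa01 and the function name host_from_segment are made up for the example.

```python
def host_from_segment(path: str, segment: int) -> str:
    """Return the Nth '/'-separated segment of a path (1-based)."""
    # Drop empty pieces so the leading '/' does not count as a segment.
    parts = [p for p in path.split("/") if p]
    return parts[segment - 1]


# Hypothetical path: /var/log/remote/<hostname>/log.txt
print(host_from_segment("/var/log/remote/wsa01/log.txt", 4))  # wsa01
```

Segments 1 to 3 are var, log and remote, so the directory named after the sending host is segment 4.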
I can supply configs for most of this later if you need them, but finding samples on the internet that match what you need is usually pretty easy once you know what to look for, and hopefully I just gave you that information!
My 2 cents: if the log rotation is clean (I mean not using the copytruncate option, which may cause duplicates), then it's not a problem to have Splunk monitor both the live files and the rotated files.
Splunk has a mechanism that reads the first lines of a file to detect whether it's a new file or a rotated one.
The advantage is that if the file rotates before Splunk has had time to read the last events, it will be able to continue on the rotated one.
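A rough way to picture that check is the Python sketch below. It assumes the file is identified by a CRC over its first 256 bytes (256 is, as far as I know, the default for initCrcLength in inputs.conf). Splunk's real tracking is more involved than this, so treat it as an illustration only; the file contents and the helper name head_crc are invented for the example.

```python
import os
import tempfile
import zlib


def head_crc(path, length=256):
    """CRC32 of the first `length` bytes of a file (illustrative only)."""
    with open(path, "rb") as f:
        return zlib.crc32(f.read(length))


with tempfile.TemporaryDirectory() as d:
    live = os.path.join(d, "log.txt")
    with open(live, "wb") as f:
        f.write(b"May  2 00:00:01 wsa01 first event\n" * 20)
    before = head_crc(live)

    # Simulate a clean rotation: rename the old file, start a fresh one.
    rotated = os.path.join(d, "log.txt.1")
    os.rename(live, rotated)
    with open(live, "wb") as f:
        f.write(b"May  3 00:00:01 wsa01 new day\n" * 20)

    # The rename does not change the content, so the head CRC is the same.
    same_after_rename = head_crc(rotated) == before
    # The fresh log.txt starts with different bytes, so it looks new.
    new_file_differs = head_crc(live) != before

print(same_after_rename, new_file_differs)  # True True
```

Because the identity comes from the content rather than the name, the suffix the rotated file gets (a number or a date) should not matter for this check.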
Hey Rich, thanks for the reply!
Ok cool, not sure why I was saying syslog-ng for the rotation... Sorry, it's been a while.
But thanks for the link to the examples and your explanation. It is very much appreciated. I think you gave me enough to go on to get this configured right, so thanks again!
-Matt
One more question...
I read in the Splunk docs about how it handles log rotation, and it says it recognizes when a log was rotated, like /var/log/messages becoming .../messages1, and it will not read the rolled file a second time. What if I use the logrotate option that appends a date instead of just a number (dateext)? Does Splunk recognize that as well?
I'm assuming since it uses a CRC check to ID the files that it won't do that, but just wanted to be sure.
Thanks Again,
Matt
Yes, more or less correct. There's a handful of settings that control it (mainly regarding how much of the beginning of the file to use to determine if it's new or not), but you usually don't have to change those. (For reference, that's in inputs.conf, and probably the most used setting is initCrcLength = <integer>, but again, you shouldn't need to fiddle with that.)
In my case, I ONLY have splunk looking for log.txt in those folders (because that's what syslog-ng's writing), so any other file (like log1.txt or log.2016-05-02.txt) won't be read anyway.
For instance,
[monitor:///var/log/remote/10.128.0.*/log.txt]
host_segment = 4
sourcetype = syslog
index = network
That folder (well, ONE of the couple that match that wildcard) has log.txt, log.txt.1, log.txt.2.gz, log.txt.3.gz and so on back to 8. They're all ignored except log.txt.
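If you did monitor the whole folder with a wildcard instead, the blacklist line from the question would go in that monitor stanza in inputs.conf. Here is a small Python sketch to see which of the file names above that pattern would exclude. Python's re module stands in for Splunk's regex engine here, which is an assumption that should hold for a pattern this simple.

```python
import re

# The blacklist pattern quoted in the question.
BLACKLIST = re.compile(r"\.(gz|bz2|z|zip)$")

names = ["log.txt", "log.txt.1", "log.txt.2.gz", "log.txt.3.gz"]

skipped = [n for n in names if BLACKLIST.search(n)]
kept = [n for n in names if not BLACKLIST.search(n)]

print(skipped)  # ['log.txt.2.gz', 'log.txt.3.gz']
print(kept)     # ['log.txt', 'log.txt.1']
```

Note that the uncompressed log.txt.1 is not caught by the pattern, so in that setup you would be relying on the CRC check to keep it from being indexed twice.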
There are tweaks I'm sure I could make, but it works fine. My main config for logrotate is just
/var/log/remote/*/*.txt
{
    rotate 8
    maxage 30
    daily
    missingok
    compress
    delaycompress
    postrotate
        invoke-rc.d syslog-ng reload > /dev/null
    endscript
}
So, does that explain how those two pieces fit together? (And no, I have no idea why I did a 8, not 7. Or even 3. Just picked one. 🙂 )
Then on the syslog-ng side, there's a simple config to just write everything into folder/files when it comes in from the network.
source s_network_udp { udp( port(514)); };
source s_network_tcp { syslog( port(514) transport("tcp")); };
destination d_syslogs { file ("/var/log/remote/${HOST}/log.txt"); };
log {source(s_network_udp); source(s_network_tcp); destination(d_syslogs); };
Take a gander at that, it's fairly readable.
Does that help?
Yea, that's great, thanks for the explanations! That all makes sense...
And many thanks for the config examples, much appreciated!
-Matt