We are running Splunk 6.6.3 and have universal forwarders on our syslog servers. We are finding that data falls behind for some of the hosts the syslog server collects files for.
Some of the files get very large throughout the day (the file for each host sending to the syslog server cycles into a new file daily). At least 3 of the files grow to the point where Splunk switches to reading them in batch mode. These files are mostly from our InfoBlox servers or our Panorama for our firewalls.
The syslog servers are not being overtaxed, so I should be able to adjust some settings higher to allow for better throughput, but I'm not sure what the best changes would be.
Thanks.
Some options to improve the situation:
Changing the syslog config to write hourly files (including a timestamp in the filename) was something I was going to suggest as well. That keeps the overall file size smaller.
For files created daily (one per day), especially large ones, changing syslog/rsyslog to write them hourly or every couple of hours would reduce the size of the files and let Splunk process them faster.
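A minimal rsyslog sketch of hourly, per-host files, assuming legacy rsyslog template syntax (the /var/log/hosts path is hypothetical; adjust to your layout):

    # /etc/rsyslog.d/hourly-per-host.conf (hypothetical file)
    # One file per sending host, rolled into a new file each hour.
    $template HourlyPerHost,"/var/log/hosts/%HOSTNAME%/%$YEAR%-%$MONTH%-%$DAY%-%$HOUR%.log"
    *.* -?HourlyPerHost

With hourly filenames, the UF's monitor stanza can use a wildcard over the directory, and each file should stay small enough to avoid being handed to the batch reader.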
I also assume you have the recommended ulimit setting for the number of open files on that server.
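For reference, a sketch of raising the open-file limit for the user the forwarder runs as, assuming a user named splunk (64000 is the figure commonly recommended for Splunk; confirm against the docs for your version):

    # /etc/security/limits.conf
    splunk soft nofile 64000
    splunk hard nofile 64000

    # verify from the forwarder user's shell
    ulimit -n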
Yes, we have changed the ulimit number of open files.
I will see if the team is willing to change the setting to roll hourly.
As @lakshman239 said, you should be using maxKBps=0 on your UFs. The other common thing that happens is poor housekeeping of the files on the UF. If you have hundreds of files in the monitored directory you are fine, but if you have thousands, EVEN IF YOU HAVE ALREADY FORWARDED THEM OR THEY DO NOT MATCH YOUR STANZA, Splunk will get slower and slower. Make sure that you delete or move files so that you have fewer than 1,000 in that directory.
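For the throughput cap, a minimal sketch of the change on the UF (the UF default is 256 KBps; 0 removes the cap):

    # $SPLUNK_HOME/etc/system/local/limits.conf on the UF
    [thruput]
    # 0 = unlimited; the universal forwarder default is 256
    maxKBps = 0

Restart the forwarder after making the change so it takes effect.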
I will try changing maxKBps to 0 so it is unlimited. I'm not sure whether the servers can handle it, though.
We only have a couple of hundred files in the directory that the forwarder is monitoring, but some get very large.
There are no old files in the directory as they are cycled off to a different directory nightly, so new files for each device get created when new data comes in after midnight.
Are you rotating the logs hourly to keep them from getting too big? You should be.
One option would be to look at limits.conf and increase maxKBps on your UF, so more chunks of data can go from the UF to the indexers, which helps reduce delays.
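To confirm the value the forwarder is actually using after the change, you can query the effective setting with btool:

    $SPLUNK_HOME/bin/splunk btool limits list thruput --debug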
Would making that change in limits.conf prevent the files from going into batch mode? Or would it allow reading more from the files in batch mode at a time?
That change will allow your UF to forward to the indexers at a faster rate, speeding up the entire pipeline. It doesn't necessarily change the way big files are handled.
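On the batch-mode question specifically: as I understand it, the monitor input hands a file to the batch reader once it exceeds a size threshold, which is itself a limits.conf setting (the default below is from memory, so verify with btool on your 6.6.3 forwarders):

    # limits.conf on the UF (sketch; verify the default on your version)
    [inputproc]
    # files larger than this are read by the batch reader instead of the tailing processor
    min_batch_size_bytes = 20971520

So rotating hourly, which keeps each file under that threshold, is what actually keeps files out of batch mode; maxKBps=0 just drains the pipeline faster.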