Getting Data In

Issue with syslog data getting behind when read from our syslog server with a universal forwarder

jeffbat
Path Finder

We are running Splunk 6.6.3 and have universal forwarders on our syslog servers. We are finding that data falls behind for some of the hosts the syslog server collects files for.

Some of the files get very large throughout the day (the file for each host sending to the syslog server cycles into a new file daily). At least three of the files reach a point where Splunk enqueues them for batch mode reading. These files are mostly from our InfoBlox servers or our Panorama for our firewalls.

The syslog servers are not being overtaxed, so I should be able to raise some settings to allow better throughput, but I'm not sure which changes would be best.

Thanks.

1 Solution

maraman_splunk
Splunk Employee

Some options to improve the situation (a rough config sketch follows this list):

  • make the log file names contain the date and hour in the syslog config, and pair that with crcSalt (the two work together)
  • use max_days_ago so older files are not scanned (don't keep them forever in the directory the UF scans)
  • make sure you are not limiting bandwidth: set maxKBps=0
  • increase the number of pipelines if you have the resources for it
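
A rough sketch of what those settings look like on the universal forwarder (the monitor path is only an example, and the "don't scan older files" item is shown with inputs.conf's ignoreOlderThan setting, which skips files that have not been modified recently):

    # inputs.conf -- monitor stanza for the syslog directory (path is an example)
    [monitor:///var/log/remote/*/*.log]
    crcSalt = <SOURCE>        # include the file path in the CRC so each newly dated file is read as a new file
    ignoreOlderThan = 2d      # skip files not modified in the last two days

    # limits.conf -- remove the forwarder's default 256 KBps throughput cap
    [thruput]
    maxKBps = 0

    # server.conf -- add a second ingestion pipeline only if the host has spare CPU and memory
    [general]
    parallelIngestionPipelines = 2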


jeffbat
Path Finder
  • Are you saying to change the way the log files are created so that the filename contains the date/hour? They only get created daily, so that would not really change anything.
  • max_days_ago would not do anything, as only the current day's files for each device are in the directory (all files are moved out nightly and new files are created when new data comes in for a device after midnight).
  • Will be trying maxKBps=0.
  • The multiple pipelines option was a thought I had. I know each pipeline gets the same bandwidth limit (so if set to 256, two pipelines would use 512), but I wasn't sure: would each pipeline work through its own rotation of the indexers it sends to, or would both pipelines send to the same indexer (i.e. if there are 4 indexers, would both send to indexer 1 until it switches to indexer 2, etc.)?

FrankVl
Ultra Champion

Changing the syslog config to write hourly files (incl. a timestamp in the filename) was something I was going to suggest as well. That keeps the overall file size smaller.
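
To illustrate the idea, a minimal rsyslog sketch (legacy template syntax; the path and template name are made up and would need to match your setup):

    # rsyslog.conf -- write one file per sending host per hour
    $template HourlyPerHost,"/var/log/remote/%HOSTNAME%/%$year%-%$month%-%$day%-%$hour%.log"
    *.* ?HourlyPerHost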


lakshman239
Influencer

For files created daily (one per day), especially large ones, changing syslog/rsyslog to write them hourly or every couple of hours could help reduce the size of each file and let them be processed faster.

I also assume you have the recommended ulimit setting for the number of open files on that server.
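
For reference, a quick way to check and persistently raise the open-file limit for the account running the forwarder (the "splunk" user name is an example, and 64000 follows Splunk's general recommendation):

    # check the current soft limit in a shell running as the forwarder user
    ulimit -n

    # /etc/security/limits.conf
    splunk  soft  nofile  64000
    splunk  hard  nofile  64000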


jeffbat
Path Finder

Yes, we have changed the ulimit for the number of open files.

I will see if the team is willing to change the setting to roll hourly.


woodcock
Esteemed Legend

As @lakshman239 said, you should be using maxKBps=0 on your UFs. The other common thing that happens is poor housekeeping of the files on the UF. If you have hundreds of files you are fine, but if you have thousands of files in your directory, EVEN IF YOU HAVE ALREADY FORWARDED THEM OR THEY DO NOT MATCH YOUR STANZA, Splunk will get slower and slower and slower. Make sure that you delete or move files so that you stay under 1,000 in that directory.
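
A minimal housekeeping sketch, assuming the rotated files can simply be moved to an archive directory outside the monitored path (both paths are examples):

    # nightly cron job: move files not modified in the last 24 hours out of the monitored directory
    find /var/log/remote -type f -mtime +0 -exec mv {} /var/log/archive/ \;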

jeffbat
Path Finder

I will try changing the maxKBps to 0 so it is unlimited. Not sure if the servers can handle it though.

We only have a couple of hundred files in the directory that it is picking up on, but some get very large.

There are no old files in the directory, as they are cycled off to a different directory nightly, so new files for each device get created when new data comes in after midnight.


woodcock
Esteemed Legend

Are you rotating the logs hourly to keep them from getting too big? You should be.


lakshman239
Influencer

One option would be to look at limits.conf and increase maxKBps on your UF, so more data can move from the UF to the indexers and help reduce delays.
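
For example, in limits.conf on the UF (the value shown is only illustrative; 0 removes the cap entirely):

    [thruput]
    maxKBps = 1024    # the universal forwarder default is 256; 0 means unlimited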


jeffbat
Path Finder

Would making that change in limits.conf prevent the files from going into batch mode? Or would it allow more to be read from the files in batch mode at a time?


FrankVl
Ultra Champion

That change will allow your UF to forward to the indexers at a faster rate, hence speeding up the entire pipeline. It doesn't necessarily improve the way big files are handled.
