Getting Data In

Pros/cons of using Syslog-NG (or other syslog file receiver) vs. direct tcp/udp 514 to Splunk?

dglinder
Path Finder

For my installation (that I've inherited from multiple administrators), we have some events coming in through direct TCP/UDP 514 syslog events to Splunk, and others come in through a log file that Syslog-NG creates and updates. Can anyone add additional thoughts/comments to my pros/cons list below?

Benefits of Syslog-NG over Splunk for syslog reception:

Pros

  • Separate process from Splunk
  • Buffers incoming events if/when the Splunk process is not available
  • Syslog-NG config can be updated without restarting Syslog-NG process (SIGHUP - re-read config files)
  • kristian.kolb: The syslog server can structure the incoming logs into a directory structure on the file system, based on who is sending. This makes it easier to set up proper host, source, and sourcetype configuration with ordinary [monitor] stanzas, which in turn simplifies field extraction etc.

Cons

  • One more service/program to update and monitor
  • Possible (?) delay between receipt of syslog message and appearance of message in Splunk.

To me, the biggest benefit is the ability to restart my Splunk processes without losing incoming events. But since restarts are quite rare, I'm concerned that the delay mentioned as the second con might be substantial (double-digit minutes or longer).

Is this delay concern warranted?

Are there additional cons that I overlooked?

1 Solution

kristian_kolb
Ultra Champion

Why would you run into double-digit minute delays? That sounds pretty awful. Normally you'll have a delay that could be measured in single-digit seconds, in my experience.

In any case, I think you have the pros listed correctly. One more addition is that the syslog server can structure the incoming logs into a directory structure on the file system, based on who is sending. This makes it easier to set up proper host, source, and sourcetype configuration with ordinary [monitor] stanzas, which in turn simplifies field extraction etc.
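
As a rough illustration of the [monitor] approach, here is a minimal inputs.conf sketch. It assumes the syslog server writes each sender's events to /var/log/remote/<hostname>/messages.log; that path, the index name, and the sourcetype are placeholders for this example and do not come from this thread.

```ini
# inputs.conf on the forwarder (sketch; path, index and sourcetype are assumptions)
[monitor:///var/log/remote]
# Take the host field from the 4th path segment:
# /var(1)/log(2)/remote(3)/<hostname>(4)/messages.log
host_segment = 4
sourcetype = syslog
index = network
disabled = false
```

With a layout like this, one stanza covers every sender, and a device class that needs a different sourcetype can get its own stanza pointing at its own subdirectory.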

The alternative would be to have everything coming in to Splunk on a single port, and then try to structure it from there, with index-time transformations of sourcetypes etc. Or to set up syslog sending over several ports, e.g. a separate port for each host sending logs.. brrr .
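
The single-port approach usually means a network input plus an index-time sourcetype override, roughly like the sketch below. The stanza name, the regex, and the target sourcetype are illustrative assumptions, not a configuration taken from this thread.

```ini
# inputs.conf: everything arrives on one port with one sourcetype
[udp://514]
sourcetype = syslog

# props.conf: apply the override to events from that input
[source::udp:514]
TRANSFORMS-set_sourcetype = set_sourcetype_cisco_asa

# transforms.conf: rewrite the sourcetype when the event looks like Cisco ASA
[set_sourcetype_cisco_asa]
REGEX = %ASA-\d-\d+
FORMAT = sourcetype::cisco:asa
DEST_KEY = MetaData:Sourcetype
```

Each additional device type needs its own regex and transform, which is where this gets hard to maintain.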

/K


yannK
Splunk Employee

pro:

  1. The syslog-to-file setup acts as a file buffer, which is much better than relying on the TCP queue and memory.
  2. You do not lose any events when Splunk is down or restarting.

dglinder
Path Finder

Agreed, thanks!

vladx
New Member

It is also worth mentioning that if you are using syslog-ng PE or rsyslog as a relay, you can use the built-in disk-plus-memory buffer function to increase reliability without saving raw logs in files.

Narj
Path Finder

We're using syslog-ng and are very happy with it... we already had central syslog receivers before splunk, so moving to syslog-ng has enabled us to split logs out nicely for the syslog light forwarder to pick up and stream to the indexer. This meant we could get splunk in with minimum effort (not having to reconfigure all network devices, update firewalls etc etc).

Splitting your logs out in such a fashion also means that you can use a blacklist/whitelist in Splunk in case the need arises to manage occasional issues with log volume exceeding the licence. There's also the fact that you can still access your logs (albeit more painfully) if Splunk dies for whatever reason.
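
A minimal sketch of that kind of blacklist, assuming a per-host directory layout under /var/log/remote and a hypothetical noisy sender called noisy-switch-01 (both are made up for this example):

```ini
# inputs.conf (sketch; directory layout and host name are assumptions)
[monitor:///var/log/remote]
host_segment = 4
sourcetype = syslog
# Skip files from one noisy sender.
# blacklist is a regex matched against the full file path.
blacklist = /var/log/remote/noisy-switch-01/
```

The files for that host still exist on disk, so nothing is lost; Splunk simply does not index them while the blacklist is in place.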

dglinder
Path Finder

I hear your pain! Though by separating the Splunk duties from the Syslog duties, the main function of one was not impacted during the outage/restart of the other.

Narj
Path Finder

How typical is this... sing syslog-ng's praises and it stops working overnight after log rotation! O_o First problem in years!

dglinder
Path Finder

Regarding "double-digit minute delays", I was purely speculating on a worst-case scenario. I can't imagine that case either, except possibly on a heavily loaded syslog-ng server.

One of the inherited designs is just as you described in the third paragraph (Splunk listening on several ports). The complications it adds are one of the main reasons I brought up this pro/con discussion. Thanks!

czanik
Engager

From the syslog-ng side, we put together the following doc: http://www.balabit.com/support/documentation/pdf/syslog-ng_splunk_deployment_guide_en.pdf

dglinder
Path Finder

Thanks, I'll check it out.

vladx
New Member

Link broken by the acquisition 😞
