Why SC4S over a generic “syslog servers tier”
1. It is Splunk’s best practice today
The Splunk Validated Architectures (SVA) document calls SC4S the current best practice for syslog. It says that a UF or HF tailing files on a generic syslog server is widely used but no longer recommended, in favour of SC4S. It also says direct TCP or UDP inputs to forwarders or indexers are not recommended for production.
2. Better distribution and performance
HEC streaming directly to indexers produces more even data distribution than AutoLB on forwarders. Even distribution directly improves search performance and storage balancing.
3. Higher data quality from day one
SC4S assigns correct sourcetypes and metadata for each technology so your Splunk add-ons and CIM-based content work consistently. The SVA also notes that SC4S reduces or removes the need to manage add-ons on indexers.
4. Reliability features you can evidence
Disk buffers protect against short downstream outages. The guidance gives a conservative sizing figure of about 60,000 events per second per server with disk buffering enabled, which is plenty for most estates. Without disk buffering the ceiling is much higher, but buffering is strongly recommended to avoid loss.
5. Operational simplicity and supportability
SC4S is containerised, uses environment variables rather than artisanal config, has a published vendor catalogue, and is Splunk supported with active releases. It is a repeatable pattern your team can run, not a consultancy craft project.
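As a rough sanity check on the buffering point in item 4, the arithmetic can be sketched like this. Only the 60,000 events per second figure comes from the text above; the 500-byte average event size and the 10-minute outage are assumptions I picked for illustration, and the function names are hypothetical, so substitute your own measurements.

```python
def disk_buffer_bytes(events_per_second: int, avg_event_bytes: int, outage_seconds: int) -> int:
    """Bytes of disk buffer needed to absorb an outage at a sustained event rate."""
    return events_per_second * avg_event_bytes * outage_seconds


def sustained_write_bytes_per_second(events_per_second: int, avg_event_bytes: int) -> int:
    """Write throughput the buffer disk must sustain while the destination is down."""
    return events_per_second * avg_event_bytes


eps = 60_000            # sizing figure quoted in item 4
avg_event_bytes = 500   # ASSUMPTION: measure this on your own estate
outage_seconds = 10 * 60  # ASSUMPTION: a 10-minute downstream outage

buffer_gb = disk_buffer_bytes(eps, avg_event_bytes, outage_seconds) / 1e9
write_mb_s = sustained_write_bytes_per_second(eps, avg_event_bytes) / 1e6

print(f"buffer needed: {buffer_gb} GB")        # 18.0 GB under these assumptions
print(f"disk write rate: {write_mb_s} MB/s")   # 30.0 MB/s under these assumptions
```

This ignores any on-disk overhead of the buffer format itself, so treat the result as a lower bound rather than a sizing recommendation.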
--------------------------------
What a generic syslog tier really means, and its drawbacks
The old pattern is syslog servers writing to disk with a UF or HF tailing files into Splunk. Splunk documents this as still supported but explicitly no longer the best practice because it adds complexity, creates distribution challenges, increases change risk during agent restarts, and often leads to catch-all sourcetypes that harm analytics.
Sending raw syslog directly to forwarders or indexers is strongly discouraged for production due to loss during restarts and poor resilience. This is not my opinion; it is Splunk’s own.
Would you agree to the above?
Kind regards,
Dan
Ad.1 Well, there are several syslog architectures in the SVAs. I don't know why, but Splunk completely ignores a non-SC4S syslog receiver sending to HEC. The "old" version with intermediate files has drawbacks (the need to rotate the files manually, fewer possibilities to enrich data) but it also has an advantage: built-in buffering.
Ad.2 No. A modern forwarder with async-lb gives you a very good distribution of events, and HEC on its own doesn't guarantee anything, especially if you go through a LB: it all depends on how your LB works, what batch size you use and so on.
Ad.3 Not necessarily. SC4S can't help you with types of data it doesn't know, so you still have to do the legwork (and I'm not a big fan of syslog-ng, but that's my personal taste). You can offload timestamp recognition in some cases, and typically the syslog stream is already broken into single events, but that's it. Rewriting index-time operations into SC4S doesn't make much sense, so you still need add-ons on your HEC receivers when they are needed.
Ad.4 To be fully honest I don't quite get this point. Anyway, with disk-based buffering you're always limited by disk speed. Especially if you want to do it right (ensure data consistency and reliability). And in the end... you can never be sure. That's complicated.
Ad.5 Containerization here means only that Splunk doesn't have to build different packages for different systems. To be honest, I'm not a huge fan of distributing software this way. It limits debugging possibilities and forces you into some possibly questionable choices about the environment the software is running in. And its repeatability is limited to the situations which match the predefined types of data (and often with predefined methods of generating this data).
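To illustrate the Ad.2 point about batch size, here is a deliberately naive toy model. It assumes a load balancer that hands each whole batch to the next indexer in strict round-robin order, which is my simplification and not a description of how any real LB, HEC client or Splunk forwarder behaves. The function name `distribute` is hypothetical.

```python
from typing import List


def distribute(num_events: int, batch_size: int, num_indexers: int) -> List[int]:
    """Count events per indexer when whole batches are sent round-robin."""
    counts = [0] * num_indexers
    target = 0
    sent = 0
    while sent < num_events:
        # The last batch may be smaller than batch_size.
        batch = min(batch_size, num_events - sent)
        counts[target] += batch
        sent += batch
        target = (target + 1) % num_indexers
    return counts


# Same 1000 events, same 4 indexers, only the batch size changes.
print(distribute(1000, 10, 4))    # [250, 250, 250, 250]
print(distribute(1000, 300, 4))   # [300, 300, 300, 100]
print(distribute(1000, 1000, 4))  # [1000, 0, 0, 0]
```

Even in this idealised model the evenness comes from the batch size and the balancing policy, not from the transport being HEC.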
I'm not quite sure what your post was supposed to be about? SC4S is one of the possible solutions for receiving syslog. Some people like it, some don't. It has its pros and cons.
It's "just" a prepackaged syslog-ng with extra steps.
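For reference, the "non-SC4S syslog receiver sending to HEC" idea from Ad.1 boils down to something like the sketch below: strip the syslog priority header and wrap the rest in a HEC event. The keys `host`, `sourcetype`, `index`, `event` and `fields` are the standard HEC JSON event keys; the function name, the sourcetype `my:syslog` and the index `netops` are made up for the example, and nothing is actually sent anywhere.

```python
import json
import re

# <PRI> header at the start of a syslog line: PRI = facility * 8 + severity.
PRI_RE = re.compile(r"^<(\d{1,3})>(.*)$", re.DOTALL)


def syslog_to_hec_event(line: str, host: str, sourcetype: str, index: str) -> dict:
    """Build a HEC JSON event body from one raw syslog line (hypothetical helper)."""
    match = PRI_RE.match(line)
    if match is None:
        raise ValueError("no syslog PRI header found")
    pri = int(match.group(1))
    return {
        "host": host,
        "sourcetype": sourcetype,
        "index": index,
        "event": match.group(2),
        # Indexed fields are sent as strings here to keep the payload flat.
        "fields": {"facility": str(pri // 8), "severity": str(pri % 8)},
    }


# Example line taken from RFC 3164; sourcetype and index are placeholders.
raw = "<34>Oct 11 22:14:15 mymachine su: 'su root' failed for lonvick on /dev/pts/8"
payload = syslog_to_hec_event(raw, host="mymachine", sourcetype="my:syslog", index="netops")
print(json.dumps(payload))
```

Everything SC4S adds on top (per-vendor parsing, buffering, retries) is deliberately left out, which is exactly the legwork mentioned in Ad.3.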