
Struggling with setting up SC4S in an air-gapped environment – looking for the best deployment approach

kn450

Hi,

I’m trying to use Splunk as a log aggregation solution, and eventually as a SIEM. I have three industrial plants that are completely air-gapped, with no permanent internet access.
The idea is to deploy a syslog server at each plant to collect logs locally and then forward them to a central Splunk installation.

Any component or software will be:

  • downloaded or installed during a temporary internet connection (via a cellular modem),

  • then moved into a fully air-gapped production environment.

I’ve reviewed SC4S (Splunk Connect for Syslog) through the official Splunk documentation and several videos.
In theory, it looks like a very powerful and well-designed solution. However, in practice, I find the documentation quite difficult to follow, especially when considering:

  • an air-gapped environment,

  • no internet connectivity,

  • and very large log volumes (high EPS / high throughput).


My main questions are:

  1. From a practical and low-complexity perspective:

    • Is it better to use SC4S, or

    • to simply deploy syslog-ng or rsyslog (open-source solutions) on Linux servers at each plant and forward logs to the central Splunk instance?

  2. Best deployment model for an air-gapped industrial environment:

    • SC4S standalone at each site?

    • Simple syslog collectors with forwarding?

    • Which option is more stable and easier to operate long term?

  3. Given a preference for open-source solutions:

    • Is relying on syslog-ng / rsyslog considered a professionally acceptable approach with Splunk?

    • Or has SC4S effectively become the best practice that should not be avoided?

  4. From an operating system perspective:

    • Which is better suited for handling very large volumes of log data?

      • Ubuntu Server

      • CentOS / Rocky Linux / AlmaLinux

    • Which is more stable and easier to maintain in a 24/7 production environment?

  5. The end goal:

    • A stable solution

    • Simple to operate

    • Capable of handling very large data volumes

    • Suitable for air-gapped industrial environments

    • Without introducing excessive operational complexity

I’m trying to find a balance between simplicity and best practices.
I want to use Splunk correctly, but at the same time avoid introducing operational complexity that exceeds the team’s current capabilities.

Any advice or real-world experience would be greatly appreciated.


PickleRick

SC4S _is_ open source. It's just syslog-ng with extra steps.

The question about syslog-ng/SC4S/rsyslog usually boils down to previous experience and personal preferences (I'm a big fan of rsyslog but that's just me).

I'm not 100% sure how SC4S handles writing to files (by default it's meant to push events to HEC input(s)).

Anyway, I think your main challenge will not be the particular solution but the overall process of moving the data, since you have air-gapped environments and a big volume of data. An air gap means that you will have to save the data to files at one site, move the files on removable media to another site, and ingest them into Splunk from there. That will add significant latency to your events.

Writing the events to files with a predefined naming scheme should be relatively easy in both syslog-ng and rsyslog (again, I'm not sure whether SC4S can do it easily). Unless you want to modify the events before writing them, this part will be fairly straightforward. You might want to think through your overall process and automate as much as possible (mounting and unmounting the removable storage, copying the files, and so on). And you will have to deal with preventing duplicates.
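To make the duplicate problem a bit more concrete, here is a minimal sketch of one possible approach at the receiving site: keep a manifest of SHA-256 checksums of files already imported and skip anything seen before. The function names, the manifest file and the directory layout are all hypothetical, not anything Splunk or SC4S provides. It also only catches whole-file duplicates, so it assumes you export only closed (rotated) files, never one that is still being written.

```python
import hashlib
import shutil
from pathlib import Path
from typing import List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def import_new_files(media_dir: Path, ingest_dir: Path, manifest: Path) -> List[str]:
    """Copy files not yet listed in the manifest; return the names copied.

    media_dir, ingest_dir and manifest are hypothetical paths: the mounted
    removable media, a directory Splunk monitors, and a plain text file
    holding one checksum per line.
    """
    seen = set(manifest.read_text().split()) if manifest.exists() else set()
    ingest_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(p for p in media_dir.rglob("*") if p.is_file()):
        checksum = sha256_of(src)
        if checksum in seen:
            continue  # already imported on an earlier trip
        # Prefix with part of the checksum so that identical file names
        # coming from different plants cannot overwrite each other.
        dest = ingest_dir / f"{checksum[:12]}_{src.name}"
        shutil.copy2(src, dest)
        with manifest.open("a") as out:
            out.write(checksum + "\n")
        seen.add(checksum)
        copied.append(src.name)
    return copied
```

Running it twice against the same media copies nothing the second time, which is the property you want if the same disk gets plugged in again by mistake.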

And don't forget about retention policy and file rotation at the source site(s).
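As a rough illustration of the retention part, the sketch below deletes rotated log files older than a given number of days. The function name, the `*.log` pattern and the age threshold are assumptions made for the example. In practice logrotate or the syslog daemon's own rotation settings would usually do this job, and you would only prune files that have already been exported.

```python
import time
from pathlib import Path
from typing import List, Optional


def prune_old_logs(log_dir: Path, max_age_days: float,
                   now: Optional[float] = None) -> List[Path]:
    """Delete *.log files in log_dir older than max_age_days; return them.

    Age is judged by modification time. `now` can be passed in (seconds
    since the epoch) to make the function easy to test.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    removed = []
    for path in sorted(log_dir.glob("*.log")):
        if path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed
```

For example, `prune_old_logs(Path("/var/log/remote"), 30)` would remove `.log` files untouched for more than 30 days; the path here is only a placeholder.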
