Would anyone have any recommendations for forwarding events from physically isolated networks to a main network and thus providing a "single pane of glass"?
The networks must be physically isolated due to security requirements. Data diode connections are approved. We cannot implement firewalls, which rules out TCP connections (and therefore the typical Splunk forwarding protocols).
Here is a reference diagram of the logical architecture. We are looking at using data diodes, which require UDP connections and so limit our options. We know how to get the traffic across the data diodes from a network configuration standpoint; the big question is what data to send and how to structure it so that the upstream Splunk indexer can process it correctly.
Our biggest issue is that UDP ingest on the upstream Splunk server sees only the Heavy Forwarders as the source rather than the original endpoints. We've searched other forum posts with no luck on how to adapt them, if they are even adaptable to this scenario, given that we are ingesting Windows, Linux, and syslog data from each isolated network. Example: https://community.splunk.com/t5/Getting-Data-In/Keeping-Host-data-when-using-Heavy-Forwarder/m-p/232...
What happens between the Heavy Forwarders and the upstream Splunk Indexer is where we need help. Any and all creative ideas are welcome! For example:
We are currently running Splunk Enterprise v8.2.3, on-premises.
Thank you in advance!
First, the most important question: can you afford to lose an unknown number of events from those zones when sending them to your central Splunk service? Because data diodes use UDP, you will lose some events from time to time without knowing it!
If that is not acceptable, then forget the central service idea, collect the data locally, and set up local Splunk instances in those zones.
If you can tolerate losing some events, then I would probably use props + transforms on the HFs to modify the data streams, adding custom metadata fields to the feed that carry the original information (host, source, sourcetype, etc.). Then I would configure the HFs to send events over syslog/UDP to the central HF, which takes those custom metadata fields and moves the information into the correct fields, such as host, source, and sourcetype, before forwarding the events to the indexers.
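To make the idea concrete, here is a minimal Python sketch of the wrap/unwrap logic only. It is not Splunk configuration; in a real deployment the same thing would be done with props + transforms rules on the HFs. The `@@...@@` header layout and the function names `wrap_event` and `unwrap_event` are made up for illustration.

```python
import re

# Hypothetical header layout (an assumption, not a Splunk format):
#   @@host=<host>|source=<source>|sourcetype=<sourcetype>@@ <raw event>
HEADER_RE = re.compile(
    r"^@@host=(?P<host>[^|@]*)\|source=(?P<source>[^|@]*)"
    r"\|sourcetype=(?P<sourcetype>[^|@]*)@@ (?P<raw>.*)$",
    re.DOTALL,  # let the raw event span multiple lines
)


def wrap_event(host, source, sourcetype, raw):
    """Isolated side: prepend the original metadata to the raw event."""
    for value in (host, source, sourcetype):
        # The delimiter characters must not appear inside the metadata values.
        if "|" in value or "@" in value:
            raise ValueError(f"metadata value contains a delimiter: {value!r}")
    return f"@@host={host}|source={source}|sourcetype={sourcetype}@@ {raw}"


def unwrap_event(line):
    """Central side: split the header off and return metadata plus raw event.

    Returns None when the line does not carry the expected header, so the
    caller can decide how to handle unwrapped traffic.
    """
    match = HEADER_RE.match(line)
    return match.groupdict() if match else None
```

For example, `unwrap_event(wrap_event("web01", "/var/log/messages", "syslog", "some raw line"))` gives back a dict with the keys `host`, `source`, `sourcetype` and `raw`, which is the information the central HF would need in order to rewrite the event metadata.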
You noticed some of the caveats of the diode setup.
Since you're limited to UDP transport, you can't use any connection-oriented protocol, so effectively you're limited to simple syslog.
Your limitations with this are of course that:
1) You're prone to data loss
2) The size of messages that you can pass is limited to the size of a single UDP datagram.
There is also a pretty universal limitation of Splunk (which is not a log-management solution as such - it's just a data processing platform that people typically use for log management ;-)) that you don't really get much connection metadata with the events themselves (that's why - for example - sc4s inserts additional indexed fields to events).
So for syslog data it's probably easiest to set up some kind of syslog-based forwarder (you can probably do that with either rsyslog or syslog-ng) on the "inside" of your protected network. It would add some form of metadata (at least the source IP) to the original message and forward it "outside", where another syslog daemon would collect it, "unpack" this metadata, and send the event to - for example - the HEC input on an HF/indexer.
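A rough Python sketch of that data flow, just to show the shape of it. In practice the wrapping and unpacking would more likely be done with rsyslog or syslog-ng templates. The JSON envelope with `src` and `msg` keys and the function names are assumptions made for this example. The `host`, `sourcetype` and `event` keys in the result are the ones the HEC event endpoint accepts in its JSON body; actually POSTing the payload (with the HEC token) is left out.

```python
import json


def relay_envelope(source_ip, raw_message):
    """Inside relay: wrap the raw syslog line together with the sender's IP.

    Compact separators keep the added overhead small, since everything
    has to fit into one UDP datagram.
    """
    envelope = {"src": source_ip, "msg": raw_message}
    return json.dumps(envelope, separators=(",", ":")).encode("utf-8")


def to_hec_event(datagram, sourcetype="syslog"):
    """Outside collector: unpack the envelope into an HEC-style event payload.

    The original sender's IP becomes the event's host, so the upstream
    Splunk instance no longer sees only the relay as the source.
    """
    envelope = json.loads(datagram.decode("utf-8"))
    return {
        "host": envelope["src"],
        "sourcetype": sourcetype,
        "event": envelope["msg"],
    }
```

The returned dict would then be serialized with `json.dumps` and sent to the HEC endpoint by whatever HTTP client the outside collector uses.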
Again - you're limited by the size of the single UDP datagram so remember that your event data can get truncated in such processing if it's close to the limit already, before adding this additional metadata payload.
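As a back-of-the-envelope check, something like the following Python sketch could be run on the sending side to see whether a wrapped event still fits. The 65507-byte figure is the theoretical maximum UDP payload over IPv4; the limit your diode or syslog daemon actually enforces may well be lower, which is why it is a parameter here. The function name `overflow_bytes` is made up for this example.

```python
# Theoretical maximum UDP payload over IPv4:
# 65535 (max IP packet) - 20 (IPv4 header) - 8 (UDP header)
MAX_UDP_PAYLOAD_IPV4 = 65535 - 20 - 8


def overflow_bytes(metadata, event, limit=MAX_UDP_PAYLOAD_IPV4):
    """Return how many bytes would be cut off if metadata + event exceed limit.

    A result of 0 means the wrapped message fits into a single datagram.
    """
    return max(0, len(metadata) + len(event) - limit)
```

An event that exactly fills a datagram on its own will overflow by the full size of the metadata header once it is wrapped, so it is worth logging or counting such cases on the "inside" rather than discovering truncated events later.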
Windows is more tricky because:
1) Most Windows-related Splunk solutions rely on the TA for Windows, which produces events in one of two predefined formats
2) The Windows events can be quite big (especially if rendered as XML)
You can of course use some form of Windows event log -> syslog exporting solution (like SolarWinds Kiwi Syslog Server), but then you'd get your data in a custom format which you'd have to parse yourself. And the typical caveat about event size still applies.
Alternatively, if your diode provides some form of flat-file transport, you could export the Windows events rendered as XML to a file, move that file "outside", and ingest the XML events from the file; this way you should be able to overcome the size limitation. On the "inside" you could use the Windows Event Forwarding (WEF) mechanism to gather event logs from multiple hosts onto a single EventLog collector, from which you'd export them to said file. (WEF is much easier to use with an AD domain; it can be used in a domainless environment as well, but it is annoyingly tricky to configure in that setup.)
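For the file-based route, here is a minimal Python sketch of what reading such an export on the "outside" could look like. It assumes the exported file wraps the events in a single root element and that each event uses the standard Windows event schema namespace with a `Computer` element under `System`. The function name `summarize_events` is made up, and a real ingest would more likely be a Splunk file monitor input with suitable parsing than a script.

```python
import xml.etree.ElementTree as ET

# Namespace used by Windows events rendered as XML.
EVENT_NS = "http://schemas.microsoft.com/win/2004/08/events/event"
NS = {"e": EVENT_NS}


def summarize_events(xml_text):
    """Return the originating computer and event ID for each exported event."""
    root = ET.fromstring(xml_text)
    results = []
    # iter() also matches the root itself, so a file holding one bare
    # <Event> element works as well as a wrapped list of events.
    for event in root.iter(f"{{{EVENT_NS}}}Event"):
        system = event.find("e:System", NS)
        if system is None:
            continue  # skip malformed events instead of failing the whole file
        results.append({
            "host": system.findtext("e:Computer", default="", namespaces=NS),
            "event_id": system.findtext("e:EventID", default="", namespaces=NS),
        })
    return results
```

The point of pulling out `Computer` is that it names the machine that generated the event, so the original host can be recovered even though everything arrives through a single collector.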