Which is preferable, having a heavy forwarder/deployment server in the DMZ or opening the Splunk ports from the existing DMZ servers (30 servers) to communicate with Splunk in production? If we build a new HF/Deployment Server in the DMZ, we will not have to worry about opening the Splunk ports from the other DMZ servers BUT we would want to have RDP access (opening the RDP port) to this new deployment server from our desktops. If we choose not to have a deployment server in the DMZ, there is no need for us to open additional RDP ports. Is there a recommendation/best practice to follow for this?
Having a heavy forwarder with deployment services running in the DMZ means you have only a single interface between your DMZ and your internal network, which makes the firewall rules much simpler to manage and to monitor. The risk is that you need to allow an inbound connection from the DMZ to your internal network so that the heavy forwarder can send data to an indexer. RDP is not normally considered an issue, as it is outbound only (from your network to the DMZ) and so should not allow any traffic to enter from the DMZ. Just make sure the firewall rules are clearly defined as uni-directional.
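As a rough sketch of what this single-interface setup could look like in the Splunk .conf files: the host names (dmz-hf.example.com, idx1.internal.example.com) and group names are placeholders that do not come from this thread, and the ports assume the commonly used 9997 for forwarding and 8089 for management. Adjust all of them to your environment.

```ini
# --- On each existing DMZ server (universal forwarder) ---

# deploymentclient.conf: phone home to the deployment server on the DMZ HF
[target-broker:deploymentServer]
targetUri = dmz-hf.example.com:8089

# outputs.conf: send data to the DMZ heavy forwarder, not to production
[tcpout]
defaultGroup = dmz_hf

[tcpout:dmz_hf]
server = dmz-hf.example.com:9997

# --- On the DMZ heavy forwarder ---

# inputs.conf: listen for traffic from the DMZ forwarders
[splunktcp://9997]
disabled = 0

# outputs.conf: forward everything on to the internal indexer
[tcpout]
defaultGroup = internal_indexers

[tcpout:internal_indexers]
server = idx1.internal.example.com:9997
```

With a layout like this, the only Splunk-related rule crossing into the internal network would be from the heavy forwarder to the internal indexer on the forwarding port. Traffic between the DMZ servers and the heavy forwarder stays inside the DMZ.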
One other option to consider is to also use this heavy forwarder as an indexer to store your data. You could then add an outbound rule from your internal Splunk instance (acting as the search head) to your DMZ indexer. This would allow you to search the DMZ data from inside your network without needing to open any inbound connections. This option is only realistic if the data being stored in the DMZ is not sensitive.
What would a configuration look like where I pull data from a DMZ HF?
The only settings I know of are push.
Best Regards Michele
Just to clarify - Splunk forwarders / heavy forwarders can only push data to another Splunk instance - this is implied by the name "forwarder". You cannot "pull" data from a heavy forwarder.
The only time you use Splunk to "pull" data is in the following scenarios
So if you want the data to remain in the DMZ and "pull" it from there, you would have to deploy an indexer in the DMZ to store this data, and then use a remote search instance to "pull" results as and when required.
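A minimal sketch of the "pull" side, assuming the DMZ instance indexes the data locally (so it has no outputs.conf forwarding to production) and that its management port is the usual 8089. The host name dmz-idx.example.com is a placeholder, not something from this thread.

```ini
# distsearch.conf on the internal search instance (the search head)
# Registers the DMZ indexer as a search peer; the connection is
# initiated from the internal network towards the DMZ.
[distributedSearch]
servers = https://dmz-idx.example.com:8089
```

The firewall rule this needs is internal search head to DMZ indexer on the management port, so nothing is opened inbound from the DMZ. Note that editing the file by hand does not exchange the authentication keys between the two instances. Adding the peer through Splunk Web (Settings > Distributed search > Search peers) or with the `splunk add search-server` CLI command is usually the easier route, as it handles that step for you.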