I have experience setting up clustered environments, but this request is new to me. Situation: there are two separate clustered environments, "Site1" and "Site2", each fully independent. Site1 data is shared with Site2, but Site2 does not share data with Site1 (one-way replication).
They also want this done at the indexer level, since 500+ universal forwarders send data to Site1, and setting up firewalls, connections, etc. for all of them would be tedious work.
Here are the requirements:
Two separate clustered environments, each with its own Cluster Master, Deployment Server, License Master, Search Head, etc.
Replicate/forward data from Site1 to Site2 [ONE WAY].
Replication or forwarding is set up at the indexer level.
Since these are two different clusters, they will not replicate to each other. What you can do is create two apps, each containing an outputs.conf:
org_site1_forwarder_outputs - containing site1 IDX settings/servers
org_site2_forwarder_outputs - containing site2 IDX settings/servers
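As a rough sketch, the outputs.conf in each app could look like the following. The host names, the group names, and port 9997 are placeholders chosen for illustration, not values from this thread.

```
# org_site1_forwarder_outputs/local/outputs.conf
# (hypothetical host names; 9997 is assumed as the receiving port)
[tcpout]
defaultGroup = site1_indexers

[tcpout:site1_indexers]
server = idx1.site1.example.com:9997, idx2.site1.example.com:9997

# org_site2_forwarder_outputs/local/outputs.conf
[tcpout]
defaultGroup = site2_indexers

[tcpout:site2_indexers]
server = idx1.site2.example.com:9997, idx2.site2.example.com:9997
```

One caveat: when both apps are installed on the same forwarder, the two defaultGroup values are merged by configuration precedence and only one of them wins. For forwarders that send to both sites, it is probably safer to set defaultGroup = site1_indexers, site2_indexers in a single place and confirm the merged result with "splunk btool outputs list --debug".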
Then deploy the apps to the forwarders whose data you want. If a forwarder should send only to the site1 IDX, deploy only the app "org_site1_forwarder_outputs"; if it should send only to the site2 IDX, deploy only the app "org_site2_forwarder_outputs". If it should send data to both, deploy both apps, "org_site1_forwarder_outputs" and "org_site2_forwarder_outputs". Use your deployment server and create server classes to segregate the apps you want to deploy.
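A minimal serverclass.conf sketch for the deployment server, assuming forwarders can be grouped by host name. The server class names and whitelist patterns below are made up for illustration.

```
# serverclass.conf on the deployment server
# (server class names and whitelist patterns are hypothetical)

# Forwarders that send to site1 only
[serverClass:site1_only]
whitelist.0 = uf-site1-*

[serverClass:site1_only:app:org_site1_forwarder_outputs]
restartSplunkd = true

# Forwarders that send to both sites
[serverClass:both_sites]
whitelist.0 = uf-shared-*

[serverClass:both_sites:app:org_site1_forwarder_outputs]
restartSplunkd = true

[serverClass:both_sites:app:org_site2_forwarder_outputs]
restartSplunkd = true
```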
Splunk's indexer clustering doesn't allow for this kind of unidirectional "clustering." Replication is meant to be bidirectional, or restricted per site through the replication and search factors.
If you want to do this, the most direct method is to index and forward from the "Site 1" cluster to "Site 2"'s indexers ( https://docs.splunk.com/Documentation/Splunk/8.0.1/Forwarding/Routeandfilterdatad#Perform_selective_...). Data would be indexed in site1 and then forwarded to the indexers in site2. However, you would incur twice the ingestion cost, since the data is indexed twice. So I don't see this as a good solution, and your customer probably won't either, as it means their 1 TB license now needs to be 2 TB.
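For reference, index-and-forward is configured in outputs.conf on the instance that does the indexing, so in this scenario it would sit on the site1 indexers rather than on the forwarders. A hedged sketch, with hypothetical site2 host names and an assumed receiving port of 9997:

```
# outputs.conf on each site1 indexer
# (in a cluster this would normally be distributed from the cluster master)
[tcpout]
defaultGroup = site2_indexers
# keep a local indexed copy in addition to forwarding
indexAndForward = true

[tcpout:site2_indexers]
# hypothetical site2 indexers
server = idx1.site2.example.com:9997, idx2.site2.example.com:9997
```

With this layout, only the site1 indexers need network access to the site2 indexers. Treat it as a starting point to test, not a validated design.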
You could potentially build a single multisite cluster (site1 and site2) and adjust replication so that site2 holds only a replicated copy of site1's data (site_replication_factor=origin:1,site2:1,total:2). This would eliminate the double indexing charges incurred with index and forward.
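A sketch of what that could look like in server.conf on the single cluster master. Only the site_replication_factor value comes from the suggestion above; the site_search_factor value is an assumption added to make the fragment complete.

```
# server.conf on the one cluster master of the multisite cluster
[general]
site = site1

[clustering]
# "master" is the mode name in the 8.0.x releases
mode = master
multisite = true
available_sites = site1,site2
site_replication_factor = origin:1,site2:1,total:2
# assumed value, not taken from the thread
site_search_factor = origin:1,total:1
```

Each peer also declares its own site (site = site1 or site = site2) under [general] in its server.conf. This only applies when both sites are managed by one cluster master.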
Can they do hybrid search, or federated search, from cluster 2 to cluster 1? Or is this about getting a copy of the data for archiving purposes? Do you have S3-compatible storage available? This would be a good use case for SmartStore. What's the real use case / business case for them to have unidirectional replication?
The first proposed solution is "This would mean data is indexed in site1, and then forwarded to indexers in site2." Does this happen on the forwarders? They are trying to avoid a ton of work (opening ports, connectivity, and firewalls) on all 500+ forwarders.
Since they have separate environments, they also have separate licenses.
The second proposed solution is "Single Multisite Cluster (site1 and site2)". Site1 and Site2 each have their own cluster master, which means they are not connected. As far as I understand, a replication factor can only be applied to sites under ONE cluster master.