I have two Splunk search heads and indexers. Currently, all of the data sourcetypes get indexed on the primary Splunk instance. I'm looking to split this and index specific sourcetypes on a second Splunk instance.
I'm currently trying to take a data feed with the sourcetype of "people" and use outputs.conf, props.conf, and transforms.conf to do so but am not able to get it to work. Here is my current configuration in /opt/splunk/etc/system/local:
SOURCE_KEY = MetaData:Sourcetype
REGEX = people
DEST_KEY = _TCP_ROUTING
FORMAT = s2
TRANSFORMS-routing = forward_cdr_to_s2
I was hoping this would take the "people" sourcetype as specified in the regex expression and source_key and route it to the Splunk server specified in [tcpout:s2] with the same index/sourcetype in the secondary Splunk instance. Any thoughts? Thanks in advance!
Is there a reason that you want to split data to a different indexer?
Normally, the approach is to send data to all indexers; by default, it is "load balanced" so that each indexer gets about half of the data (if you have two indexers). So the outputs.conf on the forwarders would list both indexers in a single target group (assuming that the indexers are listening on port 9997), and both indexers would be configured identically.
This "load balanced" approach can improve performance for both indexing and searching. In addition, if one indexer is offline for any reason, the forwarders will automatically send data to the surviving indexer. So this is also a more resilient configuration.
(Why did I put "load balanced" in quotes? Because the switching between indexers is not really based on load, but that isn't relevant to this discussion.)
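As a rough sketch of the load-balanced setup described above, outputs.conf on each forwarder might look something like this. The group name my_indexers is just a placeholder I made up, and indexer1IP/indexer2IP stand in for your real addresses:

```ini
# outputs.conf on every forwarder (sketch; group name is hypothetical)
[tcpout]
defaultGroup = my_indexers

[tcpout:my_indexers]
# Listing more than one server enables automatic load balancing.
server = indexer1IP:9997,indexer2IP:9997
```

The forwarder switches between the listed servers on a timer (the autoLBFrequency setting, 30 seconds by default) rather than by measuring load, which is why "load balanced" is in quotes.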
Finally, if you really want to send all data for a sourcetype to a particular indexer, you can do that at parsing time using transforms.conf, as discussed in other answers. However, I would also ask: why not use universal forwarders to collect the data?
Universal forwarders will provide better performance than heavy forwarders (as the name implies).
And regardless of the type of forwarder (HF or UF), the routing that you want can be done most efficiently at input time:
outputs.conf on the forwarder:
[tcpout]
defaultGroup = s1

[tcpout:s1]
server = indexer1IP:9997

[tcpout:s2]
server = indexer2IP:9997
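The other half of input-time routing is in inputs.conf on the same forwarder. Here is a minimal sketch, assuming a tcpout group named s2 that points at the second indexer and assuming the "people" feed comes from a monitored file; the path below is made up for illustration, so substitute your real input stanza:

```ini
# inputs.conf on the forwarder (sketch; the monitor path is hypothetical)
[monitor:///var/log/people/people.log]
sourcetype = people
# Send this input only to the s2 group instead of the defaultGroup.
_TCP_ROUTING = s2
```

Any input without a _TCP_ROUTING line keeps going to the default group, so only the "people" data is redirected. Because this is set on the input itself, it works on a universal forwarder and does not need props.conf or transforms.conf.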
Last but far from least: If you are trying to control "who sees what data," I recommend that you send the data to different indexes, and then set the user roles so that a user can only see the index(es) that you allow. This is a best practice regardless of the number of indexers.
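To illustrate the index-plus-role approach, here is a minimal sketch. The index name people_idx and the role name people_analyst are placeholders I invented, and the index also has to be defined in indexes.conf on the indexers:

```ini
# inputs.conf on the forwarder: write the feed to its own index
# (sketch; the monitor path and index name are hypothetical)
[monitor:///var/log/people/people.log]
sourcetype = people
index = people_idx

# authorize.conf on the search head: limit a role to that index
# (sketch; the role name is hypothetical)
[role_people_analyst]
srchIndexesAllowed = people_idx
srchIndexesDefault = people_idx
```

Users who hold only this role can search people_idx and nothing else. Be careful with inherited roles, since a role also gets the indexes allowed by any role it imports.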