How can I use the CLONE_SOURCETYPE feature to clone an event that I need to modify and send to a 3rd party without indexing the cloned event as well? The intent is to index the original event and send a cloned (modified) version of the original event to a 3rd party.
One solution I have tested successfully is to use a Heavy Forwarder (a full Splunk Enterprise instance, not a Universal Forwarder) to perform the CLONE_SOURCETYPE before sending the events to the indexers.
All configurations below are applied to the heavy forwarder.
1) create your tcpout stanza for forwarding data to your indexers
outputs.conf
[tcpout:clustered_indexers]
server = 10.10.10.1:9997,10.10.10.2:9997,10.10.10.3:9997
2) set _TCP_ROUTING in inputs.conf under the default stanza
inputs.conf
[default]
host = splunk-hfwd01
_TCP_ROUTING = clustered_indexers
3) since _TCP_ROUTING is defined in [default] it will get applied to all inputs
4) here is the input which we will clone events from
inputs.conf
[monitor:///opt/splunk/logs/test.log]
_TCP_ROUTING = clustered_indexers
disabled = 0
host = splunk-hfwd01
index = main
sourcetype = original
5) configure a props stanza for the source and point it to a transform that applies CLONE_SOURCETYPE to this data
props.conf
[source::.../test.log]
TRANSFORMS-clone = clone_sourcetype
6) configure the transform to assign a new sourcetype called cloned to the cloned events
transforms.conf
[clone_sourcetype]
CLONE_SOURCETYPE = cloned
REGEX = .
7) configure the cloned sourcetype stanza in props.conf, where you can modify the cloned events (e.g. SEDCMD, line breaking, etc.). Also assign two transforms to these events: one to route them to the syslog output and one to NOT route them out the TCP output. This prevents a copy of the cloned event from being sent to your indexers via the clustered_indexers group defined in _TCP_ROUTING.
props.conf
[cloned]
SEDCMD-custom = s/[\n\r\t]/ /g
BREAK_ONLY_BEFORE = ((.+)\d+\/\d+\/\d+\s+\d+:\d+:\d+\s+([aApPmM]{2}))
TRANSFORMS-output = cloned_syslog,cloned_noTCP_routing
8) configure the transforms for syslog routing and NO TCP routing. Here we point _TCP_ROUTING at a bogus destination that does not exist, since we only want to send these cloned events to the syslog output processor.
transforms.conf
[cloned_syslog]
DEST_KEY = _SYSLOG_ROUTING
FORMAT = send_syslog_to_3rdParty
REGEX = .
[cloned_noTCP_routing]
DEST_KEY = _TCP_ROUTING
FORMAT = bogus
REGEX = .
9) configure outputs
outputs.conf
[syslog:send_syslog_to_3rdParty]
priority = <13>
server = 10.10.10.25:514
timestampformat = <%b %e %H:%M:%S>
type = udp
[tcpout]
defaultGroup = bogus
10) with these configurations in place, the original event should be sent to your index cluster and a cloned (modified) copy of the event to the 3rd party syslog receiver.
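To quickly sanity check the result (example searches/commands only; adjust the index, interface, and port to your environment), you can confirm in Splunk that only the original sourcetype was indexed:
index=main sourcetype=original
index=main sourcetype=cloned (should return nothing)
and on the 3rd party receiver, watch for the modified clones arriving over UDP 514, e.g.:
tcpdump -ni any -A udp port 514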
Caution: Sending data to a single receiver can cause queues on the HF to block if that receiver goes down.
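If blocking is a concern, one option to look at (a sketch only; the value below is a placeholder, and dropping means losing data for that output) is letting the forwarder drop events instead of blocking when an output queue fills up:
outputs.conf
[tcpout:clustered_indexers]
# wait up to 30 seconds on a full queue, then drop new events
# (the default of -1 never drops and blocks instead)
dropEventsOnQueueFull = 30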
Hi @rphillips_splk,
I know this was some time ago, but I still find this very interesting!
You used two different routing types here, so I need to ask: could this also be applied to two different TCP connections, so that the cloned version is sent via TCP rather than syslog?
Moreover, in my situation there will be an additional HF in place of the "syslog receiver" shown above, in front of the actual indexer, so the routing would basically look like this:
UF -> HF (clone) -> IDX
|-> HF -> IDX
Can this be done as smoothly as above?
If so, how?
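In principle the same mechanism from step 8 should cover this: instead of pointing the clone's transform at _SYSLOG_ROUTING, point it at a second tcpout group, and let the downstream HF forward on to the indexers with its own outputs.conf. A rough, untested sketch (the group name second_hf and its address are placeholders):
outputs.conf
[tcpout:second_hf]
server = 10.10.10.50:9997
transforms.conf
[cloned_to_second_hf]
REGEX = .
DEST_KEY = _TCP_ROUTING
FORMAT = second_hf
props.conf
[cloned]
TRANSFORMS-output = cloned_to_second_hf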
Hi @rphillips_splk, is there a way to specify that only a certain source or host from the selected sourcetype gets cloned? Thanks!
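One approach that should work in principle (an untested sketch; the host value below is a placeholder) is to key the clone transform off the event's host or source metadata, so that only matching events get cloned:
transforms.conf
[clone_sourcetype]
CLONE_SOURCETYPE = cloned
SOURCE_KEY = MetaData:Host
REGEX = webserver01
Alternatively, scope the props stanza that calls the transform to a narrower [source::...] or [host::...] pattern rather than the whole sourcetype.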
This was tested on the following versions:
HF: 6.4.1
Indexers: 6.6.1