Hi All,
In the Splunk documentation it is mentioned as follows:
"Data cloning
To perform data cloning, specify multiple target groups, each in its own stanza. In data cloning, the forwarder sends copies of all its events to the receivers in two or more target groups. Data cloning usually results in similar, but not necessarily exact, copies of data on the receiving indexers."
As you can see in the bold text, it says the copies are not necessarily exact, which is bugging me. Will the UF forward an exact copy of the events to the configured indexers, or will there be some difference?
For example: say we have two indexers, IDX1 and IDX2, configured for cloning on the UF, and we receive 1000 events. Will both IDX1 and IDX2 have 1000 events each, or might there be some difference?
If anyone can share some information to get some clarity on this, that would be great.
Thanks
Moni

With cloning, the same stream of events is sent to both indexers, so they "should" be identical.
However, because the events are parsed on the indexer, the two indexers may have different parsing rules in props.conf, or different timezone settings, and may parse the events differently (for example, different timestamp detection, extra parsing rules, or different line breaking).
If you want identically formatted data, use the index cluster replication feature: the copy happens at the "bucket" level after indexing, so the copies are identical.
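To make that concrete, here is a hypothetical illustration (the sourcetype name mytest and the specific settings are only examples, not taken from any real environment). If the two indexers carry different props.conf rules for the same sourcetype, the cloned streams get parsed differently:
props.conf on IDX1:
[mytest]
TZ = UTC
TIME_FORMAT = %Y-%m-%d %H:%M:%S
SHOULD_LINEMERGE = false
props.conf on IDX2 (no TZ or TIME_FORMAT, so automatic timestamp detection and the indexer's local timezone apply; multi-line merging enabled):
[mytest]
SHOULD_LINEMERGE = true
In a situation like this, the same raw stream could be indexed with different timestamps on IDX1 and IDX2, and the different line-merging behavior could even produce different event boundaries.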
A remark on cloning: if you have outputs.conf settings like blockOnCloning=false and dropClonedEventsOnQueueFull, then if one of the indexers is not reachable, the forwarder can decide not to wait for it and send only to the remaining one (for the high-availability use case where you do want the data even when a node is dead).
http://docs.splunk.com/Documentation/Splunk/6.1.3/Admin/Outputsconf
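As a rough sketch of that remark (blockOnCloning and dropClonedEventsOnQueueFull are the settings named above; their placement in the [tcpout] stanza, the 60-second value, and the group names and servers are only placeholders to illustrate the idea, so check the outputs.conf reference before relying on them):
outputs.conf:
[tcpout]
# Do not block the whole output pipeline when one clone target group is unreachable
blockOnCloning = false
# Wait up to 60 seconds for a blocked clone queue, then drop that group's copy and keep sending to the others
dropClonedEventsOnQueueFull = 60
[tcpout:tcpout_group1]
server = idx1.example.com:9997
[tcpout:tcpout_group2]
server = idx2.example.com:9997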

Sample configuration?
outputs.conf:
[tcpout:tcpout_group1]
server=10.1.1.197:9997,10.1.1.198:9997
autoLB=true
[tcpout:tcpout_group2]
server=myhost1.splunk.com:9997,myhost2.splunk.com:9997
autoLB=true
inputs.conf:
[monitor:///var/log/messages]
index=test
sourcetype=mytest
_TCP_ROUTING = tcpout_group1,tcpout_group2
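Because _TCP_ROUTING lists both target groups, every event read from this monitor input is cloned: one copy goes to tcpout_group1 and one to tcpout_group2, with autoLB load-balancing across the servers inside each group.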

Hi,
Did you get any answer to this question?
